Token Bucket Rate Limiter Visualizer

Interactive visualization of the Token Bucket rate limiting algorithm. Simulate API requests, bursts, and 429 Too Many Requests rejections.

Engine Configuration

  • Maximum burst size (tokens): 10
  • Tokens added per second (the sustained limit): 1 / sec
  • Tokens consumed per incoming request: 1

How it works:

The bucket holds a maximum of 10 tokens. Every second, 1 token is added back. Each request costs 1 token. If the bucket doesn't have enough tokens, the request is rejected with a 429 Too Many Requests status.

Understanding the Token Bucket Algorithm

In distributed systems, protecting your API from abuse, DDoS attacks, or runaway client scripts is paramount. Rate limiting is the standard defense mechanism used to restrict how many requests a single client (usually identified by IP address or API key) can make within a specific time window.

There are several rate limiting algorithms, including Fixed Window, Sliding Window Log, and Leaky Bucket, but the Token Bucket algorithm is arguably the most widely used in modern cloud infrastructure (including AWS API Gateway and Stripe). Its popularity stems from its ability to handle sudden "bursts" of traffic while maintaining a steady, predictable long-term average rate.

How the Token Bucket Works

The algorithm is conceptually simple and needs very little memory per user (a token count and a timestamp), making it efficient to implement in fast in-memory stores like Redis.

  • The Bucket: Imagine a bucket that has a fixed maximum capacity (e.g., 10 tokens). It can never hold more than this amount.
  • The Refill Rate: An independent process (or a mathematical time-delta calculation) adds new tokens to the bucket at a constant rate (e.g., 1 token per second). If the bucket is already full, newly generated tokens simply overflow and are discarded.
  • The Request: When a user makes an API request, the system checks the bucket. If it holds enough tokens to cover the request's cost (1 token in the default configuration), those tokens are removed and the request is processed (HTTP 200 OK). If it does not, the request is immediately rejected (HTTP 429 Too Many Requests) and no tokens are removed.
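
The three pieces above can be sketched in a few lines of Python. This is a minimal illustration, not the visualizer's actual code: the class name `TokenBucket`, the method `allow`, and the injectable `clock` parameter are all assumptions made for the example. It uses the time-delta form of the refill rather than a background process.

```python
import time


class TokenBucket:
    """Token bucket with lazy (time-delta) refill. Hypothetical sketch."""

    def __init__(self, capacity=10.0, refill_rate=1.0, clock=time.monotonic):
        self.capacity = capacity        # maximum burst size, in tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so time can be faked in tests
        self.tokens = capacity          # the bucket starts full
        self.last_refill = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.last_refill
        # Tokens beyond capacity overflow and are discarded.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def allow(self, cost=1.0):
        """Return the HTTP status the request would receive: 200 or 429."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return 200
        return 429


# Usage with a fake clock so the result does not depend on real time.
fake_now = [0.0]
bucket = TokenBucket(capacity=10, refill_rate=1, clock=lambda: fake_now[0])
statuses = [bucket.allow() for _ in range(11)]  # ten 200s, then one 429
fake_now[0] += 1.0                              # one second passes, one token refills
after_wait = bucket.allow()                     # 200 again
```

Passing the clock in as a parameter is a design choice for the sketch: it keeps the refill logic deterministic and easy to test, while the default `time.monotonic` avoids jumps caused by wall-clock adjustments.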

Handling Traffic Bursts

The primary advantage of the Token Bucket algorithm over a standard Fixed Window is burst tolerance. If a user makes no requests for 10 seconds, even an empty bucket refills to its maximum capacity of 10. They can then send 10 requests within a single millisecond, and all 10 will succeed instantly. An 11th request in that same instant is rejected, because the bucket is empty until the next token is refilled.

This closely mirrors real-world user behavior, where a user might load a dashboard that triggers 5 concurrent API calls, followed by seconds of inactivity while they read the screen. A Fixed Window limiter configured for the same sustained rate (1 request per one-second window) would reject most of those 5 concurrent calls, whereas the Token Bucket absorbs them gracefully while still enforcing the long-term limit of 1 request per second.
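
To make the comparison concrete, the sketch below replays the same request timestamps through a simple fixed-window counter and through a token bucket. Both helper functions (`fixed_window_accepts`, `token_bucket_accepts`) and the sample timestamps are hypothetical, and the fixed window is assumed to allow 1 request per one-second window.

```python
def fixed_window_accepts(timestamps, limit=1, window=1.0):
    """Count requests accepted by a fixed-window counter (hypothetical helper)."""
    accepted = 0
    current_window = None
    count = 0
    for t in timestamps:
        w = int(t // window)
        if w != current_window:      # a new window resets the counter
            current_window, count = w, 0
        if count < limit:
            count += 1
            accepted += 1
    return accepted


def token_bucket_accepts(timestamps, capacity=10.0, rate=1.0, cost=1.0):
    """Count requests accepted by a token bucket that starts full."""
    tokens = capacity
    last = timestamps[0] if timestamps else 0.0
    accepted = 0
    for t in timestamps:
        # Refill for the time elapsed since the previous request, capped at capacity.
        tokens = min(capacity, tokens + (t - last) * rate)
        last = t
        if tokens >= cost:
            tokens -= cost
            accepted += 1
    return accepted


# A dashboard load: 5 calls within 40 ms.
dashboard_load = [0.00, 0.01, 0.02, 0.03, 0.04]
fw_dashboard = fixed_window_accepts(dashboard_load)   # 1 accepted, 4 rejected
tb_dashboard = token_bucket_accepts(dashboard_load)   # all 5 accepted

# A sustained flood: 2 requests per second for 30 seconds (60 requests).
flood = [i * 0.5 for i in range(60)]
tb_flood = token_bucket_accepts(flood)                # 39 accepted
```

In the flood case the bucket accepts the initial burst of stored tokens and then settles at roughly 1 request per second, so the total stays below capacity plus rate times duration (10 + 29.5 here).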

Implementation in Redis

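While our visualizer runs the algorithm in the browser using React state, production systems typically keep the bucket state in Redis and update it with an atomic Lua script, so that concurrent requests cannot interleave between reading and writing the token count. The simpler pattern of the INCR command combined with key expirations is also common, but it implements a Fixed Window counter rather than a Token Bucket.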

Instead of a literal background loop adding tokens every second (which would be computationally expensive for millions of users), a Redis implementation calculates the time delta since the user's last request. It mathematically derives how many tokens should have been refilled during that time gap, adds them to the bucket (up to the capacity limit), subtracts the cost of the current request, and saves the new token count and timestamp back to Redis.
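
The time-delta calculation can be sketched as a single function over stored state. In the example below a plain Python dict stands in for Redis, which is an assumption made so the sketch runs without a server; in a real deployment the same read, compute, and write steps would need to run atomically, for instance inside one Lua script. The function name `take_token` and the key `user:42` are hypothetical.

```python
def take_token(store, key, now, capacity=10.0, refill_rate=1.0, cost=1.0):
    """Apply one request to the bucket stored under `key`.

    `store` is a dict standing in for Redis; each value is a
    (tokens, last_timestamp) pair. Returns (allowed, tokens_left).
    """
    state = store.get(key)
    if state is None:
        tokens, last_ts = capacity, now   # first request: the bucket starts full
    else:
        tokens, last_ts = state

    # Derive the tokens that should have been refilled since the last request.
    elapsed = max(0.0, now - last_ts)     # guard against a clock that moved backwards
    tokens = min(capacity, tokens + elapsed * refill_rate)

    allowed = tokens >= cost
    if allowed:
        tokens -= cost                    # only accepted requests consume tokens

    store[key] = (tokens, now)            # save the new count and timestamp
    return allowed, tokens


store = {}
burst = [take_token(store, "user:42", now=100.0)[0] for _ in range(11)]
later = take_token(store, "user:42", now=103.0)   # 3 seconds later: 3 tokens refilled
```

Because the state is only a count and a timestamp, nothing has to run for idle users; the cost of the refill is paid only when a request actually arrives.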
