What is a rate limiter?
A rate limiter is used to control the rate of traffic sent by a client or a service.
What are the benefits of an API rate limiter?
- Prevent resource starvation caused by Denial of Service (DoS) attacks.
- Reduce cost.
- Prevent servers from being overloaded.
Design questions
- Client-side or server-side rate limiter?
- Implement as middleware or in the server application code? An API gateway is a fully managed service that typically supports rate limiting, SSL termination, authentication, IP whitelisting, serving static content, etc.
- Throttle API requests based on IP, user ID, or something else?
- What is the scale of requests?
- Will the limiter work in a distributed environment?
- Should it be implemented as a separate service or in the application code?
- Do we need to inform users who are throttled?
Algorithms for rate limiting
Option 1: Token bucket algorithm (used by Amazon and Stripe)
- A token bucket is a container with a pre-defined capacity. Tokens are refilled at a preset rate, each incoming request consumes one token, and if there are not enough tokens the request is dropped (see the sketch after this list).
- Pros: Easy to implement, memory efficient, and allows a burst of traffic for a short period.
- Cons: The two parameters (bucket size and refill rate) might be hard to tune.
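A minimal in-memory sketch of a token bucket in Python. The class name and parameters (capacity, refill_rate) are illustrative assumptions, not from the source:

```python
import time
import threading

class TokenBucket:
    """Illustrative token bucket: capacity and refill_rate are the two tunable parameters."""
    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity           # max tokens the bucket can hold
        self.refill_rate = refill_rate     # tokens added per second
        self.tokens = float(capacity)      # start with a full bucket
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def allow_request(self) -> bool:
        with self.lock:
            now = time.monotonic()
            # Refill tokens in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last_refill) * self.refill_rate)
            self.last_refill = now
            if self.tokens >= 1:
                self.tokens -= 1           # consume one token for this request
                return True
            return False                   # not enough tokens: drop the request

# Usage: a 10-token bucket refilled at 5 tokens per second.
limiter = TokenBucket(capacity=10, refill_rate=5)
print(limiter.allow_request())  # True while tokens remain
```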
Option 2: Leaking bucket algorithm
- The leaking bucket is usually implemented as a FIFO queue with a predefined capacity. Requests are processed at a fixed rate, and if the queue is full the request is dropped (see the sketch after this list).
- Pros: Memory efficient. The server can process requests at a stable rate.
- Cons: A burst of traffic fills the queue with old requests and throttles new ones. The two parameters (bucket size and process rate) might be hard to tune.
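A rough sketch of a leaking bucket as a bounded FIFO queue drained at a fixed rate; the names (LeakingBucket, leak_rate, handler) are assumptions for illustration:

```python
import time
import threading
from collections import deque

class LeakingBucket:
    """Illustrative leaking bucket: a bounded FIFO queue drained at a fixed rate."""
    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity                  # max queued requests
        self.leak_interval = 1.0 / leak_rate      # seconds between processed requests
        self.queue = deque()
        self.lock = threading.Lock()

    def allow_request(self, request) -> bool:
        with self.lock:
            if len(self.queue) < self.capacity:
                self.queue.append(request)        # enqueue; the worker drains it later
                return True
            return False                          # queue full: drop the request

    def worker(self, handler) -> None:
        # Drain the queue at the fixed leak rate, one request per interval.
        while True:
            with self.lock:
                request = self.queue.popleft() if self.queue else None
            if request is not None:
                handler(request)
            time.sleep(self.leak_interval)
```

A background thread would run `worker`, while the request path only calls `allow_request`.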
Option 3: Fixed window counter algorithm
- The algorithm divides the timeline into fixed-size time windows and assigns a counter to each window (see the sketch after this list).
- Pros: Memory efficient. Easy to understand. Fits use cases where the quota resets at the end of each time unit.
- Cons: A spike in traffic at the window edges could let through more requests than the allowed quota.
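A simple fixed-window counter keyed by client ID; names and limits are assumptions, and a real implementation would also evict counters for expired windows:

```python
import time
from collections import defaultdict

class FixedWindowCounter:
    """Illustrative fixed window counter keyed by (client_id, window index)."""
    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.counters = defaultdict(int)   # (client_id, window) -> request count

    def allow_request(self, client_id: str) -> bool:
        window = int(time.time()) // self.window_seconds   # index of the current window
        key = (client_id, window)
        if self.counters[key] < self.max_requests:
            self.counters[key] += 1
            return True
        return False   # quota for this window is exhausted
```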
Option 4: Sliding window log algorithm
- Record the timestamp of each request, remove timestamps that fall outside the sliding window, and throttle based on the number of requests currently in the window (see the sketch after this list).
- Pros: The rate limit is enforced more accurately.
- Cons: Needs to store the timestamp of every request, which uses more memory.
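A minimal sliding window log sketch; it keeps one timestamp per accepted request (a per-client map would be needed in practice, which is assumed away here):

```python
import time
from collections import deque

class SlidingWindowLog:
    """Illustrative sliding window log: one timestamp per accepted request."""
    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.log = deque()   # timestamps of recent requests, oldest first

    def allow_request(self) -> bool:
        now = time.monotonic()
        # Evict timestamps that have fallen outside the sliding window.
        while self.log and now - self.log[0] > self.window_seconds:
            self.log.popleft()
        if len(self.log) < self.max_requests:
            self.log.append(now)
            return True
        return False
```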
Option 5: Sliding window counter algorithm
- Builds on the fixed window counter: estimate the request rate in the rolling window by combining the current window's counter with a weighted share of the previous window's counter (see the sketch after this list).
- Pros: It smooths out spikes in traffic. Memory efficient.
- Cons: The rate limit is not strictly accurate, since it assumes requests in the previous window were evenly distributed.
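A sketch of the weighted-estimate version of the sliding window counter; the class and field names are assumptions:

```python
import time

class SlidingWindowCounter:
    """Illustrative sliding window counter: weights the previous window's count."""
    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.current_window = 0
        self.current_count = 0
        self.previous_count = 0

    def allow_request(self) -> bool:
        now = time.monotonic()
        window = int(now // self.window_seconds)
        if window != self.current_window:
            # Roll over: keep the last full window's count, or 0 if windows were skipped.
            self.previous_count = (self.current_count
                                   if window == self.current_window + 1 else 0)
            self.current_count = 0
            self.current_window = window
        # Fraction of the previous window that still overlaps the rolling window.
        elapsed = (now % self.window_seconds) / self.window_seconds
        estimated = self.previous_count * (1 - elapsed) + self.current_count
        if estimated < self.max_requests:
            self.current_count += 1
            return True
        return False
```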
How to set rate limit rules?
Rate limit rules can be written in YAML files. Administrators set the rules and push them to each rate limiter (a hypothetical rule file is shown below).
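A hypothetical example of what such a YAML rule file might look like; the field names (domain, key, unit, requests_per_unit) are illustrative, not a specific product's schema:

```yaml
# Each rule limits one type of action per time unit.
rules:
  - domain: auth
    descriptors:
      - key: login
        rate_limit:
          unit: minute
          requests_per_unit: 5
  - domain: messaging
    descriptors:
      - key: message_send
        rate_limit:
          unit: day
          requests_per_unit: 500
```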
Rate limiter in a distributed environment
Option 1: Use locks to avoid race conditions
- Cons: Locks are too costly and slow the system down.
Option 2: Use sticky sessions to route a client's traffic to the same rate limiter
- Cons: Not scalable or flexible enough.
Option 3: Centralize the counters of the different rate limiters in a shared data store such as Redis (see the sketch below).
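A minimal sketch of a fixed-window counter backed by a shared Redis instance, so multiple rate limiter nodes see the same counts. It assumes the redis-py client; host, key names, and limits are illustrative:

```python
import time
import redis  # assumes the redis-py client is installed

r = redis.Redis(host="localhost", port=6379)

def allow_request(client_id: str, max_requests: int = 100, window_seconds: int = 60) -> bool:
    window = int(time.time()) // window_seconds
    key = f"ratelimit:{client_id}:{window}"
    # INCR is atomic per command; wrapping INCR + EXPIRE in a Lua script or
    # MULTI/EXEC would make the pair atomic as a whole.
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_seconds)   # let the counter expire after the window ends
    return count <= max_requests
```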