Designing a Rate Limiter
What is a Rate Limiter?
In a network system, a rate limiter controls the rate of traffic sent by a client or service. In the context of HTTP, a rate limiter restricts the number of client requests that can be sent over a specified period. If the number of API requests exceeds the threshold defined by the rate limiter, the excess calls are blocked. Here are some examples:
- A user can write no more than 2 posts per second.
- A maximum of 10 accounts can be created per day from the same IP address.
- Rewards can be claimed no more than 5 times per week from the same device.
Advantages of an API Rate Limiter
Prevent Resource Starvation
Rate limiters help prevent resource starvation caused by Denial of Service (DoS) attacks. Many large tech companies enforce some form of rate limiting. For example, Twitter limits the number of tweets to 300 per 3 hours, and Google Docs APIs have a default limit of 300 requests per user per 60 seconds for read requests. By blocking excess calls, a rate limiter can prevent DoS attacks, whether intentional or unintentional.
Reduce Costs
Limiting excess requests reduces the number of servers needed and allocates more resources to high-priority APIs. This is crucial for companies using paid third-party APIs, where each call incurs a cost. For instance, checking credit, making payments, or retrieving health records often involves per-call charges, making it essential to limit the number of calls to reduce costs.
Prevent Server Overload
To reduce server load, a rate limiter filters out excess requests caused by bots or user misbehavior, ensuring that the servers are not overwhelmed.
Understanding the Problem and Requirements
Key Considerations
- Type of Rate Limiter: We will focus on a server-side API rate limiter.
- Throttling Criteria: The rate limiter should be flexible enough to support different sets of throttle rules, such as IP address or user ID.
- System Scale: The system must handle a large number of requests.
- Distributed Environment: The rate limiter should work in a distributed environment.
- Implementation: It can be a separate service or integrated into the application code.
- User Notifications: Users should be informed when their requests are throttled.
Step 1: Requirements
Functional and Non-Functional Requirements
- Accurate Limiting: Accurately limit excessive requests.
- Low Latency: The rate limiter should not slow down HTTP response times.
- Memory Efficiency: Use as little memory as possible.
- Distributed Limiting: Share the rate limiter across multiple servers or processes.
- Exception Handling: Show clear exceptions to users when their requests are throttled.
- High Fault Tolerance: Ensure that problems with the rate limiter (e.g., a cache server going offline) do not affect the entire system.
Step 2: High-Level Design (HLD)
Placement of the Rate Limiter
Instead of implementing the rate limiter in the client or directly inside the API servers, we can create a rate limiter middleware that throttles requests before they reach the APIs.
Example Scenario
Assume our API allows 2 requests per second. If a client sends 3 requests within a second, the first two requests are routed to API servers, but the third request is throttled by the middleware and returns an HTTP status code 429, indicating too many requests.
API Gateway
In a microservices architecture, rate limiting is often implemented within a component called an API gateway. This is typically a fully managed service that supports rate limiting, SSL termination, authentication, IP whitelisting, and serving static content.
Implementation Considerations
- Technology Stack: Ensure the current technology stack, including the programming language and cache service, supports efficient rate limiting.
- Rate Limiting Algorithm: Choose an algorithm that fits business needs. Server-side implementation provides full control, but third-party gateways might have limitations.
- Microservice Architecture: If using an API gateway for authentication and other services, adding a rate limiter there can be beneficial.
- Engineering Resources: Building a rate-limiting service requires time and resources. If limited, consider using a commercial API gateway.
High-Level Architecture
Rate limiting algorithms are simple at a high level. A counter keeps track of the number of requests from the same user or IP address. If the counter exceeds the limit, the request is disallowed.
Storage of Counters
Using a database is not ideal due to the slowness of disk access. In-memory cache (e.g., Redis) is preferred for its speed and support for time-based expiration. Redis offers commands like INCR (increment the counter) and EXPIRE (set a timeout for the counter).
Request Flow
- The client sends a request to the rate-limiting middleware.
- The middleware fetches the counter from Redis and checks if the limit is reached.
- If the limit is reached, the request is rejected.
- If the limit is not reached, the request is sent to API servers, the counter is incremented, and the updated value is saved back to Redis.
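As a minimal sketch of this flow, a fixed-window check can be built directly on the INCR and EXPIRE commands mentioned above (the key-naming scheme, limit, and window below are illustrative assumptions, not values from the design):

```python
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

def is_allowed(client_id, limit=2, window=1):
    """Allow at most `limit` requests per `window` seconds (fixed window)."""
    key = f"rate:{client_id}"      # illustrative key-naming scheme
    count = r.incr(key)            # increment the per-window counter
    if count == 1:
        r.expire(key, window)      # the first request starts the window
    return count <= limit
```

Note that INCR and EXPIRE are two separate commands here; the race condition section below shows how a Lua script can make this check fully atomic.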
Step 3: Design Deep Dive
Rate Limiting Rules
Rate limiting rules define the conditions under which requests are throttled. These rules are generally written in configuration files and stored on disk. For example, allowing a maximum of 5 marketing messages per day or restricting logins to 5 times per minute.
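For illustration, the two rules above could be expressed as a configuration structure like the following; the field names are hypothetical, not a standard schema:

```python
# Hypothetical rule definitions, as a middleware might load them from a config file
RULES = [
    {"domain": "messaging", "key": "user_id", "descriptor": "marketing_message",
     "unit": "day", "requests_per_unit": 5},
    {"domain": "auth", "key": "ip", "descriptor": "login",
     "unit": "minute", "requests_per_unit": 5},
]
```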
Handling Exceeded Limits
When a request exceeds the rate limit, the API returns an HTTP response code 429. Depending on the use case, rate-limited requests may be enqueued for later processing.
Rate Limiter Headers
Clients can be informed about rate limits via HTTP response headers:
- X-Ratelimit-Remaining: The remaining number of allowed requests.
- X-Ratelimit-Limit: The maximum number of requests allowed per time window.
- X-Ratelimit-Retry-After: The number of seconds to wait before making another request.
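As a sketch, a middleware might populate these headers with a small helper like the one below (the helper itself is hypothetical, not part of any standard library):

```python
def rate_limit_headers(limit, remaining, retry_after_seconds):
    # HTTP header values are strings, so stringify the numbers
    return {
        "X-Ratelimit-Limit": str(limit),
        "X-Ratelimit-Remaining": str(remaining),
        "X-Ratelimit-Retry-After": str(retry_after_seconds),
    }

# Example: a throttled client with a 100-request limit, told to retry in 30s
# rate_limit_headers(100, 0, 30)
```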
Detailed Design
- Rules are stored on disk; workers frequently pull them into an in-memory cache.
- A client request goes to the rate limiter middleware.
- The middleware loads rules from the cache and fetches counters from Redis.
- Depending on the counter value:
  - If the request is not rate limited, it is forwarded to API servers.
  - If the request is rate limited, an HTTP 429 error is returned, and the request may be dropped or queued.
Distributed Environment Challenges
Race Condition
Race conditions occur when multiple threads read and write shared data concurrently. For example, two requests reading the counter value before either writes it back can lead to an incorrect count. Solutions include using locks or Redis features like Lua scripts and sorted sets.
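As a sketch of the Lua approach: the script below runs entirely inside Redis, so the read, the check, and the increment cannot interleave with another request (the key scheme and parameters are illustrative):

```python
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

# Executed atomically by the Redis server
RATE_LIMIT_SCRIPT = """
local count = redis.call('INCR', KEYS[1])
if count == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[1])
end
if count > tonumber(ARGV[2]) then
    return 0
end
return 1
"""

def is_allowed(client_id, limit=100, window=60):
    # eval(script, number_of_keys, key..., arg...)
    return r.eval(RATE_LIMIT_SCRIPT, 1, f"rate:{client_id}", window, limit) == 1
```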
Synchronization
In a distributed environment, synchronization ensures that multiple rate limiter servers have consistent data. Sticky sessions or centralized data stores like Redis can help maintain synchronization.
Performance Optimization
Multi-Data Center Setup
To reduce latency for users far from a data center, use edge servers geographically distributed around the world. Providers like Cloudflare operate many edge servers and route each user's traffic to the nearest one.
Eventual Consistency
Keeping counters strictly consistent across rate limiter instances is expensive at scale. Synchronizing data with an eventual consistency model is usually an acceptable trade-off: a slightly stale counter briefly lets a few extra requests through, but the system stays fast and available.
Rate Limiting Algorithms
Rate limiting can be implemented using different algorithms, each with its pros and cons:
Token Bucket
Tokens are added to a bucket at a fixed rate, up to the bucket's capacity. Each request consumes a token; if the bucket is empty, the request is denied. This algorithm enforces an average rate while permitting short bursts up to the bucket capacity.
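A minimal single-process sketch, assuming a bucket capacity and a refill rate in tokens per second (both values are illustrative):

```python
import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # consume one token for this request
            return True
        return False          # bucket empty: deny the request
```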
Leaking Bucket
Requests join a queue (the bucket) as they arrive and leak out for processing at a fixed rate; if the queue is full, new requests are dropped. This algorithm guarantees a steady outflow of requests.
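A sketch of the queue-based view, assuming a fixed queue capacity and a leak rate in requests per second (both illustrative):

```python
import time
from collections import deque

class LeakyBucket:
    def __init__(self, capacity, leak_rate):
        self.capacity = capacity    # maximum queued requests
        self.leak_rate = leak_rate  # requests processed per second
        self.queue = deque()
        self.last_leak = time.monotonic()

    def allow(self, request):
        now = time.monotonic()
        # Drain ("leak") requests at the fixed rate
        leaked = int((now - self.last_leak) * self.leak_rate)
        if leaked:
            for _ in range(min(leaked, len(self.queue))):
                self.queue.popleft()  # in a real system, these would be processed
            self.last_leak = now
        if len(self.queue) < self.capacity:
            self.queue.append(request)
            return True   # request enqueued for processing
        return False      # queue full: request dropped
```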
Fixed Window Counter
Requests are counted within fixed time windows (e.g., per minute). If the count exceeds the limit, requests are denied.
Sliding Window Log
Tracks request timestamps in a log and counts requests within a sliding time window, providing smoother limiting compared to fixed windows.
Sliding Window Counter
Combines fixed window counter and sliding window log, providing a balance between simplicity and accuracy.
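A common approximation weights the previous fixed window's count by how much of it still overlaps the sliding window; a minimal sketch:

```python
def sliding_window_count(prev_count, curr_count, elapsed_in_window, window_size):
    # Fraction of the sliding window that still overlaps the previous window
    overlap = (window_size - elapsed_in_window) / window_size
    return prev_count * overlap + curr_count

# Example: 84 requests in the previous minute, 36 so far, 15s into this minute:
# 84 * (45 / 60) + 36 = 99  -> still under a limit of 100
```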
Rate Limiter Using Sorted Set in Redis
Implementing a rate limiter using a Sorted Set in Redis is a robust method, especially in a distributed system. Here's an in-depth explanation of how it works, along with an example.
How Sorted Sets in Redis Work for Rate Limiting
A Sorted Set in Redis is a collection of unique elements, each associated with a score. The elements are sorted by their scores, which can be any floating-point value. This sorting allows for efficient range queries, which is crucial for implementing a rate limiter.
Key Concepts
- Elements: Represent the client’s requests.
- Scores: Represent the timestamps of the requests.
Steps to Implement Rate Limiting Using Sorted Sets
Initialize the Sorted Set: For each client (identified by a unique key such as an IP address or user ID), maintain a Sorted Set in Redis where the members are request identifiers (each must be unique, e.g., a request ID or UUID, since re-adding an existing member only updates its score) and the scores are the request timestamps.
Adding a Request: When a client makes a request, add an entry to the Sorted Set with the current timestamp as the score.
Trimming Old Requests: Periodically or on each request, remove entries from the Sorted Set that fall outside the allowed time window.
Counting Requests: Count the number of elements in the Sorted Set to check if it exceeds the allowed limit.
Enforcing Limits: If the count exceeds the limit, deny the request. If not, process the request and add the new entry to the set.
Example
Let's say we want to limit a user to 100 requests per hour.
Step-by-Step Implementation
Initialization: No special initialization is needed in Redis. Redis will create the Sorted Set on the first use.
Request Handling:
- Add a Request: When a request is made, get the current timestamp.

```python
import time

current_timestamp = int(time.time())
```

- Add to Sorted Set:

```python
client_id = "user123"   # Example client ID
request_id = "req456"   # Example request ID (must be unique per request)
redis.zadd(client_id, {request_id: current_timestamp})
```

- Trim Old Requests: Remove requests older than 1 hour (3600 seconds).

```python
one_hour_ago = current_timestamp - 3600
redis.zremrangebyscore(client_id, 0, one_hour_ago)
```

- Count Requests: Get the count of requests in the last hour.

```python
request_count = redis.zcount(client_id, one_hour_ago, current_timestamp)
```

- Enforce Limit: Check if the request count exceeds the limit.

```python
if request_count >= 100:
    print("Rate limit exceeded")  # Deny the request
else:
    print("Request allowed")      # Process the request
```
Handling Concurrency
In a distributed environment, multiple servers might handle requests for the same client concurrently. Redis executes each command atomically, and wrapping several commands in a MULTI/EXEC transaction (as the pipeline in the code below does) ensures that adding a request, trimming old entries, and counting are performed safely even under concurrent access.
Detailed Example Code
Here's a more detailed code example using Redis with Python and the redis-py library:
```python
import time
import uuid

import redis

# Connect to Redis
r = redis.StrictRedis(host='localhost', port=6379, db=0)

# Function to handle a request
def handle_request(client_id, request_id, limit, window):
    current_timestamp = int(time.time())
    window_start = current_timestamp - window

    # MULTI/EXEC transaction to ensure the operations execute atomically
    with r.pipeline() as pipe:
        # Add the current request; request_id must be unique per request,
        # otherwise ZADD only updates the score of the existing member
        pipe.zadd(client_id, {request_id: current_timestamp})
        # Remove requests that fall outside the time window
        pipe.zremrangebyscore(client_id, 0, window_start)
        # Count the requests remaining in the time window
        pipe.zcount(client_id, window_start, current_timestamp)
        # Expire the key so idle clients do not leak memory
        pipe.expire(client_id, window)
        # Execute the pipeline
        _, _, request_count, _ = pipe.execute()

    # The count includes the request just added, so allow up to `limit` requests
    if request_count > limit:
        return False  # Rate limit exceeded
    else:
        return True   # Request allowed

# Example usage
client_id = "user123"
request_id = str(uuid.uuid4())  # Unique ID for this request
limit = 100    # Maximum 100 requests
window = 3600  # Time window of 1 hour

if handle_request(client_id, request_id, limit, window):
    print("Request allowed")
else:
    print("Rate limit exceeded")
```
Benefits of Using Sorted Sets in Redis
- Efficiency: Sorted sets in Redis are optimized for quick insertion, deletion, and range queries, which are essential for rate limiting.
- Atomic Operations: Redis supports atomic operations, ensuring that concurrent modifications are handled safely.
- Scalability: Redis is designed to handle high-throughput scenarios, making it suitable for distributed environments.
By leveraging Redis's sorted sets and atomic operations, you can build a robust and scalable rate limiter that effectively controls the rate of client requests in a distributed network.
Conclusion
Designing an effective rate limiter involves understanding requirements, choosing the right algorithms, and considering performance and synchronization challenges in a distributed environment. Using tools like Redis can simplify implementation and ensure high performance.