Rate Limiter
The Rate Limiter provides per-user request throttling using the token bucket algorithm. It’s designed to protect API endpoints from abuse while maintaining good user experience through gradual token refill.
Purpose
Rate limiting prevents users from overwhelming the system with too many requests. Unlike simple fixed-window counters, the token bucket algorithm allows for burst traffic while maintaining average rate limits over time.
How It Works
Token Bucket Algorithm
Each user gets a virtual “bucket” of tokens stored in Redis:
- Capacity: Maximum tokens the bucket can hold
- Refill Rate: Tokens added per second
- Cost: Tokens consumed per request
- TTL: Bucket expiration time (for cleanup of inactive users)
When a request arrives, the system checks if enough tokens are available. If yes, tokens are consumed and the request proceeds. If no, the request is rejected with RESOURCE_EXHAUSTED status.
Tokens gradually refill over time based on the refill rate, allowing users to make requests again after waiting.
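The refill-and-consume logic above can be sketched as a minimal in-memory token bucket. This is illustrative only: the names (`TokenBucket`, `try_consume`) are assumptions, and in the real system the state lives in Redis and is updated atomically by a Lua script.

```rust
// Minimal in-memory sketch of the token bucket math described above.
// In production the state is stored in Redis, not in process memory.
struct TokenBucket {
    capacity: f64,       // maximum tokens the bucket can hold
    refill_per_sec: f64, // tokens added per second
    tokens: f64,         // current token count
    last_refill: f64,    // timestamp of last refill, in seconds
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64, now: f64) -> Self {
        Self { capacity, refill_per_sec, tokens: capacity, last_refill: now }
    }

    /// Refill based on elapsed time (capped at capacity), then
    /// consume `cost` tokens if enough are available.
    fn try_consume(&mut self, cost: f64, now: f64) -> bool {
        let elapsed = (now - self.last_refill).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}

fn main() {
    // Capacity 2, refill 1 token/sec: two immediate requests pass,
    // the third is rejected, and after one simulated second a
    // request succeeds again.
    let mut bucket = TokenBucket::new(2.0, 1.0, 0.0);
    assert!(bucket.try_consume(1.0, 0.0));
    assert!(bucket.try_consume(1.0, 0.0));
    assert!(!bucket.try_consume(1.0, 0.0));
    assert!(bucket.try_consume(1.0, 1.0));
}
```

This mirrors the burst-then-refill behavior: a full bucket absorbs a burst up to `capacity`, after which requests are paced by the refill rate.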
Per-User Enforcement
Rate limits are enforced per-user, identified by their user ID from the authentication middleware. Each API endpoint can have its own rate limit configuration with a unique identifier.
Bucket keys in Redis follow the pattern: rate_limit:API[{api_name}]-{user_id}
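For concreteness, a helper that expands the documented key pattern might look like this (the function name `bucket_key` is illustrative, not part of the actual API):

```rust
// Illustrative helper: builds the documented Redis key pattern
// rate_limit:API[{api_name}]-{user_id}.
fn bucket_key(api_name: &str, user_id: &str) -> String {
    format!("rate_limit:API[{}]-{}", api_name, user_id)
}

fn main() {
    // e.g. "rate_limit:API[CreateOrder]-u123"
    println!("{}", bucket_key("CreateOrder", "u123"));
}
```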
Integration
As Middleware
RateLimitLayer is a Tower middleware that wraps individual gRPC service methods. Apply it to specific endpoints that need rate limiting:
```rust
use shield::rpc::middleware::{RateLimitLayer, RateLimitConfig};
use std::time::Duration;

let rate_limit = RateLimitLayer::new(
    redis_connection,
    RateLimitConfig {
        api_name: "CreateOrder",
        capacity: 10.0,
        refill: 1.0, // 1 token per second
        cost: 1.0,
        ttl: Duration::from_secs(3600),
    },
);

let service = OrderServiceServer::new(order_service)
    .layer(rate_limit);
```
Configuration Parameters
- api_name: Unique identifier for the rate limit bucket (used in Redis key)
- capacity: Maximum burst size (how many requests can be made immediately)
- refill: Token recovery rate in tokens per second
- cost: Tokens consumed per request (usually 1.0, but can be higher for expensive operations)
- ttl: How long to keep inactive buckets in Redis
Tuning Guidelines
High-frequency, lightweight endpoints:
- Capacity: 30-100
- Refill: 10-30 tokens/second
- Cost: 1.0
Moderate-frequency endpoints:
- Capacity: 10-20
- Refill: 1-5 tokens/second
- Cost: 1.0
Expensive operations (e.g., report generation):
- Capacity: 3-5
- Refill: 0.1-0.5 tokens/second
- Cost: 1.0
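A quick sanity check on these tiers: once a user has drained the burst capacity, the steady-state spacing between allowed requests is `cost / refill` seconds. The helper name below is illustrative.

```rust
// Back-of-the-envelope check for the tuning tiers above: after the
// burst is spent, requests are paced at cost / refill seconds apart.
fn seconds_until_affordable(cost: f64, refill_per_sec: f64) -> f64 {
    cost / refill_per_sec
}

fn main() {
    // Expensive tier (refill 0.1 tokens/sec): one request every 10 s.
    println!("{}", seconds_until_affordable(1.0, 0.1));
    // Moderate tier (refill 0.5 tokens/sec): one request every 2 s.
    println!("{}", seconds_until_affordable(1.0, 0.5));
}
```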
Architecture
Redis-Backed State
Rate limit state is stored in Redis hash structures with two fields:
- tokens: Current token count (float)
- last_refill: Last refill timestamp (unix timestamp)
A Lua script executes atomically on Redis to:
- Calculate elapsed time since last refill
- Add refilled tokens (capped at capacity)
- Check whether enough tokens are available
- Consume tokens if allowed
- Update state and TTL
This ensures thread-safe, distributed rate limiting across multiple server instances.
Authentication Dependency
The middleware requires UserId to be present in the request extensions, which is set by the authentication middleware. Therefore:
- Rate limiting middleware must be applied after authentication middleware in the layer stack
- Unauthenticated requests will be rejected before rate limiting is evaluated
- Anonymous/public endpoints cannot use this middleware (use hashcash CAPTCHA instead)
Error Handling
When the rate limit is exceeded, the middleware returns:
- Status: RESOURCE_EXHAUSTED
- Message: "Rate limit exceeded"
Frontend should handle this gracefully by:
- Displaying user-friendly messages
- Implementing exponential backoff for retries
- Showing remaining quota if applicable (requires separate API)
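The exponential backoff suggested above can be sketched as a simple delay schedule. The function name and the base/cap values are assumptions for illustration, not values mandated by the service.

```rust
// Illustrative client-side backoff for RESOURCE_EXHAUSTED responses:
// the delay doubles each attempt and is capped at max_ms.
fn backoff_ms(attempt: u32, base_ms: u64, max_ms: u64) -> u64 {
    // Clamp the shift so large attempt counts cannot overflow.
    base_ms.saturating_mul(1u64 << attempt.min(16)).min(max_ms)
}

fn main() {
    // With a 250 ms base and a 4 s cap: 250, 500, 1000, 2000, 4000.
    let delays: Vec<u64> = (0..5).map(|a| backoff_ms(a, 250, 4000)).collect();
    println!("{:?}", delays);
}
```

Adding random jitter to each delay is a common refinement to avoid synchronized retries from many clients.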
When to Use Rate Limiting
Good use cases:
- Order creation and payment endpoints
- Data export and report generation
- Account modification operations
- Support ticket creation
When NOT to use:
- Read-only queries (unless very expensive)
- Authentication endpoints (use hashcash instead)
- Static content delivery
- Public announcement viewing
Implementation Notes
- Uses the Processor pattern for the core rate limiting logic
- Middleware integrates with Tower/tonic service stack
- All Redis operations are atomic via Lua scripts
- Float precision is used for tokens to support fractional costs and refill rates
- TTL prevents memory leaks from abandoned user sessions