Rate Limiter

The Rate Limiter provides per-user request throttling using the token bucket algorithm. It’s designed to protect API endpoints from abuse while maintaining good user experience through gradual token refill.

Purpose

Rate limiting prevents users from overwhelming the system with too many requests. Unlike simple fixed-window counters, the token bucket algorithm allows for burst traffic while maintaining average rate limits over time.

How It Works

Token Bucket Algorithm

Each user gets a virtual “bucket” of tokens stored in Redis:

  • Capacity: Maximum tokens the bucket can hold
  • Refill Rate: Tokens added per second
  • Cost: Tokens consumed per request
  • TTL: Bucket expiration time (for cleanup of inactive users)

When a request arrives, the system checks whether enough tokens are available. If so, tokens are consumed and the request proceeds. If not, the request is rejected with a RESOURCE_EXHAUSTED status.

Tokens gradually refill over time based on the refill rate, allowing users to make requests again after waiting.
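The refill-and-consume cycle above can be sketched as a plain in-memory token bucket. This is an illustrative model with hypothetical names, not the actual implementation: in production this state lives in Redis and the same steps run atomically in a Lua script, as described under Architecture.

```rust
// Illustrative in-memory token bucket (hypothetical names; the real
// implementation stores this state in Redis and updates it atomically).
struct TokenBucket {
    capacity: f64,    // maximum tokens the bucket can hold
    refill: f64,      // tokens added per second
    tokens: f64,      // current token count
    last_refill: f64, // unix timestamp of the last refill
}

impl TokenBucket {
    /// Refill based on elapsed time, then try to consume `cost` tokens.
    /// Returns true if the request is allowed.
    fn try_consume(&mut self, cost: f64, now: f64) -> bool {
        let elapsed = (now - self.last_refill).max(0.0);
        // Add refilled tokens, capped at capacity.
        self.tokens = (self.tokens + elapsed * self.refill).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```

With a capacity of 2 and a refill of 1 token/second, two back-to-back requests succeed, a third is rejected, and after waiting 1.5 seconds a request succeeds again.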

Per-User Enforcement

Rate limits are enforced per-user, identified by their user ID from the authentication middleware. Each API endpoint can have its own rate limit configuration with a unique identifier.

Bucket keys in Redis follow the pattern: rate_limit:API[{api_name}]-{user_id}
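Given the documented pattern, key construction amounts to a simple format call. The helper below is illustrative, not part of the actual API:

```rust
// Builds the Redis key for a user's bucket (illustrative helper,
// following the documented `rate_limit:API[{api_name}]-{user_id}` pattern).
fn bucket_key(api_name: &str, user_id: &str) -> String {
    format!("rate_limit:API[{api_name}]-{user_id}")
}
```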

Integration

As Middleware

RateLimitLayer is a Tower middleware that wraps individual gRPC service methods. Apply it to specific endpoints that need rate limiting:

use shield::rpc::middleware::{RateLimitLayer, RateLimitConfig};
use std::time::Duration;
use tower::ServiceBuilder;

let rate_limit = RateLimitLayer::new(
    redis_connection,
    RateLimitConfig {
        api_name: "CreateOrder",
        capacity: 10.0,           // burst of up to 10 requests
        refill: 1.0,              // 1 token per second
        cost: 1.0,                // each request consumes 1 token
        ttl: Duration::from_secs(3600),
    },
);

// Wrap the generated gRPC server with the layer via Tower's ServiceBuilder.
let service = ServiceBuilder::new()
    .layer(rate_limit)
    .service(OrderServiceServer::new(order_service));

Configuration Parameters

  • api_name: Unique identifier for the rate limit bucket (used in Redis key)
  • capacity: Maximum burst size (how many requests can be made immediately)
  • refill: Token recovery rate in tokens per second
  • cost: Tokens consumed per request (usually 1.0, but can be higher for expensive operations)
  • ttl: How long to keep inactive buckets in Redis

Tuning Guidelines

High-frequency, lightweight endpoints:

  • Capacity: 30-100
  • Refill: 10-30 tokens/second
  • Cost: 1.0

Moderate-frequency endpoints:

  • Capacity: 10-20
  • Refill: 1-5 tokens/second
  • Cost: 1.0

Expensive operations (e.g., report generation):

  • Capacity: 3-5
  • Refill: 0.1-0.5 tokens/second
  • Cost: 1.0
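Two derived quantities help when picking values: an empty bucket takes capacity / refill seconds to fill back up, and once the burst is spent, a user can sustain refill / cost requests per second. The helper functions below are hypothetical, for tuning arithmetic only:

```rust
// Derived tuning quantities (hypothetical helpers, not part of the API).

/// Seconds until an empty bucket is completely full again.
fn full_recovery_secs(capacity: f64, refill: f64) -> f64 {
    capacity / refill
}

/// Sustained request rate per second once the initial burst is exhausted.
fn sustained_rps(refill: f64, cost: f64) -> f64 {
    refill / cost
}
```

For example, the moderate-frequency profile with capacity 10 and refill 1.0 allows a 10-request burst, then 1 request per second, and takes 10 seconds to fully recover from empty.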

Architecture

Redis-Backed State

Rate limit state is stored in Redis hash structures with two fields:

  • tokens: Current token count (float)
  • last_refill: Last refill timestamp (unix timestamp)

A Lua script executes atomically on Redis to:

  1. Calculate elapsed time since last refill
  2. Add refilled tokens (capped at capacity)
  3. Check if enough tokens available
  4. Consume tokens if allowed
  5. Update state and TTL

This ensures thread-safe, distributed rate limiting across multiple server instances.

Authentication Dependency

The middleware requires UserId to be present in the request extensions, which is set by the authentication middleware. Therefore:

  1. Rate limiting middleware must be applied after authentication middleware in the layer stack
  2. Unauthenticated requests will be rejected before rate limiting is evaluated
  3. Anonymous/public endpoints cannot use this middleware (use hashcash CAPTCHA instead)

Error Handling

When the rate limit is exceeded, the middleware returns:

  • Status: RESOURCE_EXHAUSTED
  • Message: "Rate limit exceeded"

The frontend should handle this gracefully by:

  • Displaying user-friendly messages
  • Implementing exponential backoff for retries
  • Showing remaining quota if applicable (requires separate API)
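The exponential backoff suggested above can be sketched as a small delay calculator. The base and cap values here are assumptions for illustration, not prescribed by the API:

```rust
use std::time::Duration;

// Capped exponential backoff for client-side retries after
// RESOURCE_EXHAUSTED (illustrative; base/cap values are assumptions).
fn backoff_delay(attempt: u32, base_ms: u64, cap_ms: u64) -> Duration {
    // Double the delay each attempt: base * 2^attempt, capped at cap_ms.
    let exp = base_ms.saturating_mul(1u64 << attempt.min(16));
    Duration::from_millis(exp.min(cap_ms))
}
```

With a 100 ms base and a 5 s cap, attempts 0, 3, and 10 wait 100 ms, 800 ms, and 5 s respectively. Adding random jitter on top is a common refinement to avoid synchronized retries.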

When to Use Rate Limiting

Good use cases:

  • Order creation and payment endpoints
  • Data export and report generation
  • Account modification operations
  • Support ticket creation

When NOT to use:

  • Read-only queries (unless very expensive)
  • Authentication endpoints (use hashcash instead)
  • Static content delivery
  • Public announcement viewing

Implementation Notes

  • Uses the Processor pattern for the core rate limiting logic
  • Middleware integrates with Tower/tonic service stack
  • All Redis operations are atomic via Lua scripts
  • Float precision is used for tokens to support fractional costs and refill rates
  • TTL prevents memory leaks from abandoned user sessions