API Rate Limiter – System Design

1️⃣ Problem Statement

We need an API Rate Limiter to:

  • Protect APIs from abuse

  • Ensure fair usage per tenant/user/route

  • Allow bursty traffic with a steady average

  • Work in Kubernetes + multi-replica + multi-region

  • Make decisions with very low latency

  • Degrade gracefully if Redis or control plane fails


2️⃣ High-Level Idea (One-Line)

Rate limiting is an edge decision problem — keep it fast, local, and predictable.


3️⃣ Where Rate Limiting Happens

Client
  |
  v
[API Gateway / Ingress]
  |
  v
[Rate Limiter]
  |
(allow / deny)
  |
  v
[Backend Service]

Best practice

  • Enforce at the edge first (Gateway / Ingress)

  • Optional: Sidecar / mesh for fine-grained internal APIs


4️⃣ Algorithm Choice (Keep It Simple)

✅ Token Bucket (Recommended)

  • Allows bursts

  • Maintains steady average rate

  • Easy to reason about

❌ Sliding Window

  • More accurate

  • Heavier on storage and compute

  • Usually overkill

👉 Interview tip

“I use token bucket for RPS and fixed window for daily/monthly quotas.”
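
To make the recommended token bucket concrete, here is a minimal in-process sketch in Python (class and parameter names are illustrative, not from any specific library):

import time

class TokenBucket:
    # Minimal token bucket: refill on read, then try to spend.
    def __init__(self, rate_per_sec: float, burst: float):
        self.rate = rate_per_sec           # steady average rate (tokens/sec)
        self.capacity = burst              # max tokens = allowed burst size
        self.tokens = burst                # start full so initial bursts pass
        self.last_refill = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost            # spend tokens for this request
            return True
        return False                       # bucket empty → deny (429)

bucket = TokenBucket(rate_per_sec=100, burst=200)  # 100 RPS average, bursts to 200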


5️⃣ Storage & Atomicity

Redis + Lua (Authoritative Check)

Key:   rl:{tenant}:{route}
Value: { tokens_remaining, last_refill_time }

The Lua script atomically does three things:

  • Refill tokens based on time

  • Deduct request cost

  • Return allow/deny + remaining tokens

⚠️ Important Fix

If you use INCR + EXPIRE, that is a fixed window counter, not a token bucket.
➡️ Fix: Store tokens plus a refill timestamp and compute the refill inside the Lua script.
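
Here is a minimal sketch of that authoritative check, assuming the redis-py client; the key layout follows the rl:{tenant}:{route} scheme above, and the helper names are illustrative:

import time
import redis  # assumption: the redis-py client (pip install redis)

# Atomic refill + spend. KEYS[1] = rl:{tenant}:{route}
# ARGV: rate (tokens/sec), burst capacity, request cost, now (epoch seconds)
TOKEN_BUCKET_LUA = """
local tokens, ts = unpack(redis.call('HMGET', KEYS[1], 'tokens', 'ts'))
local rate  = tonumber(ARGV[1])
local burst = tonumber(ARGV[2])
local cost  = tonumber(ARGV[3])
local now   = tonumber(ARGV[4])
tokens = tonumber(tokens) or burst   -- a new key starts full
ts     = tonumber(ts) or now
-- refill based on elapsed time, capped at burst capacity
tokens = math.min(burst, tokens + (now - ts) * rate)
local allowed = 0
if tokens >= cost then
  tokens = tokens - cost
  allowed = 1
end
redis.call('HSET', KEYS[1], 'tokens', tostring(tokens), 'ts', tostring(now))
redis.call('EXPIRE', KEYS[1], tostring(math.ceil(burst / rate) * 2))  -- GC idle keys
return {allowed, tostring(tokens)}
"""

r = redis.Redis()
check = r.register_script(TOKEN_BUCKET_LUA)

def allowed(tenant: str, route: str, rate: float, burst: float) -> bool:
    ok, _remaining = check(keys=[f"rl:{tenant}:{route}"],
                           args=[rate, burst, 1, time.time()])
    return ok == 1

Passing now from the client keeps the script deterministic; the trade-off is a small skew if gateway clocks drift.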


6️⃣ Request Data Flow (Hot Path)

1. Request hits Gateway
2. Extract key (tenant / route / method)
3. Check local in-memory bucket (optional)
4. Redis Lua check (authoritative)
5. Allow → forward
6. Deny → 429 response

Client
  |
  v
[Gateway]
  |
  v
[Limiter] ---> Redis (Lua)
  |
  +--> 200 OK → Service
  |
  +--> 429 Too Many Requests
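
As a rough sketch, the hot path glues the two checks together. This reuses the TokenBucket and allowed() sketches from sections 4 and 5; names and limits are illustrative, not a real gateway API:

from collections import defaultdict

# Local pre-filter: one in-memory bucket per (tenant, route) on this replica.
local_buckets = defaultdict(lambda: TokenBucket(rate_per_sec=100, burst=200))

def handle(tenant: str, route: str) -> int:
    key = f"{tenant}:{route}"
    # Step 3: cheap local pre-check sheds load before touching Redis.
    if not local_buckets[key].allow():
        return 429
    # Step 4: authoritative Redis check (atomic Lua refill + decrement).
    if not allowed(tenant, route, rate=100, burst=200):
        return 429
    return 200  # step 5: allow → forward to the backend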

7️⃣ Multi-Region Strategy (Choose One)

Option A – Home Region (Recommended)

Tenant → fixed region → single Redis cluster

  • Strong fairness

  • Simple reasoning

Option B – Eventual Consistency

Region A Redis ← async merge → Region B Redis

  • Best latency

  • Small temporary overshoot allowed

👉 Interview answer

“If fairness is critical, route tenants to a home region.
If latency matters more, accept small overshoot with eventual consistency.”


8️⃣ Failure Handling (Very Important)

Redis Down

  • Default: Fail-open + local limiter

  • Critical APIs: Fail-closed (payments/admin)

Control Plane Down

  • Use last known policy

  • Alert if policy is stale
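
A sketch of that Redis-failure policy, reusing the helpers above; RedisError is redis-py's base exception, and the critical-route set is a made-up example:

import redis

CRITICAL_ROUTES = {"/payments", "/admin"}  # hypothetical fail-closed routes

def check_with_fallback(tenant: str, route: str) -> bool:
    try:
        return allowed(tenant, route, rate=100, burst=200)  # normal Redis path
    except redis.RedisError:
        if route in CRITICAL_ROUTES:
            return False  # fail-closed: protection beats availability here
        # fail-open, but still bounded by the local in-memory bucket
        return local_buckets[f"{tenant}:{route}"].allow()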


9️⃣ Kubernetes Integration (Summary)

Option        | Use Case
--------------|----------------------------------
NGINX Ingress | Simple IP/path limits
Kong + Redis  | Per-tenant / header-based limits
Envoy / Istio | Local + global rate limiting
Custom CRD    | Enterprise policy management

🔟 Rate Limit Response

Always return the de facto standard rate-limit headers:

X-RateLimit-Limit
X-RateLimit-Remaining
X-RateLimit-Reset
Retry-After

Denied response:

HTTP 429 Too Many Requests
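
A small sketch of assembling those headers (hypothetical helper; the X-RateLimit-* names are a common convention rather than an IETF standard):

import time

def rate_limit_headers(limit: int, remaining: int, reset_epoch: int) -> dict:
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(reset_epoch),  # when tokens refill
    }
    if remaining <= 0:  # pair the 429 with a concrete back-off hint
        headers["Retry-After"] = str(max(1, reset_epoch - int(time.time())))
    return headers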

1️⃣1️⃣ What to Fix / Improve (Key Section)

✅ Fix 1: Use real Token Bucket

Replace window counters with tokens plus a refill timestamp.

✅ Fix 2: Add local limiter

Use an in-memory bucket to reduce Redis load and hot-key pressure.

✅ Fix 3: Decide multi-region policy clearly

Don’t casually mix strong consistency with CRDT-style asynchronous merging.

✅ Fix 4: Define fail-open vs fail-closed per endpoint

Availability vs protection trade-off must be explicit.


1️⃣2️⃣ Text Diagram – Complete Flow

Client
  |
  v
[Ingress / Gateway]
  |
  v
[Rate Limiter]
  |   |
  |   v
  | Redis (Lua)
  |
  +--> Allow → Service → 200
  |
  +--> Deny → 429

1️⃣3️⃣ One-Minute Interview Explanation

“I enforce rate limiting at the gateway using a token bucket algorithm.
Each request checks a local bucket first, then Redis via a Lua script for atomic refill and decrement.
For multi-region, I either route tenants to a home region for strict fairness or allow small overshoot with eventual consistency.
On Redis failure, I fail-open with local limits by default and fail-closed only for critical APIs.”


✅ Final Outcome

  • Predictable fairness

  • Low-latency decisions

  • Redis protected from overload

  • Clear failure semantics

  • Easy Kubernetes integration


