Rate Limiting + DDoS Protection Pattern

How to absorb traffic spikes — both benign (popular feature) and malicious (DDoS / abuse) — without breaking for legitimate users.

TL;DR (human)

Multi-layer defense: edge (CDN / WAF) absorbs L3/L4 + bulk L7; per-route rate limits stop per-user abuse; per-tenant quotas stop noisy neighbors. Different limits for unauth vs auth, and for sensitive routes (login, signup, reset). Token-bucket / sliding-window algorithms, kept consistent across services via a shared store.

For agents

Defense layers

| Layer | Defends against | Tool |
|---|---|---|
| L3/L4 (network) | Volumetric attacks (SYN flood, UDP flood) | Cloud provider's DDoS service (CloudFront, Cloudflare, AWS Shield) |
| L7 (HTTP), bulk | High-volume HTTP floods | WAF + CDN rate limit (Cloudflare, Akamai) |
| Per-route rate limit | Per-user / per-IP abuse | Application middleware (Redis-backed) |
| Per-tenant quota | Noisy neighbor | Application logic (per multi-tenant-isolation-pattern.md) |
| Backend circuit breakers | Cascading failure | Library (Polly, resilience4j) |

Edge absorbs volume cheaply. Application layer handles per-identity logic. Both required.

Rate-limit algorithms

| Algorithm | Mechanic | Pros | Cons |
|---|---|---|---|
| Fixed window | Counter per N seconds | Simple | Burst at window boundary |
| Sliding log | Timestamps; remove old | Accurate | Memory cost; slow |
| Sliding window counter | Two windows; weighted | Good balance | Approximation |
| Token bucket | Refill rate + bucket size | Allows bursts; smooth steady-state rate | Needs tuning (rate and bucket size) |
| Leaky bucket | Constant drain rate | Smooths bursts | No burst allowed |
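
The "two windows; weighted" mechanic of the sliding window counter can be sketched as follows. This is a minimal illustration, assuming the caller already tracks a counter for the previous and the current fixed window; the function and parameter names are hypothetical.

```python
def sliding_window_estimate(prev_count: int, curr_count: int,
                            elapsed_in_window: float, window_s: float) -> float:
    """Estimate requests in the trailing window from two fixed-window counters."""
    # Fraction of the previous window that still overlaps the trailing window.
    overlap = 1.0 - (elapsed_in_window / window_s)
    return prev_count * overlap + curr_count


def allow(prev_count: int, curr_count: int, elapsed_in_window: float,
          window_s: float, limit: int) -> bool:
    """Admit the request only while the weighted estimate is under the limit."""
    estimate = sliding_window_estimate(prev_count, curr_count, elapsed_in_window, window_s)
    return estimate < limit
```

With 40 requests in the previous minute, 10 so far in the current one, and 15 seconds elapsed, the estimate is 40 × 0.75 + 10 = 40. The approximation assumes the previous window's requests were evenly spread.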

Default: token bucket for user-facing routes; leaky bucket for cost-sensitive routes (where the downstream is itself rate-limited).
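
A minimal in-process sketch of a token bucket, for illustration only. The class name, the injectable `now` clock, and the choice to start the bucket full are assumptions, not something this pattern prescribes; a production limiter would keep this state in the shared store.

```python
import time


class TokenBucket:
    """Token bucket: refills `rate` tokens per second, holds at most `capacity`."""

    def __init__(self, rate: float, capacity: float, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._now = now              # injectable clock, handy for tests
        self._tokens = capacity      # start full so an initial burst is allowed
        self._last = now()

    def allow(self, cost: float = 1.0) -> bool:
        current = self._now()
        # Refill based on elapsed time, capped at capacity.
        elapsed = current - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = current
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

One plausible mapping for a limit such as "60/min; burst 10" is `TokenBucket(rate=1.0, capacity=10)`: a steady one request per second, with up to ten in quick succession after an idle period.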

Configuration discipline

Per route, per identity class, set:

  • Rate: requests per time window.
  • Burst: short-term excess allowed.
  • Identity: by IP, user, tenant, API key.

Examples:

GET /api/users/me:
  per-user: 60/min; burst 10
  per-ip (unauth): 100/min; burst 20

POST /auth/login:
  per-ip: 5/min; burst 0  ← anti-credential-stuffing
  per-username: 5/min     ← anti-targeted-attack

POST /api/llm/complete:
  per-tenant: 100/min (free), 1000/min (pro), 10000/min (enterprise)
  per-user: 30/min

Sensitive routes (login, signup, password reset, OTP, payments) get tighter limits.

Per-identity choice

| Identity | Use when |
|---|---|
| User id (auth) | Default for authenticated routes |
| API key | Programmatic access |
| Tenant id | Multi-tenant; aggregate across tenant's users |
| IP address | Unauth + signup + login |
| Session id | Pre-auth flows |

Combine: a route may have multiple limits (per-user + per-tenant + per-IP); the strictest fires.

IP-only is fragile (NAT, shared corporate networks). Use IP + something else where possible.
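
Combining limits so that the strictest one fires can be sketched like this. It is an in-memory, fixed-window illustration; the `MultiLimiter` name and its shape are hypothetical, and old windows are never evicted here, which a real implementation would need to handle.

```python
from collections import defaultdict


class MultiLimiter:
    """Fixed-window counters per identity; a request must pass every limit."""

    def __init__(self, limits: dict, window_s: int = 60):
        self.limits = limits                 # e.g. {"user": 60, "tenant": 1000, "ip": 100}
        self.window_s = window_s
        self._counts = defaultdict(int)      # (kind, identity, window index) -> count

    def allow(self, identities: dict, now: float) -> bool:
        window = int(now // self.window_s)
        keys = [(kind, identities[kind], window)
                for kind in self.limits if kind in identities]
        # Check every applicable limit first so a rejected request consumes nothing.
        if any(self._counts[key] >= self.limits[key[0]] for key in keys):
            return False
        for key in keys:
            self._counts[key] += 1
        return True
```

Checking before incrementing means a request rejected by the per-IP limit does not also eat into the user's own quota.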

Response when limit exceeded

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1700000000

{ "code": "RATE_LIMITED", "message": "...", "hint": "Try again in 30 seconds." }

Well-behaved clients honor Retry-After, which tells them how long to wait before retrying. The body's code: "RATE_LIMITED" lets applications handle the error programmatically.
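
A framework-agnostic sketch of building that response. The function name and the `(status, headers, body)` return shape are assumptions, and the message text is a placeholder, since the example above elides it.

```python
import json
import math


def rate_limited_response(limit: int, reset_epoch: float, now: float):
    """Return (status, headers, body) for a request that exceeded its limit."""
    retry_after = max(0, math.ceil(reset_epoch - now))
    headers = {
        "Retry-After": str(retry_after),
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": "0",
        "X-RateLimit-Reset": str(int(reset_epoch)),
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "code": "RATE_LIMITED",
        "message": "Too many requests.",   # placeholder wording
        "hint": f"Try again in {retry_after} seconds.",
    })
    return 429, headers, body
```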

Shared store

Per-instance rate limits don't work in horizontally scaled environments: a user whose requests land on different replicas would bypass the limit.

Use a shared store:

  • Redis with atomic ops (INCR + EXPIRE; or Lua scripts for sliding window).
  • DynamoDB with conditional writes.
  • Memcached (less common; eviction can break rate limits).

Cost: one Redis hop per request. Mitigate via:

  • Lazy refill (compute the bucket's token count from elapsed time when a request arrives, rather than refilling on a timer).
  • Local approximation + periodic sync.
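
The INCR + EXPIRE approach can be sketched as a fixed-window check. This assumes `client` exposes redis-py style `incr(name)` and `expire(name, seconds)` methods; the function name and key format are hypothetical.

```python
def allow_request(client, key: str, limit: int, window_s: int) -> bool:
    """Fixed-window check using INCR + EXPIRE on a shared store.

    `client` is assumed to behave like a redis-py client.
    """
    count = client.incr(key)          # atomic, so all replicas share one counter
    if count == 1:
        # First hit in this window: start the expiry clock.
        client.expire(key, window_s)
    return count <= limit
```

The two calls are not atomic as a pair: if the process dies between them, the key is left without a TTL. Wrapping both in a Lua script or a transaction closes that gap.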

Login + signup specifics

Specific patterns for abuse-prone endpoints:

Login:

  • Per-IP rate: 5-10/min.
  • Per-username rate: 5/min (catches password guessing against a specific account, even when the attempts are spread across many IPs).
  • Per-failed-attempt backoff: exponential.
  • After N failures: CAPTCHA challenge or temp lockout.
  • Notify user on multiple failed attempts.
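
The exponential per-failed-attempt backoff might look like the sketch below. The base delay, the cap, and the number of free attempts are illustrative assumptions, not recommended values.

```python
def lockout_seconds(failed_attempts: int, base_s: float = 1.0,
                    cap_s: float = 900.0, free_attempts: int = 3) -> float:
    """Delay before the next login attempt is accepted."""
    if failed_attempts <= free_attempts:
        return 0.0
    # Double the delay for each failure beyond the free attempts, up to the cap.
    return min(cap_s, base_s * 2 ** (failed_attempts - free_attempts - 1))
```

With these defaults the first three failures cost nothing, the fourth adds a 1-second delay, the sixth a 4-second delay, and the delay never exceeds 15 minutes.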

Signup:

  • Per-IP rate: 3-5/min.
  • Email verification mandatory (no instant access).
  • Phone verification for higher-risk products.
  • Abuse signals (disposable email domain, VPN/Tor, signup pattern) → manual review queue.

Password reset:

  • Per-email rate: 3/hour.
  • Per-IP rate: 10/hour.
  • Never confirm whether email exists (avoid enumeration).
  • Tokens single-use; short-lived (10-30 min); IP-bound where viable.
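
A minimal in-memory sketch of single-use, short-lived reset tokens. The `ResetTokens` class is hypothetical; a real service would persist the hashed tokens in a database and could add the IP binding mentioned above.

```python
import hashlib
import secrets
import time
from typing import Optional


class ResetTokens:
    """Single-use reset tokens; only a hash of each token is stored."""

    def __init__(self, ttl_s: int = 900):    # 15 minutes, within the 10-30 min range
        self.ttl_s = ttl_s
        self._pending = {}                   # sha256(token) -> (email, expiry timestamp)

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, email: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._pending[self._digest(token)] = (email, now + self.ttl_s)
        return token

    def redeem(self, token: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        # pop() makes the token single-use: a second redeem finds nothing.
        entry = self._pending.pop(self._digest(token), None)
        if entry is None or now > entry[1]:
            return None
        return entry[0]
```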

WAF + bot protection

A managed WAF (Cloudflare, AWS WAF, Akamai) provides:

  • Geo blocking (when needed).
  • Bot detection (signal-based; not perfect).
  • Managed rule sets (OWASP top 10 patterns).
  • Bot challenge (CAPTCHA on suspicion).
  • IP reputation lists.

Configure to block / challenge:

  • Suspicious user-agents.
  • Known bad IP lists.
  • Geographic restrictions (per business needs).

Avoid blocking entire countries by default: legitimate users get blocked. Geo-restrict only when there is a business reason.

CAPTCHA / challenge

When abuse is detected, escalate before blocking:

  • Invisible challenge (Cloudflare Turnstile, hCaptcha, reCAPTCHA v3): scoring; transparent to good users.
  • Visible challenge: image puzzle; user friction.
  • Magic link / 2FA: highest friction; for verified compromise scenarios.

Friction sequence: invisible → visible → block. Match severity.
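
As a rough sketch, the friction sequence can be expressed as a mapping from a risk score to a response. The score scale (0 = clean, 1 = certain abuse) and every threshold here are hypothetical and would need tuning against real traffic.

```python
def challenge_for(risk_score: float) -> str:
    """Map a risk score in [0, 1] to the least intrusive adequate response."""
    if risk_score < 0.3:
        return "allow"
    if risk_score < 0.6:
        return "invisible_challenge"
    if risk_score < 0.9:
        return "visible_challenge"
    return "block"
```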

Backend circuit breakers

When an upstream is failing, fail fast rather than pile up requests:

  • After N consecutive failures: open circuit; immediate failure for all requests.
  • After cool-down: half-open; try one request; reset on success.

Avoids:

  • Thundering herd on recovery.
  • Cascading failure (your service can't catch up; the next one chokes).

Libraries: Polly (.NET), resilience4j (JVM), opossum (Node).
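
To show the open / half-open / closed cycle, here is a minimal single-threaded sketch. The class and exception names are hypothetical, and it is not thread-safe; the libraries above handle concurrency, metrics, and the details this leaves out.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the circuit is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, cooldown_s: float = 30.0,
                 now=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self._now = now
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._now() - self._opened_at < self.cooldown_s:
                raise CircuitOpenError("upstream unavailable; failing fast")
            # Cool-down elapsed: half-open, let this call through as a probe.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._now()   # open, or re-open after a failed probe
            raise
        self._failures = 0
        self._opened_at = None       # success closes the circuit
        return result
```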

Per-tenant abuse signals

Beyond limits, detect:

  • One tenant generating > Nx the median request rate.
  • One tenant's error rate suddenly spikes (broken integration).
  • One tenant accessing many distinct resources rapidly (scraping signal).
  • Unusual times / patterns (per ../quality/observability-pattern.md).

Each surfaces in a dashboard; on-call has a runbook for "noisy tenant".
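
The first signal, a tenant far above the median request rate, can be sketched as below. The function name and the default factor of 10 are assumptions; the input is taken to be request counts per tenant over some fixed interval.

```python
from statistics import median


def noisy_tenants(requests_per_tenant: dict, factor: float = 10.0) -> list:
    """Return tenants whose request count exceeds `factor` times the median."""
    if not requests_per_tenant:
        return []
    baseline = median(requests_per_tenant.values())
    # Note: with a median of 0, any tenant with traffic is flagged.
    return sorted(t for t, n in requests_per_tenant.items() if n > factor * baseline)
```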

Costs

DDoS protection isn't free:

  • Legitimate traffic makes up the bulk of CDN cost; a DDoS spike usually doesn't change that much.
  • WAF rule evaluation adds latency (typically sub-millisecond).
  • AWS Shield Advanced / Cloudflare Enterprise: tens of thousands of USD per year for full protection.

For most products, the managed CDN's default DDoS protection is enough. Move to an advanced tier when you're a high-value target.

Common failure modes

  • No rate limit on login. Credential stuffing succeeds. → Per-IP + per-username.
  • Rate limit per-instance. Bypassed by hitting different replicas. → Shared store.
  • Rate limit returns 200 with empty body. Clients retry, defeating the purpose. → 429 + Retry-After.
  • No backend circuit breaker. Upstream goes down; your service piles up; goes down too. → Library.
  • CAPTCHA on every login. Friction kills good users. → Scoring; only when suspicious.
  • Geo-block based on bad data. Real users blocked. → Apply geographic restrictions narrowly and via human decision.
  • Per-IP limits don't account for NAT. A small office is blocked because it shares one IP. → IP + identity hybrid.
  • No alerting on rate-limit fires. Abuse goes unnoticed. → Per-route monitoring; spike alerts.

Tooling stack (typical)

| Concern | Tool |
|---|---|
| CDN + edge WAF | Cloudflare, Fastly, Akamai, AWS CloudFront + WAF |
| DDoS L3/L4 | Cloud provider native + AWS Shield Advanced |
| Application rate-limit | express-rate-limit (Node), Flask-Limiter, ASP.NET Core RateLimiting, rate-limiter-flexible |
| CAPTCHA | Cloudflare Turnstile, hCaptcha, reCAPTCHA |
| Bot detection | DataDome, PerimeterX, Cloudflare Bot Management |
| Circuit breaker | Polly, resilience4j, opossum |
| Shared store | Redis, DynamoDB |
