🎯 Best Practices for Rate Limiting

Table of Contents

Choosing the Right Algorithm
Setting Appropriate Limits
Implementation Guidelines
Distributed Systems
User Experience
Monitoring & Alerting
Security Considerations
Testing Strategies
Production Checklist

Choosing the Right Algorithm

Decision Framework

┌─────────────────────────────────────┐
│ Need perfect accuracy?              │
│ (e.g., payments, compliance)        │
└────────┬───────────────────┬────────┘
         │ YES               │ NO
         ↓                   ↓
   Sliding Window Log    Token Bucket
                         or Sliding 
                         Window Counter

By Application Type

REST APIs (Public)

Recommended: Sliding Window Counter
Why: Balance of accuracy, performance, memory
Config: 100-1000 req/min per user

WebSocket/Real-time

Recommended: Token Bucket
Why: Handles bursts, fast decisions
Config: Large bucket, high refill rate

Background Jobs

Recommended: Leaky Bucket
Why: Constant processing rate
Config: Match processing capacity

Authentication Endpoints

Recommended: Token Bucket with backoff
Why: Allow retries, prevent brute force
Config: 5-10 attempts per 15 minutes

Payment Processing

Recommended: Sliding Window Log
Why: Perfect accuracy required
Config: Very strict, consider user tier
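
As a rough illustration of the Sliding Window Counter recommended above for public REST APIs, the sketch below estimates the rolling count from two fixed-window counters. The class name `SlidingWindowCounter`, its constructor parameters, and the injectable `now` clock are assumptions made for this example and are not taken from any library.

```javascript
// Sliding window counter: approximates a rolling window by weighting the
// previous fixed window's count by how much of it still overlaps.
class SlidingWindowCounter {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;            // injectable clock (assumption, eases testing)
    this.windows = new Map();  // userId -> { start, current, previous }
  }

  isAllowed(userId) {
    const t = this.now();
    const start = Math.floor(t / this.windowMs) * this.windowMs;
    let w = this.windows.get(userId);

    if (!w) {
      w = { start, current: 0, previous: 0 };
      this.windows.set(userId, w);
    } else if (w.start !== start) {
      // The previous count only matters if the two windows are adjacent
      w.previous = start - w.start === this.windowMs ? w.current : 0;
      w.current = 0;
      w.start = start;
    }

    // Fraction of the current window that has elapsed (0 to 1)
    const elapsed = (t - start) / this.windowMs;
    const estimated = w.previous * (1 - elapsed) + w.current;

    if (estimated >= this.limit) return false;
    w.current += 1;
    return true;
  }
}
```

Memory stays constant per user (three numbers), at the cost of an estimate that assumes requests in the previous window were evenly spread.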

Setting Appropriate Limits

Capacity Planning

Calculate safe limit:
─────────────────────
1. Measure system capacity (requests/sec)
2. Set limit to 70-80% of capacity
3. Add buffer for spikes
4. Test under load

Example Calculation

// System can handle 10,000 req/sec
const systemCapacity = 10000;

// Use 75% for safety
const safeCapacity = systemCapacity * 0.75; // 7,500

// Per-user limit (1000 users)
const expectedUsers = 1000;
const perUserLimit = safeCapacity / expectedUsers; // 7.5 req/sec

// Round down for safety
const finalLimit = Math.floor(perUserLimit); // 7 req/sec per user

Tiered Limits

# Example tier structure
tiers:
  free:
    requests: 100
    period: hour
    burst: 20
  
  pro:
    requests: 1000
    period: hour
    burst: 100
  
  enterprise:
    requests: 10000
    period: hour
    burst: 500
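
One possible way to turn the tier structure above into limiter settings is sketched below. Treating `burst` as a token bucket capacity and `requests / period` as the sustained refill rate is an assumption of this example; the names `TIERS` and `bucketConfigFor` are hypothetical.

```javascript
// Tier table mirroring the YAML above, with the period expressed in seconds
const TIERS = new Map([
  ['free', { requests: 100, periodSec: 3600, burst: 20 }],
  ['pro', { requests: 1000, periodSec: 3600, burst: 100 }],
  ['enterprise', { requests: 10000, periodSec: 3600, burst: 500 }],
]);

// Derive token bucket parameters for a tier.
// Unknown tier names fall back to the most restrictive tier.
function bucketConfigFor(tierName) {
  const tier = TIERS.get(tierName) ?? TIERS.get('free');
  return {
    capacity: tier.burst,                          // max requests in a burst
    refillPerSec: tier.requests / tier.periodSec,  // sustained rate
  };
}
```

Falling back to the free tier for unrecognized names is a conservative choice, so a misconfigured account is never granted more than the lowest limit.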

Time-based Adjustments

// Higher limits during off-peak hours
function getLimit(hour) {
  const peakHours = [9, 10, 11, 12, 13, 14, 15, 16, 17]; // 9am-5pm
  const baseLimit = 100;
  
  return peakHours.includes(hour) 
    ? baseLimit           // Peak: strict
    : baseLimit * 1.5;    // Off-peak: relaxed
}

Implementation Guidelines

1. Start Simple

// ✅ Good - Start with simple implementation
class SimpleRateLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.requests = new Map();
  }
  
  isAllowed(userId) {
    const now = Date.now();
    const userRequests = this.requests.get(userId) || [];
    
    // Remove old requests
    const validRequests = userRequests.filter(
      time => now - time < this.windowMs
    );
    
    if (validRequests.length < this.limit) {
      validRequests.push(now);
      this.requests.set(userId, validRequests);
      return true;
    }
    
    return false;
  }
}

// ❌ Bad - Over-engineering from start
// Don't start with complex distributed system
// unless you actually need it

2. Fail Open vs Fail Closed

// Fail Open (Recommended for non-critical)
try {
  if (rateLimiter.isAllowed(userId)) {
    return processRequest();
  }
  return rejectRequest();
} catch (error) {
  logger.error('Rate limiter error', error);
  return processRequest(); // Allow on error
}

// Fail Closed (For security-critical)
try {
  if (rateLimiter.isAllowed(userId)) {
    return processRequest();
  }
  return rejectRequest();
} catch (error) {
  logger.error('Rate limiter error', error);
  return rejectRequest(); // Reject on error
}
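
The two patterns above differ only in what happens inside `catch`, so they can be folded into one helper with an explicit policy flag. This is a minimal sketch; `checkWithPolicy` and its options are hypothetical names, and it assumes a limiter with a synchronous `isAllowed(userId)` method as in the examples above.

```javascript
// Run the limiter check and apply a failure policy if the limiter itself throws.
// failOpen: true  -> allow the request when the limiter is broken
// failOpen: false -> reject the request when the limiter is broken
function checkWithPolicy(limiter, userId, { failOpen, logger = console }) {
  try {
    return limiter.isAllowed(userId);
  } catch (error) {
    logger.error('Rate limiter error', error);
    return failOpen;
  }
}
```

A caller could then pass `failOpen: false` for login or payment routes and `failOpen: true` elsewhere, which keeps the policy decision visible at the call site.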

3. Atomic Operations

// ✅ Good - Atomic increment
async function checkRateLimit(userId) {
  const key = `rate_limit:${userId}`;
  const current = await redis.incr(key);
  
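  if (current === 1) {
    // First request in the window: start the expiry clock.
    // INCR and EXPIRE are still two separate commands. If the process dies
    // between them the key never expires, so use a Lua script where that matters.
    await redis.expire(key, 60);
  }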
  
  return current <= LIMIT;
}

// ❌ Bad - Race condition
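async function checkRateLimit(userId) {
  const key = `rate_limit:${userId}`;
  const current = Number(await redis.get(key)) || 0; // GET returns a string or null
  if (current < LIMIT) {
    // Race condition: other requests can read the same value between GET and INCR
    await redis.incr(key);
    return true;
  }
  return false;
}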

4. Memory Management

// ✅ Good - Automatic cleanup
class RateLimiter {
  constructor() {
    this.data = new Map();
    
    // Cleanup every minute
    setInterval(() => this.cleanup(), 60000);
  }
  
  cleanup() {
    const now = Date.now();
    for (const [key, value] of this.data.entries()) {
      if (now - value.lastAccess > 3600000) { // 1 hour
        this.data.delete(key);
      }
    }
  }
}

// ❌ Bad - Memory leak
class RateLimiter {
  constructor() {
    this.data = new Map();
    // Never cleaned up!
  }
}

Distributed Systems

Using Redis

// Sliding window with Redis
async function checkRateLimit(userId, limit, windowSec) {
  const key = `rate_limit:${userId}`;
  const now = Date.now();
  const windowStart = now - (windowSec * 1000);
  
  const multi = redis.multi();
  
  // Remove old entries
  multi.zremrangebyscore(key, 0, windowStart);
  
  // Count current requests
  multi.zcard(key);
  
  // Add new request
  multi.zadd(key, now, `${now}-${Math.random()}`);
  
  // Set expiry
  multi.expire(key, windowSec);
  
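  const results = await multi.exec();
  // Assumes ioredis-style replies of [error, result]; index 1 is the ZCARD reply
  const currentCount = results[1][1];
  
  // ZCARD ran before ZADD, so this is the count before the current request.
  // Rejected requests are still added to the set and count toward the window.
  return currentCount < limit;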
}

Handling Clock Skew

// Use Redis time instead of local time
async function checkRateLimit(userId) {
  const redisTime = await redis.time(); // [seconds, microseconds]
  const timestamp = redisTime[0] * 1000 + Math.floor(redisTime[1] / 1000);
  
  // Use timestamp for rate limiting logic
  return processWithTimestamp(timestamp);
}

Distributed Consensus

// Lua script for atomic rate limiting
const luaScript = `
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

local current = redis.call('INCR', key)
if current == 1 then
    redis.call('EXPIRE', key, window)
end

if current > limit then
    return 0
end

return 1
`;

// Use script
const allowed = await redis.eval(
  luaScript,
  1,
  `rate_limit:${userId}`,
  limit,
  windowSec,
  Date.now()
);

User Experience

Informative Error Responses

// ✅ Good - Helpful error message
{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please try again later.",
  "retryAfter": 45,  // seconds
  "limit": 100,
  "remaining": 0,
  "reset": 1609459200  // Unix timestamp
}

// ❌ Bad - Cryptic error
{
  "error": "429"
}
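
A helper along the following lines could build the informative response shown above. It is a sketch only: `buildRateLimitError` is a hypothetical name, and it assumes `reset` is a Unix timestamp in seconds, as in the example payload.

```javascript
// Build a 429 response description from the limiter state.
// resetUnixSec: when the window resets, as a Unix timestamp in seconds
// nowMs: current time in milliseconds (parameterized so it can be tested)
function buildRateLimitError(limit, resetUnixSec, nowMs = Date.now()) {
  const nowSec = Math.floor(nowMs / 1000);
  const retryAfter = Math.max(0, resetUnixSec - nowSec); // never negative

  return {
    status: 429,
    headers: { 'Retry-After': String(retryAfter) },
    body: {
      error: 'Rate limit exceeded',
      message: `Too many requests. Please try again in ${retryAfter} seconds.`,
      retryAfter,
      limit,
      remaining: 0,
      reset: resetUnixSec,
    },
  };
}
```

Keeping `retryAfter` in the body and in the `Retry-After` header means clients that only read one of the two still get the wait time.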

Response Headers

// Include rate limit info in headers
app.use((req, res, next) => {
  const limit = req.rateLimit.limit;
  const remaining = req.rateLimit.remaining;
  const reset = req.rateLimit.reset;
  
  res.setHeader('X-RateLimit-Limit', limit);
  res.setHeader('X-RateLimit-Remaining', remaining);
  res.setHeader('X-RateLimit-Reset', reset);
  
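  if (remaining === 0) {
    // Assumes reset is a Unix timestamp in seconds; Date.now() returns milliseconds
    const retryAfter = Math.max(0, reset - Math.floor(Date.now() / 1000));
    res.setHeader('Retry-After', retryAfter);
  }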
  
  next();
});

Graceful Degradation

// Reduce functionality instead of complete block
if (!rateLimiter.isAllowed(userId)) {
  // Still allow access but with reduced features
  return {
    data: getCachedData(),  // Return cached data
    features: ['read-only'],  // Disable writes
    warning: 'Rate limit reached. Some features disabled.'
  };
}

Monitoring & Alerting

Key Metrics

// Metrics to track
const metrics = {
  // Rate limiting specific
  requests_total: counter,
  requests_allowed: counter,
  requests_rejected: counter,
  rejection_rate: gauge,
  
  // Performance
  rate_limit_check_duration: histogram,
  rate_limit_errors: counter,
  
  // User behavior
  unique_users_rate_limited: counter,
  repeated_violations: counter
};
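
To make the relationship between these metrics concrete, here is a small in-process sketch that derives `rejection_rate` from the allowed and rejected counters and tracks per-user rejections. `RateLimitMetrics` is a hypothetical class for illustration; a production setup would more likely export the counters to a metrics system and compute the rate there.

```javascript
class RateLimitMetrics {
  constructor() {
    this.allowed = 0;
    this.rejected = 0;
    this.rejectionsByUser = new Map(); // userId -> rejection count
  }

  // Record the outcome of one rate limit decision
  record(userId, wasAllowed) {
    if (wasAllowed) {
      this.allowed += 1;
      return;
    }
    this.rejected += 1;
    this.rejectionsByUser.set(userId, (this.rejectionsByUser.get(userId) ?? 0) + 1);
  }

  // rejected / total, or 0 when nothing has been recorded yet
  rejectionRate() {
    const total = this.allowed + this.rejected;
    return total === 0 ? 0 : this.rejected / total;
  }

  // The n users with the most rejections, highest first
  topRejected(n) {
    return [...this.rejectionsByUser.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, n)
      .map(([userId, rejections]) => ({ userId, rejections }));
  }
}
```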

Alert Thresholds

alerts:
  # High rejection rate
  - name: HighRejectionRate
    condition: rejection_rate > 0.3  # 30%
    duration: 5m
    severity: warning
    message: "High rate limit rejection rate"
  
  # Rate limiter errors
  - name: RateLimiterErrors
    condition: rate_limit_errors > 10
    duration: 1m
    severity: critical
    message: "Rate limiter experiencing errors"
  
  # Suspicious activity
  - name: SuspiciousActivity
    condition: repeated_violations > 100
    duration: 5m
    severity: warning
    message: "Possible DDoS or abuse"

Dashboard Example

Rate Limiting Dashboard
──────────────────────────────────────────
Requests/sec:     1,234 ▲ 12%
Allowed:          1,111 (90%)
Rejected:         123 (10%)
Avg Latency:      0.5ms

Top Rate Limited Users:
1. user_123      145 rejections
2. user_456      89 rejections
3. user_789      56 rejections

Algorithm Performance:
Token Bucket:     0.3ms avg
Leaky Bucket:     0.4ms avg
Sliding Window:   0.5ms avg

Security Considerations

1. Don't Rely Solely on IP

// ✅ Good - Multiple identifiers
function getUserIdentifier(req) {
  if (req.user?.id) return `user:${req.user.id}`;
  if (req.apiKey) return `key:${req.apiKey}`;
  return `ip:${req.ip}`;
}

// ❌ Bad - IP only (NATs, proxies cause issues)
function getUserIdentifier(req) {
  return req.ip;
}

2. Protect Rate Limiter Itself

// Rate limiter should be fast and simple
// Don't let rate limiting become bottleneck

// ✅ Good - Fast in-memory check
const allowed = tokenBucket.check(userId);

// ❌ Bad - Slow database lookup in rate limiter
const user = await db.users.find(userId);
const tier = await db.tiers.find(user.tierId);
const allowed = rateLimiter.check(userId, tier.limit);
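
If limits do depend on a user's tier, one way to keep the database off the hot path is to cache the limit in memory and refresh it in the background. The sketch below is an assumption-laden illustration: `TierLimitCache`, `loadLimit`, `defaultLimit`, and `ttlMs` are hypothetical names, and serving a stale or default limit while a refresh is pending is a design choice, not a requirement.

```javascript
class TierLimitCache {
  // loadLimit(userId) may be slow (e.g. a database query) and may return a promise
  constructor(loadLimit, { ttlMs, defaultLimit, now = () => Date.now() }) {
    this.loadLimit = loadLimit;
    this.ttlMs = ttlMs;
    this.defaultLimit = defaultLimit;
    this.now = now;
    this.entries = new Map();   // userId -> { limit, expiresAt }
    this.inFlight = new Set();  // userIds with a refresh already running
  }

  // Synchronous: never waits on the loader
  getLimit(userId) {
    const entry = this.entries.get(userId);
    const fresh = entry !== undefined && entry.expiresAt > this.now();
    if (!fresh) this.refresh(userId);
    // A stale value (or the default) beats blocking the request
    return entry !== undefined ? entry.limit : this.defaultLimit;
  }

  // Fire-and-forget refresh, deduplicated per user
  refresh(userId) {
    if (this.inFlight.has(userId)) return;
    this.inFlight.add(userId);
    Promise.resolve()
      .then(() => this.loadLimit(userId))
      .then((limit) => {
        this.entries.set(userId, { limit, expiresAt: this.now() + this.ttlMs });
      })
      .catch(() => { /* keep serving the cached or default limit; log in real code */ })
      .finally(() => this.inFlight.delete(userId));
  }
}
```

The rate limit check itself then stays a fast in-memory call, for example `rateLimiter.check(userId, cache.getLimit(userId))`.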

3. Handle Distributed Attacks

// Track requests per endpoint
const endpointLimiter = {
  '/api/login': new RateLimiter(5, 60000),    // 5/min
  '/api/register': new RateLimiter(3, 60000), // 3/min
  '/api/reset': new RateLimiter(2, 60000)     // 2/min
};

// Global limit across all endpoints
const globalLimiter = new RateLimiter(100, 60000); // 100/min
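
The snippet above defines the limiters but not how they are combined. A minimal sketch of one way to apply both is shown below; `FixedWindowLimiter` and `createRequestGuard` are hypothetical names, and the fixed-window limiter stands in for whichever `RateLimiter` implementation is actually used.

```javascript
// Minimal fixed-window limiter used only to make this example self-contained
class FixedWindowLimiter {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
    this.counts = new Map(); // id -> { windowStart, count }
  }

  isAllowed(id) {
    const windowStart = Math.floor(this.now() / this.windowMs) * this.windowMs;
    const entry = this.counts.get(id);
    if (!entry || entry.windowStart !== windowStart) {
      if (this.limit < 1) return false;
      this.counts.set(id, { windowStart, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

function createRequestGuard(now = () => Date.now()) {
  const endpointLimiters = new Map([
    ['/api/login', new FixedWindowLimiter(5, 60000, now)],
    ['/api/register', new FixedWindowLimiter(3, 60000, now)],
    ['/api/reset', new FixedWindowLimiter(2, 60000, now)],
  ]);
  const globalLimiter = new FixedWindowLimiter(100, 60000, now);

  // A request must pass its endpoint limit (if any) and the global limit
  return function isRequestAllowed(path, clientId) {
    const endpointLimiter = endpointLimiters.get(path);
    if (endpointLimiter && !endpointLimiter.isAllowed(clientId)) return false;
    return globalLimiter.isAllowed(clientId);
  };
}
```

Checking the endpoint limiter first means requests rejected by a strict endpoint limit do not consume the client's global quota. The reverse also holds: a request that passes the endpoint check but fails the global check has still used an endpoint slot.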

4. Honeypot Endpoints

// Create decoy endpoints to detect bots
app.post('/api/admin-secret', (req, res) => {
  // Log suspicious activity
  logger.warn('Honeypot accessed', {
    ip: req.ip,
    userAgent: req.get('user-agent')
  });
  
  // Immediately block this IP
  blocklist.add(req.ip, 24 * 60 * 60); // 24 hours
  
  // Return fake success
  res.json({ status: 'ok' });
});
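
The handler above calls `blocklist.add(ip, seconds)` without showing the blocklist itself. One minimal in-memory shape it could take is sketched here; `TtlBlocklist` is a hypothetical class, and a multi-instance deployment would more likely keep this state in a shared store.

```javascript
class TtlBlocklist {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map(); // ip -> expiry time in ms
  }

  // Block an IP for ttlSec seconds (re-adding extends or resets the block)
  add(ip, ttlSec) {
    this.entries.set(ip, this.now() + ttlSec * 1000);
  }

  // True while the block is active; expired entries are removed lazily
  has(ip) {
    const expiresAt = this.entries.get(ip);
    if (expiresAt === undefined) return false;
    if (expiresAt <= this.now()) {
      this.entries.delete(ip);
      return false;
    }
    return true;
  }
}
```

Because many users can share one address behind a NAT or proxy, a blocklist keyed on IP alone can also lock out legitimate traffic, which is one reason to keep the block time-limited.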

Testing Strategies

Unit Tests

describe('TokenBucket', () => {
  it('should allow requests under limit', () => {
    const bucket = new TokenBucket(10, 1);
    
    for (let i = 0; i < 10; i++) {
      expect(bucket.allowRequest()).toBe(true);
    }
  });
  
  it('should reject requests over limit', () => {
    const bucket = new TokenBucket(10, 1);
    
    // Exhaust bucket
    for (let i = 0; i < 10; i++) {
      bucket.allowRequest();
    }
    
    expect(bucket.allowRequest()).toBe(false);
  });
  
  it('should refill tokens over time', async () => {
    const bucket = new TokenBucket(10, 10); // 10 tokens/sec
    
    // Exhaust bucket
    for (let i = 0; i < 10; i++) {
      bucket.allowRequest();
    }
    
    // Wait 1 second
    await new Promise((resolve) => setTimeout(resolve, 1000));
    
    // Bucket should have refilled, so the next request succeeds
    expect(bucket.allowRequest()).toBe(true);
  });
});
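
The tests above assume a `TokenBucket` class whose constructor takes a capacity and a refill rate in tokens per second, and which exposes `allowRequest()`. A minimal sketch that would satisfy them is shown below. The third constructor argument, an injectable clock, is an addition of this example so that refill behavior can be tested without real waiting.

```javascript
class TokenBucket {
  constructor(capacity, refillPerSec, now = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity;     // start with a full bucket
    this.lastRefill = now();
  }

  allowRequest() {
    // Lazily refill based on the time elapsed since the last call
    const t = this.now();
    const elapsedSec = (t - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = t;

    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

With a fake clock, the refill test can advance time directly instead of sleeping:

```javascript
let fakeNow = 0;
const bucket = new TokenBucket(10, 10, () => fakeNow);
for (let i = 0; i < 10; i++) bucket.allowRequest(); // exhaust the bucket
fakeNow += 1000;                                    // "wait" one second
bucket.allowRequest();                              // succeeds after the refill
```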

Load Testing

// Use k6 or similar tool
import http from 'k6/http';
import { check, sleep } from 'k6';

export let options = {
  stages: [
    { duration: '1m', target: 100 },   // Ramp up
    { duration: '3m', target: 100 },   // Stay
    { duration: '1m', target: 0 },     // Ramp down
  ],
};

export default function() {
  const res = http.get('http://api.example.com/endpoint');
  
  check(res, {
    'status is 200 or 429': (r) => [200, 429].includes(r.status),
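    // Match case-insensitively: k6 may report header names in a different case than the server sent
    'has rate limit headers': (r) =>
      Object.keys(r.headers).some((name) => name.toLowerCase() === 'x-ratelimit-limit'),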
  });
  
  sleep(1);
}

Chaos Testing

// Simulate failures
describe('Rate Limiter Resilience', () => {
  it('should handle Redis failures gracefully', async () => {
    // Stop Redis
    await redis.disconnect();
    
    // Should fail open (allow requests)
    const allowed = await rateLimiter.check('user123');
    expect(allowed).toBe(true);
    
    // Should log error
    expect(logger.error).toHaveBeenCalled();
  });
});

Production Checklist

Before Deployment

Load tested under expected traffic
Load tested under 2x expected traffic
Tested failure scenarios
Monitoring and alerts configured
Documentation updated
Rollback plan prepared
Gradual rollout plan (canary/blue-green)

After Deployment

Next: Implementation Guide
