Harden backend security, add global rate limiting and deployment runbook (#55)
🔴 Missing return after sending 429 response in login rate-limit check causes request to fall through
When a login request is blocked by the new in-memory rate limiter, sendJson is called at lines 294-302 to send a 429 response, but there is no return statement afterward. Execution therefore continues past the rate-limit guard into the credential-checking logic below, potentially sending a second response on the same res object (causing a "headers already sent" error) or allowing a blocked attacker to still attempt login.
Root Cause and Impact
At backend/server.js:292-302, the new rate-limit block checks rateLimitState.blockedUntil > now and calls sendJson(res, 429, ...) but never returns:
```js
if (rateLimitState.blockedUntil > now) {
  const retryAfterSeconds = Math.ceil((rateLimitState.blockedUntil - now) / 1000);
  sendJson(res, 429, { ... }, { 'Retry-After': ... });
  // ← missing return here!
}
```

After sending the 429, the handler falls through to database.getUserByCredentials(username, password) at line 312, which defeats the purpose of the brute-force rate limiter entirely. If the credentials happen to be correct, a successful login response would also be attempted on the already-written response, causing a Node.js runtime error.
Impact: The login brute-force protection is completely bypassed — blocked users can still authenticate.
Suggested change:

```diff
   { 'Retry-After': String(Math.max(retryAfterSeconds, 1)) }
 );
+return;
```
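The corrected guard can be sketched end to end. This is a minimal, self-contained sketch rather than the actual backend/server.js code: getRateLimitState and sendJson are simplified stand-ins for the PR's helpers, and the guard is factored into a hypothetical checkLoginRateLimit helper so the early exit is explicit.

```javascript
// Sketch only: helper names mirror the PR, but the bodies are simplified stand-ins.
const loginAttempts = new Map();

const getRateLimitState = (key) => {
  if (!loginAttempts.has(key)) {
    loginAttempts.set(key, { count: 0, blockedUntil: 0 });
  }
  return loginAttempts.get(key);
};

// Simplified sendJson: records status/headers/body on a plain mock `res` object.
const sendJson = (res, statusCode, body, extraHeaders = {}) => {
  res.statusCode = statusCode;
  res.headers = { 'Content-Type': 'application/json', ...extraHeaders };
  res.body = JSON.stringify(body);
};

// Hypothetical guard helper: returns true when it has already answered,
// so the caller can bail out instead of falling through to credential checks.
const checkLoginRateLimit = (res, loginKey) => {
  const now = Date.now();
  const rateLimitState = getRateLimitState(loginKey);
  if (rateLimitState.blockedUntil > now) {
    const retryAfterSeconds = Math.ceil((rateLimitState.blockedUntil - now) / 1000);
    sendJson(
      res,
      429,
      { error: 'Too many login attempts' },
      { 'Retry-After': String(Math.max(retryAfterSeconds, 1)) }
    );
    return true; // response sent; caller must return immediately
  }
  return false;
};
```

In the handler this becomes `if (checkLoginRateLimit(res, loginKey)) return;`, which makes the early exit hard to forget.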
```js
const recordFailedLoginAttempt = (key) => {
  const now = Date.now();
  const state = getRateLimitState(key);
  state.count += 1;

  if (state.count >= LOGIN_RATE_LIMIT_MAX_ATTEMPTS) {
    state.blockedUntil = now + LOGIN_RATE_LIMIT_BLOCK_MS;
  }
};
```
🔴 New in-memory login rate-limit state is checked but never populated — brute-force protection is ineffective
The PR adds recordFailedLoginAttempt (line 160) and clearRateLimitState (line 78) to manage the new in-memory loginAttempts Map, and the login handler checks this Map via getRateLimitState (line 290). However, the login handler still calls the old cache-based rateLimiter.recordFailure (line 315) and rateLimiter.clear (line 324) instead of the new functions.
Root Cause and Impact
The new in-memory rate-limit check at backend/server.js:290-302 reads from the loginAttempts Map:

```js
const rateLimitState = getRateLimitState(loginKey);
if (rateLimitState.blockedUntil > now) { ... }
```

But on failed login, the code records the failure in the old cache-based system instead:

```js
await rateLimiter.recordFailure(loginKey, { ... }); // line 315
```

And on success, it clears the old system:

```js
await rateLimiter.clear(loginKey); // line 324
```

The new recordFailedLoginAttempt function (line 160) that would populate loginAttempts is never called. As a result, rateLimitState.blockedUntil will always be 0 (the initial value from getRateLimitState), and the in-memory rate-limit check will never trigger.
Meanwhile, the old cache-based rateLimiter.getBlockedSeconds check (line 303) still exists due to the bad merge, creating a confusing dual system where neither works correctly together.
Impact: The new in-memory brute-force rate limiting is completely non-functional. Attackers can make unlimited login attempts without being blocked by the in-memory limiter.
Prompt for agents
In backend/server.js, the login handler needs to be updated to use the new in-memory rate limiting functions instead of the old cache-based ones. Specifically:
1. At line 315, replace `await rateLimiter.recordFailure(loginKey, { ... })` with `recordFailedLoginAttempt(loginKey)`
2. At line 324, replace `await rateLimiter.clear(loginKey)` with `clearRateLimitState(loginKey)`
3. Remove the old duplicate rate-limit check at lines 303-310 that uses `rateLimiter.getBlockedSeconds`
4. Add a `return` statement after the sendJson call at line 302 (the new rate-limit 429 response)
This will make the in-memory rate limiting actually functional and remove the dead code from the bad merge.
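Under the assumptions in the diff above (the Map name and the two limit constants come from the PR; the constant values here are placeholders), the wiring the prompt describes looks roughly like this:

```javascript
// Placeholder values; the PR defines the real constants elsewhere.
const LOGIN_RATE_LIMIT_MAX_ATTEMPTS = 5;
const LOGIN_RATE_LIMIT_BLOCK_MS = 15 * 60 * 1000;

const loginAttempts = new Map();

const getRateLimitState = (key) => {
  if (!loginAttempts.has(key)) {
    loginAttempts.set(key, { count: 0, blockedUntil: 0 });
  }
  return loginAttempts.get(key);
};

// Populates the same Map the login guard reads from.
const recordFailedLoginAttempt = (key) => {
  const state = getRateLimitState(key);
  state.count += 1;
  if (state.count >= LOGIN_RATE_LIMIT_MAX_ATTEMPTS) {
    state.blockedUntil = Date.now() + LOGIN_RATE_LIMIT_BLOCK_MS;
  }
};

const clearRateLimitState = (key) => {
  loginAttempts.delete(key);
};

// In the login handler, the new limiter replaces the cache-based calls:
//   bad credentials  -> recordFailedLoginAttempt(loginKey);
//   successful login -> clearRateLimitState(loginKey);
```

Because the check and the recorder now share one store, the fifth failed attempt sets blockedUntil and the guard starts returning 429s for that key.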
```js
const sendJson = (res, statusCode, body, extraHeaders = {}) => {
const sendJson = (res, statusCode, body) => {
```
🔴 Duplicate sendJson function declaration — new signature with extraHeaders is shadowed by old declaration
At lines 170-171, there are two consecutive sendJson declarations. The new version accepts extraHeaders (line 170) but is immediately followed by the old version without that parameter (line 171). If the merge were resolved by keeping the old declaration, the extraHeaders parameter would be lost.
Root Cause and Impact
The merge produced:

```js
const sendJson = (res, statusCode, body, extraHeaders = {}) => {
const sendJson = (res, statusCode, body) => {
```

The function body uses ...extraHeaders at line 189, but if the second declaration takes effect, extraHeaders would be undefined, and ...undefined in an object literal is a no-op. This means the Retry-After headers passed as extraHeaders from the global rate limiter (line 248) and login rate limiter (line 301) would be silently dropped.
Note: This is also a syntax error that prevents the file from running, but the logical consequence is that even if one declaration were removed, the wrong one being kept would break the extraHeaders feature.
Impact: Rate-limit Retry-After headers would not be sent to clients if the old declaration is the one that survives.
Suggested change:

```diff
-const sendJson = (res, statusCode, body, extraHeaders = {}) => {
-const sendJson = (res, statusCode, body) => {
+const sendJson = (res, statusCode, body, extraHeaders = {}) => {
```
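Keeping only the new declaration, a minimal version of the surviving helper looks like this. This is a sketch, not the PR's full implementation: only two of the Helmet-style security headers are shown.

```javascript
// Minimal sketch of the surviving sendJson; the real PR emits the full
// Helmet-style header set (CSP, HSTS, X-Frame-Options, etc.).
const sendJson = (res, statusCode, body, extraHeaders = {}) => {
  res.writeHead(statusCode, {
    'Content-Type': 'application/json',
    'X-Content-Type-Options': 'nosniff',
    ...extraHeaders, // Retry-After and friends survive the spread
  });
  res.end(JSON.stringify(body));
};
```

With a single declaration, the spread of extraHeaders works as intended and rate-limit responses carry their Retry-After header.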
```js
GLOBAL_RATE_LIMIT_WINDOW_MS: z.string().regex(/^\d+$/).optional(),
SECURITY_HEADERS_CSP: z.string().optional(),
DATABASE_URL: z.string().optional(),
REDIS_URL: z.string().optional(),
```
🟡 Duplicate REDIS_URL key in Zod schema — stricter .url() validation overrides the lenient one
The env schema defines REDIS_URL twice: once at line 62 as z.string().optional() (new, lenient) and again at line 73 as z.string().url().optional() (old, strict). In a JavaScript object literal, the second property with the same key wins.
Root Cause and Impact
In backend/env.js, the Zod schema object has:

```js
REDIS_URL: z.string().optional(),        // line 62 (new)
// ... other fields ...
REDIS_URL: z.string().url().optional(),  // line 73 (old)
```

The second definition overrides the first. The new code at line 62 intentionally relaxed the validation to z.string().optional() (allowing non-URL strings like Redis connection strings that may not be valid URLs), but the old stricter z.string().url().optional() at line 73 takes precedence. This means REDIS_URL values like redis://user:pass@host:6379 that aren't strictly valid URLs per Zod's .url() check could fail validation unexpectedly.
Impact: Environment validation may reject valid Redis connection strings that don't pass strict URL validation.
Prompt for agents
In backend/env.js, remove the duplicate REDIS_URL key. The schema at line 73 (`REDIS_URL: z.string().url().optional()`) is the old pre-existing definition. The new one at line 62 (`REDIS_URL: z.string().optional()`) was added by this PR. Decide which validation is appropriate and keep only one. If Redis connection strings like `redis://...` are always valid URLs, keep the `.url()` version. Otherwise, keep the lenient `z.string().optional()` version. Remove the other.
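The underlying language behavior is easy to demonstrate without Zod: when an object literal repeats a key, the later value silently wins. The strings below are stand-ins for the two validators; DATABASE_URL is included only to show that other keys are untouched.

```javascript
// Duplicate keys are legal in a JS object literal (even in strict mode
// since ES2015); the later one simply overwrites the earlier one.
const schema = {
  REDIS_URL: 'lenient: z.string().optional()',        // line 62 (new)
  DATABASE_URL: 'z.string().optional()',
  REDIS_URL: 'strict: z.string().url().optional()',   // line 73 (old) -- this one wins
};
```

This is why the file still parses and runs: there is no error at runtime, just a silently discarded validator, which makes the bug easy to miss without a linter rule like no-dupe-keys.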
```js
const getGlobalRateLimitState = (key) => {
  const now = Date.now();
  const existing = globalRequests.get(key);

  if (!existing || now - existing.windowStart > GLOBAL_RATE_LIMIT_WINDOW_MS) {
    const state = { count: 0, windowStart: now };
    globalRequests.set(key, state);
    return state;
  }

  return existing;
};
```
🔴 globalRequests Map grows unboundedly — no cleanup of stale IP entries causes memory leak
The globalRequests Map at line 23 stores rate-limit state per IP address but never removes entries for IPs that stop making requests. Over time, this Map will grow without bound.
Root Cause and Impact
In backend/server.js:54-65, getGlobalRateLimitState creates a new entry in globalRequests for every unique IP. When a window expires, the entry is replaced (lines 59-61), but entries for IPs that never return are never deleted:

```js
const getGlobalRateLimitState = (key) => {
  const now = Date.now();
  const existing = globalRequests.get(key);
  if (!existing || now - existing.windowStart > GLOBAL_RATE_LIMIT_WINDOW_MS) {
    const state = { count: 0, windowStart: now };
    globalRequests.set(key, state); // entry persists forever
    return state;
  }
  return existing;
};
```

Similarly, loginAttempts (line 22) has the same problem: clearRateLimitState exists but is never called.
In a production environment exposed to the internet, unique IPs from bots, scanners, and legitimate users will accumulate indefinitely, causing gradual memory growth that could eventually exhaust available memory.
Impact: Slow memory leak in production that could lead to OOM crashes over extended uptime periods, especially under diverse traffic patterns.
Prompt for agents
In backend/server.js, add periodic cleanup for both the `globalRequests` Map (line 23) and the `loginAttempts` Map (line 22). One approach is to add a setInterval that runs every GLOBAL_RATE_LIMIT_WINDOW_MS (e.g., every 15 minutes) and deletes entries whose window has expired:
```js
setInterval(() => {
  const now = Date.now();
  for (const [key, state] of globalRequests) {
    if (now - state.windowStart > GLOBAL_RATE_LIMIT_WINDOW_MS) {
      globalRequests.delete(key);
    }
  }
  for (const [key, state] of loginAttempts) {
    if (state.blockedUntil > 0 && state.blockedUntil <= now) {
      loginAttempts.delete(key);
    } else if (now - state.windowStart > LOGIN_RATE_LIMIT_WINDOW_MS) {
      loginAttempts.delete(key);
    }
  }
}, GLOBAL_RATE_LIMIT_WINDOW_MS);
```

Place this after the Map declarations around line 23.
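One refinement worth considering on top of that snippet: factor the sweep into a named function (so it can be unit-tested) and call `.unref()` on the interval so the cleanup timer does not keep the Node.js process alive during shutdown. The window constant below is a placeholder, and only the globalRequests half of the sweep is shown.

```javascript
const GLOBAL_RATE_LIMIT_WINDOW_MS = 15 * 60 * 1000; // placeholder value
const globalRequests = new Map();

// Deletes entries whose rate-limit window has fully expired.
// Deleting during for...of iteration over a Map is safe in JavaScript.
const sweepGlobalRequests = (now = Date.now()) => {
  for (const [key, state] of globalRequests) {
    if (now - state.windowStart > GLOBAL_RATE_LIMIT_WINDOW_MS) {
      globalRequests.delete(key);
    }
  }
};

const sweepTimer = setInterval(sweepGlobalRequests, GLOBAL_RATE_LIMIT_WINDOW_MS);
sweepTimer.unref(); // don't let the cleanup timer block process exit
```

Taking `now` as a parameter with a Date.now() default keeps production behavior unchanged while letting tests inject a fixed clock.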
Motivation
Description
- `backend/server.js`: centralized `sendJson` now emits many Helmet-style headers (CSP, HSTS, X-Frame-Options, no-sniff, cross-origin protections) and accepts `SECURITY_HEADERS_CSP` from env.
- `backend/server.js`: added a global per-IP rate limiter (`GLOBAL_RATE_LIMIT_MAX_REQUESTS`, `GLOBAL_RATE_LIMIT_WINDOW_MS`) that returns `429` with `Retry-After`, preserved the existing login `IP+username` brute-force limiter, and added `Retry-After` there as well.
- Added a `getRequestIp` helper, improved `getLoginKey`/rate-state handling, and extended `backend/env.js` to validate new env vars for global rate limiting, CSP, and deployment integrations (`DATABASE_URL`, `REDIS_URL`, Upstash, `STORAGE_DRIVER`, AWS/Cloudinary keys).
- Updated the `backend/README.md` Security Layer section and added `backend/deployment.md` with Render/Railway/AWS EC2 guidance and recommended env settings for Supabase/Neon, Upstash, and S3/Cloudinary.

Testing

- Ran `npm run build`, which completed successfully.
- Ran `node backend/server.js`, which failed in this environment due to a missing runtime dependency (`ERR_MODULE_NOT_FOUND: Cannot find package 'dotenv'`), so a full runtime smoke test could not be completed.