chore(backend): Cache invalid opaque OAuth tokens locally #8519
wobsoriano wants to merge 17 commits
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough
This pull request adds in-memory negative caching for OAuth tokens to avoid redundant verification of previously invalidated tokens. The implementation includes a new negative-cache module.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/backend/src/tokens/machineTokenRateLimiter.ts`:
- Around line 9-11: The current logic in machineTokenRateLimiter.ts clears the
entire buckets map when buckets.size >= MAX_BUCKETS (the buckets.clear() in the
if block), which allows an attacker to neutralize rate limits by causing key
churn; instead implement bounded eviction: when buckets.size >= MAX_BUCKETS,
evict only the least-recently-used or expired bucket(s) rather than clearing
all. Replace the global clear with an LRU/TTL eviction strategy (e.g., use an
ordered Map and delete the oldest entry(s) keyed by access time, or integrate an
LRU cache library) and update the code paths that read/write buckets so accesses
update recency metadata; keep the MAX_BUCKETS check but perform targeted
deletion of oldest/expired keys instead of buckets.clear().
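The targeted-eviction idea in the comment above can be sketched with a plain `Map`, which iterates in insertion order. This is an illustration only: the small cap, the `Bucket` shape, and the `touch` helper are assumptions, not the PR's code.

```typescript
type Bucket = { tokens: number; lastRefill: number };

// Small cap for illustration; the real limiter would use a much larger value.
const MAX_BUCKETS = 3;
const buckets = new Map<string, Bucket>();

// Map iterates in insertion order, so deleting and re-inserting a key on
// every access keeps the least-recently-used entry first in iteration order.
function touch(ip: string): Bucket {
  const existing = buckets.get(ip);
  if (existing !== undefined) {
    buckets.delete(ip);
    buckets.set(ip, existing);
    return existing;
  }
  if (buckets.size >= MAX_BUCKETS) {
    // Targeted eviction of the LRU entry instead of buckets.clear().
    const oldest = buckets.keys().next().value;
    if (oldest !== undefined) buckets.delete(oldest);
  }
  const fresh: Bucket = { tokens: 10, lastRefill: Date.now() };
  buckets.set(ip, fresh);
  return fresh;
}
```

With this shape, a burst of new keys evicts only one old bucket per insertion, so an attacker churning IPs cannot wipe the limiter state for everyone at once.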
In `@packages/backend/src/tokens/request.ts`:
- Around line 31-45: extractCallerIp currently trusts cf-connecting-ip /
x-real-ip / x-forwarded-for directly; change it to only use those headers when
the request is proven to come through a trusted proxy (e.g. implement and call
an isFromTrustedProxy(request) or check a trustedProxies config/allowlist) and
otherwise fall back to a non-spoofable source (runtime-provided peer IP / socket
remote address or request.conn/peer info). Update extractCallerIp and any other
places referenced (the same logic around lines 821-828 and 857-864) to gate
header parsing behind that trusted-proxy check and document/centralize the
trusted-proxy config so rate-limiter identity cannot be derived from
unauthenticated headers.
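The trusted-proxy gating described above could look roughly like the following sketch. The `CallerInfo` shape, the `trustedProxies` allowlist, and the `peerAddress` field are hypothetical; only the header names come from the comment.

```typescript
interface CallerInfo {
  headers: Map<string, string>;
  peerAddress: string | undefined; // non-spoofable socket/runtime peer IP
}

// Hypothetical allowlist; in practice this would come from configuration.
const trustedProxies = new Set(['10.0.0.1']);

function extractCallerIp(req: CallerInfo): string | undefined {
  // Only trust forwarding headers when the direct peer is a known proxy.
  const fromTrustedProxy =
    req.peerAddress !== undefined && trustedProxies.has(req.peerAddress);
  if (fromTrustedProxy) {
    const forwarded =
      req.headers.get('cf-connecting-ip') ??
      req.headers.get('x-real-ip') ??
      req.headers.get('x-forwarded-for')?.split(',')[0]?.trim();
    if (forwarded) return forwarded;
  }
  // Fall back to the non-spoofable peer address.
  return req.peerAddress;
}
```

The key property is that a request arriving directly (peer not in the allowlist) can set any header it likes and still be keyed by its real peer address.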
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 7a19f6b4-647d-4928-84ce-f9434adddd62
📒 Files selected for processing (8)
integration/testUtils/machineAuthHelpers.ts
integration/tests/next-machine.test.ts
packages/backend/src/constants.ts
packages/backend/src/tokens/__tests__/machineTokenRateLimiter.test.ts
packages/backend/src/tokens/__tests__/request.test.ts
packages/backend/src/tokens/authStatus.ts
packages/backend/src/tokens/machineTokenRateLimiter.ts
packages/backend/src/tokens/request.ts
@clerk/astro
@clerk/backend
@clerk/chrome-extension
@clerk/clerk-js
@clerk/dev-cli
@clerk/expo
@clerk/expo-passkeys
@clerk/express
@clerk/fastify
@clerk/hono
@clerk/localizations
@clerk/nextjs
@clerk/nuxt
@clerk/react
@clerk/react-router
@clerk/shared
@clerk/tanstack-react-start
@clerk/testing
@clerk/ui
@clerk/upgrade
@clerk/vue
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/backend/src/tokens/machineTokenRateLimiter.ts`:
- Around line 9-19: The eviction runs before checking whether the incoming ip
already has a bucket, causing eviction even for requests from existing IPs;
change the logic in the token bucket handling so you first fetch existing =
buckets.get(ip) and only perform the MAX_BUCKETS eviction when existing is
undefined (i.e., about to insert a new bucket). Specifically: get existing
(buckets.get(ip)) before the eviction check, and if existing is falsy and
buckets.size >= MAX_BUCKETS then evict the oldest key; then create bucket =
existing ?? { tokens: MAX_BURST, lastRefill: now } as before. Reference:
buckets, MAX_BUCKETS, ip, existing, Bucket, tokens, lastRefill.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 8d1b6297-4946-45bc-ad4c-c68f1b83111d
📒 Files selected for processing (3)
packages/backend/src/tokens/__tests__/machineTokenRateLimiter.test.ts
packages/backend/src/tokens/machineTokenRateLimiter.ts
packages/backend/src/tokens/request.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/backend/src/tokens/__tests__/machineTokenRateLimiter.test.ts
- packages/backend/src/tokens/request.ts
if (buckets.size >= MAX_BUCKETS) {
  // Evict the oldest entry rather than clearing all buckets to prevent an attacker
  // from neutralizing rate limits by forcing key churn across many distinct IPs.
  const oldest = buckets.keys().next().value;
  if (oldest !== undefined) {
    buckets.delete(oldest);
  }
}
const now = Date.now();
const existing = buckets.get(ip);
const bucket: Bucket = existing ?? { tokens: MAX_BURST, lastRefill: now };
Eviction is triggered for existing IPs at capacity, which weakens throttling.
At Line 9, eviction runs before checking whether ip already has a bucket. When buckets.size === MAX_BUCKETS, even requests from existing IPs evict other clients’ buckets, causing unintended rate-limit resets and reducing protection under load.
Suggested fix
export function checkMachineTokenRateLimit(ip: string): boolean {
- if (buckets.size >= MAX_BUCKETS) {
+ const existing = buckets.get(ip);
+ if (existing === undefined && buckets.size >= MAX_BUCKETS) {
// Evict the oldest entry rather than clearing all buckets to prevent an attacker
// from neutralizing rate limits by forcing key churn across many distinct IPs.
const oldest = buckets.keys().next().value;
if (oldest !== undefined) {
buckets.delete(oldest);
}
}
const now = Date.now();
- const existing = buckets.get(ip);
  const bucket: Bucket = existing ?? { tokens: MAX_BURST, lastRefill: now };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/backend/src/tokens/machineTokenRateLimiter.ts` around lines 9 - 19,
The eviction runs before checking whether the incoming ip already has a bucket,
causing eviction even for requests from existing IPs; change the logic in the
token bucket handling so you first fetch existing = buckets.get(ip) and only
perform the MAX_BUCKETS eviction when existing is undefined (i.e., about to
insert a new bucket). Specifically: get existing (buckets.get(ip)) before the
eviction check, and if existing is falsy and buckets.size >= MAX_BUCKETS then
evict the oldest key; then create bucket = existing ?? { tokens: MAX_BURST,
lastRefill: now } as before. Reference: buckets, MAX_BUCKETS, ip, existing,
Bucket, tokens, lastRefill.
♻️ Duplicate comments (2)
packages/backend/src/tokens/oauthTokenRateLimiter.ts (1)
9-19: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Eviction runs even for existing IPs at capacity
When buckets.size hits MAX_BUCKETS, Line 9 evicts an entry before checking whether the current ip already has a bucket. That lets traffic from existing IPs evict other tenants' buckets and weakens throttling behavior under load.
Suggested minimal fix
export function checkOAuthTokenRateLimit(ip: string): boolean {
-  if (buckets.size >= MAX_BUCKETS) {
+  const existing = buckets.get(ip);
+  if (existing === undefined && buckets.size >= MAX_BUCKETS) {
     // Evict the oldest entry rather than clearing all buckets to prevent an attacker
     // from neutralizing rate limits by forcing key churn across many distinct IPs.
     const oldest = buckets.keys().next().value;
     if (oldest !== undefined) {
       buckets.delete(oldest);
     }
   }
   const now = Date.now();
-  const existing = buckets.get(ip);
   const bucket: Bucket = existing ?? { tokens: MAX_BURST, lastRefill: now };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/backend/src/tokens/oauthTokenRateLimiter.ts` around lines 9 - 19, The eviction currently runs unconditionally when buckets.size >= MAX_BUCKETS which allows an existing ip to evict others; change the logic in oauthTokenRateLimiter so you first check for an existing bucket via buckets.get(ip) (reference: buckets, existing, ip, Bucket), and only if existing is undefined AND buckets.size >= MAX_BUCKETS perform the eviction (use the existing oldest = buckets.keys().next().value and buckets.delete(oldest)). Then proceed to create or reuse the bucket as before.
packages/backend/src/tokens/request.ts (1)
31-48: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift
Rate-limit identity is taken from spoofable client headers
The limiter key is built from cf-connecting-ip / x-real-ip / x-forwarded-for without a trusted-proxy gate. If requests can reach this code directly (or through untrusted hops), attackers can rotate or forge header values to bypass per-IP limits or target another bucket.
Please gate header-based extraction behind an explicit trusted-proxy policy and otherwise use a non-spoofable peer IP source provided by the runtime/network edge.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/backend/src/tokens/request.ts` around lines 31 - 48, The current extractCallerIp function trusts client-controlled headers (cf-connecting-ip / x-real-ip / x-forwarded-for) unconditionally; change it to only trust those headers when the request comes through a configured trusted proxy (e.g., add a trusted-proxy check like isRequestFromTrustedProxy(request) or an options.allowProxyHeaders flag), and otherwise derive the IP from a non-spoofable peer source (e.g., request.socket?.remoteAddress or the runtime-provided peer IP). Update extractCallerIp (or its callers) to accept the trusted-proxy context/config and gate the header reading behind that check so header values are never used unless the proxy is known/trusted.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@packages/backend/src/tokens/oauthTokenRateLimiter.ts`:
- Around line 9-19: The eviction currently runs unconditionally when
buckets.size >= MAX_BUCKETS which allows an existing ip to evict others; change
the logic in oauthTokenRateLimiter so you first check for an existing bucket via
buckets.get(ip) (reference: buckets, existing, ip, Bucket), and only if existing
is undefined AND buckets.size >= MAX_BUCKETS perform the eviction (use the
existing oldest = buckets.keys().next().value and buckets.delete(oldest)). Then
proceed to create or reuse the bucket as before.
In `@packages/backend/src/tokens/request.ts`:
- Around line 31-48: The current extractCallerIp function trusts
client-controlled headers (cf-connecting-ip / x-real-ip / x-forwarded-for)
unconditionally; change it to only trust those headers when the request comes
through a configured trusted proxy (e.g., add a trusted-proxy check like
isRequestFromTrustedProxy(request) or an options.allowProxyHeaders flag), and
otherwise derive the IP from a non-spoofable peer source (e.g.,
request.socket?.remoteAddress or the runtime-provided peer IP). Update
extractCallerIp (or its callers) to accept the trusted-proxy context/config and
gate the header reading behind that check so header values are never used unless
the proxy is known/trusted.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: c23325b7-a06b-4701-8474-9c52f3f23b11
📒 Files selected for processing (6)
packages/backend/src/tokens/__tests__/oauthTokenRateLimiter.test.ts
packages/backend/src/tokens/__tests__/request.test.ts
packages/backend/src/tokens/authStatus.ts
packages/backend/src/tokens/machine.ts
packages/backend/src/tokens/oauthTokenRateLimiter.ts
packages/backend/src/tokens/request.ts
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/backend/src/tokens/request.ts`:
- Around line 813-816: The code currently calls maybeCacheOAuthTokenAsInvalid
for all machine-token prefixes; change it so negative caching runs only for
OAuth machine tokens by checking the tokenType (from verifyMachineAuthToken) or
the tokenInHeader prefix before calling maybeCacheOAuthTokenAsInvalid.
Specifically, in the error paths around verifyMachineAuthToken and the similar
block at 845-848, wrap the maybeCacheOAuthTokenAsInvalid(token, ...) call in a
guard that confirms tokenType === 'oauth' or tokenInHeader startsWith('oat_')
(use the actual tokenType value your verifier returns) and only then call
maybeCacheOAuthTokenAsInvalid, leaving handleMachineError unchanged.
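The prefix guard this prompt describes can be sketched as follows. Only the `oat_` prefix and the helper name come from the comment; the helper's signature and the `handleVerificationFailure` wrapper are assumptions for illustration.

```typescript
// Hypothetical stand-in for the PR's negative cache.
const invalidOAuthTokens = new Set<string>();

function maybeCacheOAuthTokenAsInvalid(token: string): void {
  invalidOAuthTokens.add(token);
}

function handleVerificationFailure(token: string): void {
  // Only opaque OAuth access tokens (oat_*) hit the BAPI origin, so only
  // those benefit from negative caching; ak_/mt_ tokens are left alone.
  if (token.startsWith('oat_')) {
    maybeCacheOAuthTokenAsInvalid(token);
  }
}
```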
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: ff448b2f-6c31-4ffa-8693-84bf23f22b4e
📒 Files selected for processing (5)
integration/testUtils/machineAuthHelpers.ts
packages/backend/src/tokens/__tests__/oauthNegativeCache.test.ts
packages/backend/src/tokens/__tests__/request.test.ts
packages/backend/src/tokens/oauthNegativeCache.ts
packages/backend/src/tokens/request.ts
✅ Files skipped from review due to trivial changes (1)
- packages/backend/src/tokens/__tests__/oauthNegativeCache.test.ts
Description
An attacker can send requests with random oat_... bearer tokens to any endpoint protected by clerkMiddleware. Each request triggers a call to the BAPI OAuth token verification endpoint, which counts against the customer's global Cloudflare WAF rate-limit cap. Enough parallel requests can exhaust this cap, causing legitimate BAPI calls to be throttled.
Why only oat_ tokens and not ak_ or mt_?
API key and M2M token verification are served entirely at the Cloudflare Edge and never reach the BAPI origin. They are fully exempt from BAPI rate-limit caps. OAuth token verification does reach the BAPI origin, so each call consumes quota from the customer's shared cap.
The fix: in-process negative cache
When BAPI returns a definitive "token not found" response for an oat_ token, we cache that token string as invalid for a short window. On the next request with the same token, we reject it locally without calling BAPI. The cache is bounded to prevent unbounded memory growth, with oldest-first eviction.
Only definitive "token not found" errors are cached. Transient errors (invalid secret key, unexpected errors) are not cached, so a customer who fixes a misconfiguration recovers immediately.
Checklist
- pnpm test runs as expected.
- pnpm build runs as expected.
Type of change