feat(auth): support Claude Code subscription tokens via secret.value prefix detection #2
Merged
…iption token
Adds a first-class subscription-auth option for executor/judge sandbox roles:
"agents": {
"executor": { "command": "claude", "useOAuth": true },
"judge": { "command": "claude", "useOAuth": true }
}
Why this exists: per-token API billing for a full A/B sweep against a
real SDK costs ~$135-$270 (Opus 4.7); we just stopped a run partway in
at ~$30 sunk. Claude Code on a Pro/Max/Team/Enterprise plan can
authenticate via a long-lived subscription token instead — flat-rate
billing tied to the plan.
How it works: the framework's existing `secret` path uses microsandbox
TLS-injection — the cleartext value never enters the VM, only a
placeholder substituted on the wire for the allowed host. That model is
fundamentally incompatible with `CLAUDE_CODE_OAUTH_TOKEN` because Claude
reads the token directly from `process.env`. So `useOAuth: true`:
- Resolves CLAUDE_CODE_OAUTH_TOKEN from the host environment
  (fail-fast with a setup-token hint if unset)
- Injects it into the sandbox as a plain env var via `sandbox.env`
- Skips `buildAgentSecret` for that role (no API key in env at all,
  so Claude's auth precedence falls through cleanly to OAuth)
- Sets ANTHROPIC_BASE_URL from the adapter default
- For judge: contributes the adapter default hostname to the network
  lockdown allowlist
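The first two steps and the base-URL step above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `AdapterSketch`, `HostEnv`, `resolveOAuthTokenSketch`, and `buildOAuthSandboxEnv` are hypothetical names, and the example values for the adapter fields are assumptions.

```typescript
// Hypothetical shapes for illustration; the real adapter type is not shown in this thread.
interface AdapterSketch {
  baseUrlEnvVar: string; // assumed to be "ANTHROPIC_BASE_URL" for claude
  defaultBaseUrl: string; // assumed to be "https://api.anthropic.com"
}

type HostEnv = Record<string, string | undefined>;

// Fail fast when the host has no subscription token exported.
function resolveOAuthTokenSketch(hostEnv: HostEnv): string {
  const token = hostEnv["CLAUDE_CODE_OAUTH_TOKEN"];
  if (!token) {
    throw new Error(
      "CLAUDE_CODE_OAUTH_TOKEN is not set; run `claude setup-token` and export the token first",
    );
  }
  return token;
}

// Build the plain env vars handed to the sandbox; no API key is included.
function buildOAuthSandboxEnv(
  hostEnv: HostEnv,
  adapter: AdapterSketch,
): Record<string, string> {
  return {
    CLAUDE_CODE_OAUTH_TOKEN: resolveOAuthTokenSketch(hostEnv),
    [adapter.baseUrlEnvVar]: adapter.defaultBaseUrl,
  };
}
```

Passing the host environment in as a parameter, rather than reading `process.env` inside the helper, keeps the sketch easy to test.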
Validation: exactly one of `secret` or `useOAuth: true` must be set per
sandbox role. `useOAuth: true` requires `command: "claude"`. Setting
both is rejected. Adapter-side enforcement keeps the auth surface
intentionally narrow.
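The validation rules in this commit could look something like the sketch below. `RoleConfigSketch` and `validateRoleAuth` are hypothetical names, and the error wording is made up for illustration; only the three rules themselves come from the text above.

```typescript
// Hypothetical config shape for one sandbox role (executor or judge).
interface RoleConfigSketch {
  command: string;
  secret?: { value: string };
  useOAuth?: boolean;
}

// Returns a list of validation errors; an empty list means the role is valid.
function validateRoleAuth(role: string, cfg: RoleConfigSketch): string[] {
  const errors: string[] = [];
  const hasSecret = cfg.secret !== undefined;
  const hasOAuth = cfg.useOAuth === true;

  if (hasSecret && hasOAuth) {
    errors.push(`${role}: set either "secret" or "useOAuth: true", not both`);
  }
  if (!hasSecret && !hasOAuth) {
    errors.push(`${role}: one of "secret" or "useOAuth: true" is required`);
  }
  if (hasOAuth && cfg.command !== "claude") {
    errors.push(`${role}: "useOAuth: true" requires command "claude"`);
  }
  return errors;
}
```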
User flow (one-time host setup):
claude setup-token # interactive, ~1 yr token
export CLAUDE_CODE_OAUTH_TOKEN='<token>' # then run the eval
Tests: 354 pass (added 6 — 4 config validation + 2 resolveOAuthToken).
README + config-schema reference document the new path. Type-check clean.
…uth flag
Replaces the dedicated `useOAuth: true` field on `SandboxAgentConfig` with
runtime prefix detection on the resolved `secret.value`:
- `sk-ant-api-…` → API-key path (microsandbox TLS injection, unchanged)
- `sk-ant-oat-…` → OAuth path (plain `CLAUDE_CODE_OAUTH_TOKEN` env var)
User experience is now one consistent shape — always set `secret.value` to a
host env var reference; the runtime picks the auth mode from the resolved
value at sandbox-create time:
"executor": { "command": "claude", "secret": { "value": "$ANTHROPIC_API_KEY" } }
"executor": { "command": "claude", "secret": { "value": "$CLAUDE_CODE_OAUTH_TOKEN" } }
Mechanics:
- New `applyAgentAuth(secret, adapter, secrets, env)` helper in
`microsandbox.ts` consolidates the 3 sandbox-creation call sites
(`execute.ts`, `judge.ts`, `sandbox.ts`) into a single call. The helper
handles both auth modes internally.
- New `isOAuthSecret(secret)` helper exposes the same prefix check for the
judge's `buildJudgeAllowlist` (which adds the adapter's default hostname
to the lockdown allowlist when in OAuth mode, since the OAuth path has no
`secret.baseUrl` to derive from).
- `SandboxAgentConfig.secret` becomes required again (no parallel field).
- Removed `resolveOAuthToken()` and `buildAgentSecret()` — both subsumed by
`applyAgentAuth`.
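As a rough illustration of the allowlist behaviour described for `buildJudgeAllowlist` at this stage, the sketch below picks the hostname from the adapter default in OAuth mode and from `secret.baseUrl` otherwise. All names here (`SecretSketch`, `JudgeAdapterSketch`, `judgeAgentHost`) are hypothetical, the function takes the already-resolved value rather than the secret itself, and the prefix is written without a trailing dash because a later commit in this thread found that the dashed form misclassifies real tokens.

```typescript
// Hypothetical shapes; the real types are not shown in this thread.
interface SecretSketch {
  value: string; // host env var reference, e.g. "$ANTHROPIC_API_KEY"
  baseUrl?: string;
}
interface JudgeAdapterSketch {
  defaultBaseUrl: string;
}

const OAUTH_PREFIX_SKETCH = "sk-ant-oat";

// Picks the agent hostname the judge's network lockdown should allow.
function judgeAgentHost(
  resolvedValue: string,
  secret: SecretSketch,
  adapter: JudgeAdapterSketch,
): string {
  if (resolvedValue.startsWith(OAUTH_PREFIX_SKETCH)) {
    // OAuth mode has no secret.baseUrl, so fall back to the adapter default.
    return new URL(adapter.defaultBaseUrl).hostname;
  }
  if (!secret.baseUrl) {
    throw new Error("API-key mode needs secret.baseUrl to derive the allowlist host");
  }
  return new URL(secret.baseUrl).hostname;
}
```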
Trade-off: depends on Anthropic's documented OAuth-vs-API-key prefix scheme
(`sk-ant-oat-` vs `sk-ant-api-`). If that scheme changes, the eval silently
misclassifies. Caller failure mode is an auth error from the Claude API at
request time, which is observable in run logs.
Tests: drop 4 useOAuth config tests + 2 resolveOAuthToken tests; add 3 tests
for `isOAuthSecret` (OAuth value, API-key value, unset env var) and 3 tests
for `applyAgentAuth` (OAuth path, API-key path, missing required fields).
331 tests pass; type-check + lint clean.
…t sk-ant-oat-
Smoke-tested against a real `claude setup-token` token and discovered the prefix is `sk-ant-oat01-…`, not `sk-ant-oat-…`. The trailing dash in OAUTH_TOKEN_PREFIX caused all real OAuth tokens to misclassify as API keys and route through the TLS-injection path.
Dropping the trailing dash:
- matches every documented variant: `sk-ant-oat01-…`, `sk-ant-oat02-…`, etc.
- still cleanly distinguishes from API keys (`sk-ant-api…`) since `oat` ≠ `api`.
Test fixtures and docs updated to the real `sk-ant-oat01-` form.
Verified end-to-end with the smoke script: isOAuthSecret returns true, applyAgentAuth populates env.CLAUDE_CODE_OAUTH_TOKEN as plain env var, no TLS secrets added.
331 tests pass; type-check + lint clean.
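The misclassification is easy to reproduce with a plain `startsWith` check. The snippet below is a small sketch: the token strings are made-up values in the shapes this commit message describes, and `classify` is a hypothetical helper rather than the project's function.

```typescript
const PREFIX_WITH_DASH = "sk-ant-oat-"; // original constant, with the trailing dash
const PREFIX_WITHOUT_DASH = "sk-ant-oat"; // the fix: trailing dash dropped

// Made-up values; only the prefixes matter here.
const sampleOAuthToken = "sk-ant-oat01-EXAMPLE";
const sampleApiKey = "sk-ant-api-EXAMPLE";

function classify(value: string, prefix: string): "oauth" | "api-key" {
  return value.startsWith(prefix) ? "oauth" : "api-key";
}

console.log(classify(sampleOAuthToken, PREFIX_WITH_DASH)); // "api-key" (the bug)
console.log(classify(sampleOAuthToken, PREFIX_WITHOUT_DASH)); // "oauth"
console.log(classify(sampleApiKey, PREFIX_WITHOUT_DASH)); // "api-key"
```

The dashed prefix fails because the character after `oat` in a real token is a version digit, not a dash.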
The original PR routed OAuth through plain env-var injection on the assumption that Claude Code's CLAUDE_CODE_OAUTH_TOKEN reader was incompatible with microsandbox's wire-time placeholder substitution. Smoke-testing the placeholder path against a real Claude Code session proved that wrong: Claude tolerates a `$MSB_CLAUDE_CODE_OAUTH_TOKEN` placeholder as the env var value, constructs `Authorization: Bearer $MSB_…` as the outbound header, and microsandbox substitutes the placeholder for the real OAuth token at TLS interception time. Anthropic returned 200 on `/api/eval/sdk-…` and the eval completed end-to-end.
Collapse the two-mode dispatch in `applyAgentAuth` into one TLS-substituted path. The resolved value's prefix only picks the env var name that carries the placeholder:
- `sk-ant-oat…` → `CLAUDE_CODE_OAUTH_TOKEN`
- anything else → `secret.envVar` (= `ANTHROPIC_API_KEY` for claude, etc.)
Benefits:
- OAuth recovers the same "cleartext never enters the VM" security property API keys already had: the real subscription token only ever touches the outbound TLS layer to api.anthropic.com.
- One code path, fewer test modes. `isOAuthSecret` was only used to pick the env var name (now inlined as a local conditional) and to choose the allowlist hostname in `buildJudgeAllowlist`, but both auth paths now derive that hostname from `secret.baseUrl` (validation already fills it from the adapter default for known agents), so the OAuth branch in `buildJudgeAllowlist` is gone too.
- Less surface area in the public module API (`isOAuthSecret` removed from exports).
Tests collapse from a four-test isOAuthSecret + applyAgentAuth suite to three unified `applyAgentAuth` cases (OAuth value → CLAUDE_CODE_OAUTH_TOKEN slot; API-key value → adapter slot; precondition error). 328 unit tests pass; type-check + lint clean.
End-to-end verified against TC-001 with a real CLAUDE_CODE_OAUTH_TOKEN: exit 0, 27s, real solution produced; egress log shows `Authorization: Bearer $MSB_CLAUDE_CODE_OAUTH_TOKEN` (pre-substitution), `/api/claude_code/settings` returns 404 (auth accepted; would be 401 if the placeholder leaked to the wire).
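The unified path described in this commit can be modelled as a pure function: the prefix chooses the env var name, the VM only ever sees a `$MSB_<env-var-name>` placeholder, and the allowed host comes from the base URL. The sketch below models that decision only and does not call the microsandbox SDK; `PlannedSecret` and `planSecret` are hypothetical names, while the two constants reuse names mentioned elsewhere in this PR.

```typescript
const OAUTH_TOKEN_PREFIX = "sk-ant-oat";
const OAUTH_TOKEN_ENV_VAR = "CLAUDE_CODE_OAUTH_TOKEN";

// Hypothetical description of what would be registered for TLS substitution.
interface PlannedSecret {
  envVar: string; // env var name the agent reads inside the VM
  placeholder: string; // the only value visible inside the VM
  allowHost: string; // host for which the placeholder is swapped on the wire
}

function planSecret(
  resolvedValue: string, // real credential, resolved on the host
  adapterEnvVar: string, // e.g. "ANTHROPIC_API_KEY" for claude
  baseUrl: string, // secret.baseUrl, filled from the adapter default by validation
): PlannedSecret {
  const envVar = resolvedValue.startsWith(OAUTH_TOKEN_PREFIX)
    ? OAUTH_TOKEN_ENV_VAR
    : adapterEnvVar;
  return {
    envVar,
    placeholder: `$MSB_${envVar}`,
    allowHost: new URL(baseUrl).hostname,
  };
}
```

Note that the resolved credential never appears in the returned object, which mirrors the "cleartext never enters the VM" property.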
The schema reference previously read like the `sk-ant-oat…` / `sk-ant-api…` prefix dispatch was a generic feature across all adapters. In reality it's a claude-only fork — codex, gemini, and custom agents only have the API-key path today. Reframe the SandboxAgentConfig description so the default behavior leads (TLS substitution into the adapter-default env var) and the OAuth slot is clearly tagged as the claude-specific opt-in.
…flection in tests
Code review on the PR surfaced five cleanups, all in the diff and none changing runtime behaviour:
- `applyAgentAuth` accepted a local `AgentAuthAdapter` interface that duplicated three fields of the exported `AgentAdapter`. Switched to `Pick<AgentAdapter, 'baseUrlEnvVar' | 'additionalAllowHosts'>` so the shape stays tied to the real adapter type and unused fields (`defaultBaseUrl`) drop out.
- Added a paired `OAUTH_TOKEN_ENV_VAR = 'CLAUDE_CODE_OAUTH_TOKEN'` constant next to `OAUTH_TOKEN_PREFIX` so the two Anthropic-specific strings live together; replaced the inline literal.
- The `applyAgentAuth` tests previously inspected the opaque `SecretEntry` shape through a three-field reflection helper, which is fragile if the microsandbox SDK ever renames an internal field. Tests now assert on `Secret.env`'s call args (already mocked in this file), so we verify the wire contract rather than the library's internal representation.
- Dropped dead `isOAuthSecret: vi.fn()` mock entries in `execute.test.ts` and `judge.test.ts` (the function was unexported in an earlier refactor commit; the mocks were leftovers).
- Trimmed a verbose comment in `buildJudgeAllowlist` that restated what the surrounding `if` already expressed.
328 tests pass; type-check + lint clean. No runtime behaviour change, so no re-smoke needed.
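The first cleanup relies on TypeScript's built-in `Pick` utility type. A minimal sketch of the idea follows; `AgentAdapterSketch` stands in for the exported `AgentAdapter`, and its field types and the `describeAuthInputs` helper are assumptions made for illustration.

```typescript
// Stand-in for the exported adapter type; field types are assumed.
interface AgentAdapterSketch {
  command: string;
  defaultBaseUrl: string;
  baseUrlEnvVar: string;
  additionalAllowHosts: string[];
}

// Derived view: stays in sync with the adapter type, exposes only what auth needs.
type AuthAdapterView = Pick<AgentAdapterSketch, "baseUrlEnvVar" | "additionalAllowHosts">;

function describeAuthInputs(adapter: AuthAdapterView): string {
  return `${adapter.baseUrlEnvVar} + ${adapter.additionalAllowHosts.length} extra host(s)`;
}

// A full adapter is assignable to the narrower view without any conversion.
const claudeAdapterSketch: AgentAdapterSketch = {
  command: "claude",
  defaultBaseUrl: "https://api.anthropic.com",
  baseUrlEnvVar: "ANTHROPIC_BASE_URL",
  additionalAllowHosts: [],
};

console.log(describeAuthInputs(claudeAdapterSketch));
```

If a field is renamed on the adapter type, the `Pick` fails to compile, whereas a hand-written duplicate interface would silently drift.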
HungKNguyen
reviewed
May 15, 2026
}
const value = resolveValue(secret.value, secret.envVar);

const envVar = value.startsWith(OAUTH_TOKEN_PREFIX)
Collaborator
My wishlist here is to better compartmentalize the location of each part of the PR. The current architecture is that
- /agents/ contains information about each agent's config, hardcoded secrets, and desired env variable
- /core/config.ts handles parsing the config.json and deciding the correct secret.envVar (based on the agent's wishes)
- applyAgentAuth should just follow config.ts's instructions and inject the secret with the actual value
This PR is mixing these areas together, but I get why: the current flow is agent -> config -> secret, but now you need the secret to decide the config. Let me see if I can create a follow-up PR for this.
HungKNguyen
approved these changes
May 15, 2026
Collaborator
Can approve, concern can be a follow up
Why
A full A/B sweep against a real SDK costs roughly $135–$270 in per-token API billing (Opus-class models). Claude Code on a Pro / Max / Team / Enterprise plan can authenticate with a long-lived subscription token instead, giving flat-rate billing tied to the plan.
Summary
`SandboxAgentConfig` stays a single shape: both auth modes flow through the same microsandbox `Secret.env()` TLS-substitution code path. The runtime sniffs the resolved `secret.value`'s prefix at sandbox-create time and picks which env var name the placeholder lands under:
- `sk-ant-api…` → `ANTHROPIC_API_KEY` (= `secret.envVar`)
- `sk-ant-oat…` → `CLAUDE_CODE_OAUTH_TOKEN`
In both cases the cleartext never enters the VM: microsandbox swaps the `$MSB_<env-var-name>` placeholder for the real value only on outbound TLS to the allowed host (api.anthropic.com for claude).
Point `secret.value` at the host env var that holds your credential. You can mix-and-match per role if you want.
User flow for OAuth
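Run `claude setup-token` once on the host and export the resulting token as `CLAUDE_CODE_OAUTH_TOKEN`. Then set `secret: { value: "$CLAUDE_CODE_OAUTH_TOKEN" }` on the sandbox roles you want to bill against the subscription.
End-to-end verification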
Smoke-tested against TC-001 with a real Claude Code subscription token + microsandbox VM:
- `Authorization: Bearer $MSB_CLAUDE_CODE_OAUTH_TOKEN` (the placeholder, pre-substitution) leaving claude
- `/api/eval/sdk-…` returned 200 and `/api/claude_code/settings` returned 404 (auth accepted; would be 401 if the placeholder had leaked to the wire), confirming microsandbox swapped the placeholder for the real OAuth token at TLS time
Implementation
- `applyAgentAuth(secret, adapter, secrets, env)` in `src/sandbox/microsandbox.ts` is now a single path that always pushes a `Secret.env(...)` entry; the prefix only picks the env var name.
- `buildJudgeAllowlist` simplified: both modes derive the agent-host allowlist entry from `secret.baseUrl` (always populated by validation from adapter defaults for known agents).
- Removed the `isOAuthSecret` helper from the public module surface (inlined as a local conditional inside `applyAgentAuth`).
- Removed the `resolveOAuthToken()` and `buildAgentSecret()` helpers, both subsumed by `applyAgentAuth`.
Trade-off
Depends on Anthropic's documented OAuth-vs-API-key prefix scheme (`sk-ant-oat…` vs `sk-ant-api…`). If that scheme ever changes, the eval would route to the wrong env var slot and Claude would respond with an auth error visible in run logs.
Verification