Conversation
Adds two optional fields to AgentMeta so hosts that resolve agents into the gateway can declare which CLI runtime should run them inside the sandbox sidecar: harness?: string e.g. "claude-code", "opencode", "codex" harnessModel?: string e.g. "sonnet", "anthropic/claude-sonnet-4-5" Why this lives on AgentMeta and not in a new gateway path: The gateway core stays SandboxBox-shaped. Hosts implement `getSandbox()` and produce a SandboxBox whose streamPrompt knows how to talk to its backend. The harness fields are the contract: when set, the host SHOULD return a SandboxBox whose streamPrompt POSTs to the sidecar's `/agent/invoke/chat/completions` (the universal harness invocation endpoint shipped in agent-dev-container PR #1049) with `model: "<harness>/<harnessModel>"`. When unset, the host falls back to its template proxy (legacy /agent/invoke). This keeps the abstraction tight: the gateway doesn't grow a per- harness HTTP plane, and the host's SandboxBox impl does the translation in one place. No core middleware logic changes — every test in the suite (10 files, 137 tests) still passes against the new types.
✅ No Blockers —
|
| kimi-code | deepseek | aggregate | |
|---|---|---|---|
| Readiness | 92 | 94 | 92 |
| Confidence | 95 | 96 | 95 |
| Correctness | 92 | 95 | 92 |
| Security | 95 | 93 | 93 |
| Testing | 85 | 88 | 85 |
| Architecture | 90 | 92 | 90 |
I read every changed file, traced callers/callees, ran the full test suite (137/137 pass), and verified TypeScript compiles cleanly. The PR adds a well-placed
authorizeConsumerhook, fixes a critical security vulnerability where missingverifySignerwould accept forged x402 payments, and adds defense-in-depth boot guards. The only gap is a missing unit test for the changedverifyX402fallback branch. | Read all 5 changed files in full, ran 137 tests (all pass), typecheck clean. The authorizeConsumer hook is correctly placed between rate-limit and sandbox allocation, uses a proper discrim
🟡 LOW Nonce consumed before on-chain verification completes — src/verify.ts
Line 41 marks the nonce as seen before calling config.verifySigner at line 45. If on-chain verification later fails, the nonce is already burned from the replay store, blocking the legitimate consumer from retrying with the same nonce. This is pre-existing behavior (not introduced by this PR) and serves as DoS protection against spoofed nonces, but the comment docs at lines 8-13 don't document this ordering contract.
🟡 LOW No test for authorizeConsumer throwing synchronously — tests/middleware.test.ts
The authorizeConsumer hook at src/middleware.ts:256 is called without a try/catch. If the host-supplied hook throws (e.g. DB connection failure), the error propagates to Hono's default error handler returning 500. This is acceptable behavior but untested — add a test verifying that a throwing authorizeConsumer results in a 500 rather than a silent hang.
🟡 LOW authorizeConsumer tests only cover x402 payment path — tests/middleware.test.ts
Lines 470-521 test authorizeConsumer exclusively via the X-Payment-Signature (x402) auth path. MPP and API key are untested paths through the same code, meaning a regression in how paymentMethod='apikey' or keyId flows into the consumer info argument (src/middleware.ts:256-261) would go undetected. Add one test with a Bearer API key and one with a Payment MPP header to cover those variants.
tangletools · 2026-05-06T20:00:29Z · trace
tangletools
left a comment
There was a problem hiding this comment.
✅ Approved — 3 non-blocking findings
I read every changed file, traced callers/callees, ran the full test suite (137/137 pass), and verified TypeScript compiles cleanly. The PR adds a well-placed
authorizeConsumerhook, fixes a critical security vulnerability where missingverifySignerwould accept forged x402 payments, and adds defense-in-depth boot guards. The only gap is a missing unit test for the changedverifyX402fallback branch. | Read all 5 changed files in full, ran 137 tests (all pass), typecheck clean. The authorizeConsumer hook is correctly placed between rate-limit and sandbox allocation, uses a proper discrim
🟡 LOW Nonce consumed before on-chain verification completes — src/verify.ts
Line 41 marks the nonce as seen before calling config.verifySigner at line 45. If on-chain verification later fails, the nonce is already burned from the replay store, blocking the legitimate consumer from retrying with the same nonce. This is pre-existing behavior (not introduced by this PR) and serves as DoS protection against spoofed nonces, but the comment docs at lines 8-13 don't document this ordering contract.
🟡 LOW No test for authorizeConsumer throwing synchronously — tests/middleware.test.ts
The authorizeConsumer hook at src/middleware.ts:256 is called without a try/catch. If the host-supplied hook throws (e.g. DB connection failure), the error propagates to Hono's default error handler returning 500. This is acceptable behavior but untested — add a test verifying that a throwing authorizeConsumer results in a 500 rather than a silent hang.
🟡 LOW authorizeConsumer tests only cover x402 payment path — tests/middleware.test.ts
Lines 470-521 test authorizeConsumer exclusively via the X-Payment-Signature (x402) auth path. MPP and API key are untested paths through the same code, meaning a regression in how paymentMethod='apikey' or keyId flows into the consumer info argument (src/middleware.ts:256-261) would go undetected. Add one test with a Bearer API key and one with a Payment MPP header to cover those variants.
tangletools · 2026-05-06T20:00:29Z · trace
Summary\n- add optional
authorizeConsumerhook after payment/rate-limit and before sandbox allocation\n- fail closed when x402 production verification is not configured unless explicit demo mode is enabled\n- add regression tests for denied/allowed consumers, sandbox allocation avoidance, and production boot guard\n- bump package version to 0.5.0, already published to npm\n\n## Verification\n- pnpm test\n- pnpm typecheck\n- pnpm build\n- npm pack --dry-run\n- npm view confirms latest=0.5.0