Skip to content

feat(executor): first-class MCP server support via executorMcpServers#6

Open
nickwinder wants to merge 5 commits into
mainfrom
nick/agentic-usability/executor-mcp-servers-only
Open

feat(executor): first-class MCP server support via executorMcpServers#6
nickwinder wants to merge 5 commits into
mainfrom
nick/agentic-usability/executor-mcp-servers-only

Conversation

@nickwinder
Copy link
Copy Markdown
Contributor

Why

A Claude plugin's bundled mcpServers are not loaded by claude --plugin-dir, so MCP servers need their own first-class config surface alongside executorPlugins. Adds it for the Claude and Codex executors. Motivating use case: A/B sweeps that compare an SDK-shape baseline against a run with an MCP server wired in (e.g. Nutrient's Web MCP).

Summary

  • New top-level executorMcpServers config field — types, validation, scaffolding, and per-adapter install. Supports local, git, and sourceless (e.g. npx) entries. ${MCP_ROOT} in a server's args is substituted at install time with the absolute sandbox path of the uploaded server.
  • Claude wiring: uploads to $HOME/.mcp-servers/<name>/, writes mcp-config.json, surfaces as --mcp-config (non-strict, so ambient config is preserved). Codex wiring: uploads to $CODEX_HOME/.mcp-servers/<name>/, appends [mcp_servers.<name>] TOML blocks to config.toml. Gemini/custom adapters throw a clear unsupported error rather than silently no-oping an A/B comparison.
  • Per-server uploads run in parallel; plugin + MCP install blocks run concurrently inside executeTestCase. ResolvedExecutorMcpServer is a discriminated union (kind: 'sourced' | 'sourceless') so consumers narrow at compile time.
  • MCP payload upload lives in its own module (src/sandbox/mcp.ts) with a dedicated tar+upload path that archives the tree verbatim (incl. node_modules) — getSourceArchive / uploadDirToSandbox stay strictly for SDK source uploads (strip bloat) so the two contracts don't mix.

372/372 tests passing; tsc --noEmit clean.

nickwinder and others added 4 commits May 21, 2026 16:51
Add a separate, parallel mechanism to executorPlugins for installing MCP
servers into the executor sandbox. A Claude plugin's bundled mcpServers are
not loaded by `claude --plugin-dir`, so MCP servers need their own config
surface.

- types: ExecutorMcpServer union (local/git/sourceless) + ResolvedExecutorMcpServer
- config: validateExecutorMcpServerEntry — slug names, string command/args,
  optional local|git type, rejects ${MCP_ROOT} when sourceless
- scaffolding: resolveExecutorMcpServers mirrors resolveExecutorPlugins
- adapter/base: installMcpServersInSandbox contract; default throws
- claude: uploads sourced servers to $HOME/.mcp-servers/<name>/, substitutes
  ${MCP_ROOT}, writes mcp-config.json, emits --mcp-config (no --strict)
- codex: uploads to $CODEX_HOME/.mcp-servers/<name>/, appends [mcp_servers.*]
  TOML blocks to config.toml
- gemini/custom: throw a clear unsupported error
- execute: install block parallel to plugins, writes mcp-server-install-error.log
- tests + config-schema.md docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
installMcpServersInSandbox uploaded via getSourceArchive, whose EXCLUDED_DIRS
strips node_modules — correct for SDK source trees, fatal for a runnable MCP
server, which then crashes on spawn (ERR_MODULE_NOT_FOUND) and fails the
client handshake. Add an `includeAll` option to getSourceArchive /
uploadDirToSandbox and use it for MCP server uploads. Verified in a real
microVM: the web-mcp handshake now completes with all tools.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Post-review cleanup on the MCP server feature, addressing findings from a
multi-agent review pass (code-reuse / code-quality / efficiency):

- Parallelize per-server uploads in ClaudeAdapter and CodexAdapter
  installMcpServersInSandbox — was a serial for-await loop, now Promise.all.
  N-server installs no longer pay N× upload latency.
- Run the executorPlugins and executorMcpServers install blocks in execute.ts
  concurrently via Promise.all — they write to disjoint sandbox dirs and
  disjoint adapter state.
- Extract src/sandbox/mcp.ts as the home for MCP-server install primitives:
  MCP_ROOT_PLACEHOLDER constant, substituteMcpRoot helper, and
  uploadMcpServerSources (the shared upload + ${MCP_ROOT}-substitution loop).
  Claude / Codex adapters now own only their CLI-specific serializer
  (JSON config vs TOML append) and writer.
- Make ResolvedExecutorMcpServer a discriminated union (kind:'sourced'|'sourceless')
  instead of an optional hostDir. Consumers narrow at compile time rather than
  probing optional fields.
- Lift MCP_ROOT_PLACEHOLDER out of config.ts so config validation and the
  adapters' substitution use the same literal.

No behaviour change beyond the parallelization. Type-check clean; 368/368
tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per review: getSourceArchive / uploadDirToSandbox were quietly carrying two
contracts via the includeAll flag — SDK source upload (strips node_modules,
build outputs, large binaries) vs runnable MCP server payload (verbatim
archive). Same function, two semantically different jobs. Confusing.

- Revert the includeAll axis on getSourceArchive and uploadDirToSandbox; both
  are now strictly for SDK source uploads (and plugin installs).
- Move MCP payload upload into a dedicated function pair in sandbox/mcp.ts:
  tarMcpServerSource (verbatim tar, separate process-scoped cache) and
  uploadMcpServerPayload (tar + upload + extract). MCP module owns its own
  mechanics rather than reaching into scaffolding's contract.
- Add a focused unit test for substituteMcpRoot.
- Adapter tests now mock uploadMcpServerSources at the import boundary
  (rather than the underlying payload uploader) since vitest partial mocks
  don't intercept intra-module references. Substitution behaviour is
  covered by the new mcp.test.ts; orchestration logic is covered at the
  adapter-test boundary.

Type-check clean; 372/372 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nickwinder nickwinder self-assigned this May 21, 2026
Smoke-testing against a node:20-slim microsandbox VM showed both MCP install
paths producing //.mcp-servers/... and //.codex/.mcp-servers/... — because
$HOME=/ in that image, which slips past the ${HOME:-/root} shell fallback
(HOME is set, just degenerate).

Catch it on the Node side: collapse repeated slashes, then treat a bare /
(claude) or / or /.codex (codex, after the inner :- fallback appends /.codex
to /) as "no real home" and fall back to /root or /root/.codex respectively.

Linux collapses // at the syscall layer so installs technically still worked,
but the paths showed up verbatim in mcp-config.json args and config.toml
[mcp_servers.*] blocks, which is ugly and could confuse the MCP client.

Scoped to the MCP install path only — the equivalent $HOME probe in the
plugin install path (shipped via PR #3) has the same flaw and should be
fixed in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nickwinder nickwinder marked this pull request as ready for review May 21, 2026 18:14
@nickwinder nickwinder requested a review from HungKNguyen May 21, 2026 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant