F099: MCP Proxy — Interception and control of tool calls

# F099: MCP Proxy — Tool call interception + plugin tool exposure

> **Note**: simplified scope — focus on 2 priorities. Everything else (policy, approval, cache, recorder/playback, middleware, external MCP-as-Plugin, snapshot, virtual tools) is explicitly deferred.

## Objective

Intercept tool calls of all 5 supported agent providers (Claude, Gemini, Codex, OpenCode, OpenAI Compatible) by routing them through an AWF-controlled local MCP server, and let existing AWF gRPC plugins create new tools that those agents can call.

This is a deliberately reduced scope: it ships the two capabilities listed above and nothing else. Everything that the original F099 layered on top of those primitives (policy, approval, cache, recorder, middleware, external MCP servers, snapshot isolation, virtual tools) is **explicitly out of scope** and tracked separately.

## Decisions

| Decision | Choice | Alternative considered | Trade-off |
|----------|--------|------------------------|-----------|
| Interception mode | Active proxy — AWF re-exposes built-ins, becomes the sole tool source | Passive NDJSON observation; additive-only (plugin tools added next to native built-ins) | Active proxy is the only way to actually control what the agent calls; additive mode is preserved as an opt-in per step (see `intercept_builtins: false`) |
| Provider coverage | All 5 (Claude, Gemini, Codex, OpenCode, OpenAI Compatible) | Claude only; Claude + OpenAI Compatible only | Full coverage validates the abstraction across the 2 fundamentally different mechanisms (CLI subprocess vs HTTP native `tools[]`); Codex/OpenCode accept coexistence mode |
| Plugin tool exposure | Explicit per-step declaration with `plugin_tools[].expose: [...]` | Implicit (all plugins, all ops); plugin-level toggle | Aligns with AWF "explicit > implicit" philosophy; keeps `tools/list` minimal per step |
| Built-in interception toggle | `intercept_builtins: true` (default) + opt-out | Always intercept; always additive | Knob covers both "full control" and "just add plugin tools" use cases with one if-statement of extra code |
| Observability | OTel spans + zap logging | Recorder JSONL + EventBus events + spans | Spans + logging are zero-cost when telemetry is disabled and reuse existing infrastructure; recorder is its own feature (needs playback consumer) |
| External MCP server bridging | Out of scope | Bridge external MCP servers (GitHub, Postgres) as plugins via `type: mcp` | Different feature (subprocess lifecycle, handshake, schema mapping); tracked as a separate future feature |
| Policy / approval / cache / middleware / recorder | Out of scope | Ship one or more in v1 | None of these block the two stated priorities; all can be added later behind ports without breaking the v1 schema |

## In Scope

- Local MCP server (stdio JSON-RPC 2.0) injected into agent CLIs as the sole tool source via per-provider mechanisms
- Six built-in tools re-implemented and re-exposed by the proxy: `Read`, `Write`, `Edit`, `Bash`, `Glob`, `Grep`
- Per-provider built-in disablement (Claude, Gemini full; Codex, OpenCode coexistence with startup warning)
- Native HTTP `tools[]` interception for OpenAI Compatible (extension of `chatCompletionsRequest`, `role: tool` messages, multi-turn loop, SSE delta assembly, infinite-loop guard)
- Plugin Bridge: existing gRPC `OperationProvider` exposed as MCP tools via adapter, with `OperationSchema → JSON Schema` mapping
- Per-step subprocess lifecycle (start `awf mcp-serve`, graceful shutdown on step end / failure / SIGINT)
- Tool name namespacing for plugin tools (`<plugin>_<op>`) with collision detection at step startup
- YAML schema `mcp_proxy:` block (3 top-level keys: `enable`, `intercept_builtins`, `plugin_tools`) and `awf validate` rules with stable error codes `USER.MCP_PROXY.*`
- OpenTelemetry spans per tool call (child of step span) and zap log line per tool call
- Hexagonal architecture compliance: domain port `ToolProvider`, application services, infrastructure adapters; `.go-arch-lint.yml` updated

## Out of Scope (explicit non-goals)

The following items are **not** delivered. Each is independently addable later without breaking the v1 schema or architecture:

- Policy Engine (allow/deny lists, filesystem sandboxing)
- Human-in-the-loop approval (`approval: always` / pattern)
- Content-addressed result cache (path + mtime + size keying, TTL, invalidation)
- Tool call recorder (JSONL append-only) and `awf playback` command
- Composable middleware chain (redact_secrets, truncate_large_files, rate_limit, inject_context)
- MCP-as-Plugin bridging (external MCP servers like `@modelcontextprotocol/server-github` registered as `type: mcp` plugins)
- Bypass detection via NDJSON output parsing for Codex/OpenCode coexistence
- EventBus events (`tool.call.start/end/denied/bypassed`)
- Snapshot isolation (CoW filesystem overlay) for parallel steps
- Virtual composite tools (pipelines with rollback)

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│ INTERFACES (cli + YAML)                                     │
│   - YAML block  mcp_proxy: { enable, intercept_builtins,    │
│                              plugin_tools }                 │
│   - awf validate runs the block validation                  │
│   - Internal command `awf mcp-serve --config=<path>`        │
└──────────────────────────┬──────────────────────────────────┘
                           │
┌──────────────────────────┴──────────────────────────────────┐
│ APPLICATION                                                 │
│   - ToolProxyService : orchestrates the per-step lifecycle  │
│   - ToolRouter       : aggregates ToolProviders, routes     │
│                        by name, detects collisions          │
└──────────────────────────┬──────────────────────────────────┘
                           │
┌──────────────────────────┴──────────────────────────────────┐
│ DOMAIN — ports                                              │
│   - ToolProvider : ListTools(), CallTool(name, args), Close │
│   - (NO ToolPolicy / ToolMiddleware / ToolCache)            │
└──────────────────────────┬──────────────────────────────────┘
                           │
┌──────────────────────────┴──────────────────────────────────┐
│ INFRASTRUCTURE                                              │
│   - pkg/mcpserver       : reusable MCP server (stdio,       │
│                           zero internal/ imports, NFR-005)  │
│   - BuiltinToolProvider : Read/Write/Edit/Bash/Glob/Grep    │
│   - PluginToolAdapter   : OperationProvider → ToolProvider  │
│   - Provider injection  : buildExecuteArgs extension for    │
│                           Claude/Gemini/Codex/OpenCode      │
│                           + chatCompletionsRequest for      │
│                           OpenAI Compatible                 │
└─────────────────────────────────────────────────────────────┘
```

**Key invariants:**

- One MCP server **per step** (lifetime bound to step, graceful shutdown via `defer`)
- The MCP server runs as a separate subprocess (`awf mcp-serve --config=<tmpfile>`) for stdio providers (Claude/Gemini/Codex/OpenCode); for OpenAI Compatible there is no subprocess — `ToolRouter` is invoked in-process by the HTTP provider
- Each `CallTool` opens a child OTel span of the current step span; attributes: tool name, source (`builtin` / `plugin:<name>`), duration, error
- One zap log line per tool call; zero persistent storage

## Components

### 1. `pkg/mcpserver` — Reusable MCP Server

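Standalone package, **zero imports from `internal/`** (preserves NFR-005). Implements the stable MCP subset: `initialize`, the `notifications/initialized` notification, `tools/list`, and `tools/call`. MCP defines no `shutdown` method; the server exits cleanly when stdin is closed or the process receives SIGTERM.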

```go
package mcpserver

type Server struct { /* ... */ }

func New() *Server
func (s *Server) RegisterTool(name string, schema InputSchema, handler ToolHandler)
func (s *Server) Serve(ctx context.Context, stdin io.Reader, stdout io.Writer) error

type ToolHandler func(ctx context.Context, args json.RawMessage) (Result, error)
type InputSchema struct { /* JSON Schema document */ }
type Result struct {
    Content []ContentBlock
    IsError bool
}
```

Out of scope for v1: `notifications/progress`, prompts, resources, sampling.

### 2. Domain port — `internal/domain/ports/tool_provider.go`

```go
type ToolProvider interface {
    ListTools(ctx context.Context) ([]ToolDefinition, error)
    CallTool(ctx context.Context, name string, args map[string]any) (*ToolResult, error)
    Close(ctx context.Context) error
}

type ToolDefinition struct {
    Name        string
    Description string
    InputSchema map[string]any // JSON Schema
    Source      string         // "builtin" | "plugin:<plugin_name>"
}

type ToolResult struct {
    Content []ToolContent
    IsError bool
}
```

No `ToolPolicy`, `ToolMiddleware`, `ToolCache` ports are introduced in v1.

### 3. Infrastructure adapters

| Adapter | Location | Responsibility |
|---|---|---|
| `BuiltinToolProvider` | `internal/infrastructure/tools/builtins/` | Implements `Read`, `Write`, `Edit`, `Bash`, `Glob`, `Grep`. Uses the existing `Executor` for `Bash`; `os`/`filepath` helpers for the file ops. No filesystem sandboxing (out of scope). |
| `PluginToolAdapter` | `internal/infrastructure/tools/plugin_adapter.go` | Wraps a `ports.OperationProvider`. For each op listed in `expose:`, maps `OperationSchema → InputSchema` (JSON Schema). Prefixes tool names with `<plugin_name>_`. |

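The mapping performed by `PluginToolAdapter` could look roughly like the sketch below. The real `OperationSchema` type is not shown in this spec, so `OperationSchema` and `InputField` here are hypothetical stand-ins, as are `toolName` and `toJSONSchema`. Unsupported types fail at registration instead of being dropped silently.

```go
package main

import (
	"fmt"
	"sort"
)

// InputField and OperationSchema are hypothetical stand-ins for the plugin
// SDK's operation schema; the real types may differ.
type InputField struct {
	Type        string // "string", "integer", "number" or "boolean"
	Description string
	Required    bool
}

type OperationSchema struct {
	Name        string
	Description string
	Inputs      map[string]InputField
}

// toolName applies the <plugin>_<op> namespacing.
func toolName(plugin, op string) string { return plugin + "_" + op }

// toJSONSchema maps an operation schema to a JSON Schema object.
func toJSONSchema(op OperationSchema) (map[string]any, error) {
	props := make(map[string]any, len(op.Inputs))
	required := []string{}
	for name, f := range op.Inputs {
		switch f.Type {
		case "string", "integer", "number", "boolean":
			// supported scalar types
		default:
			return nil, fmt.Errorf("operation %q: input %q has unsupported type %q", op.Name, name, f.Type)
		}
		props[name] = map[string]any{"type": f.Type, "description": f.Description}
		if f.Required {
			required = append(required, name)
		}
	}
	sort.Strings(required) // map iteration order is random; keep output stable
	return map[string]any{
		"type":       "object",
		"properties": props,
		"required":   required,
	}, nil
}
```

With these assumptions, a plugin named `kubernetes` exposing `kubectl_apply` is published as `kubernetes_kubectl_apply`, matching the YAML examples in this document.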
### 4. Application services

| Service | Location | Responsibility |
|---|---|---|
| `ToolRouter` | `internal/application/tools/router.go` | Aggregates multiple `ToolProvider`s. Builds the consolidated `tools/list`. Routes `tools/call` by name. Detects collisions at registration (fatal at step startup, not runtime). Wraps each call with OTel span and zap log. |
| `ToolProxyService` | `internal/application/tools/proxy_service.go` | Per-step coordinator: reads `mcp_proxy:` config, instantiates `ToolProvider`s, builds the `ToolRouter`, spawns `awf mcp-serve` (for stdio providers) or hands the router to the HTTP provider (for OpenAI Compatible), returns the provider-specific config payload, shuts everything down on step end. |

### 5. Internal CLI command — `awf mcp-serve`

Not exposed in user help. Launched as a subprocess by `ToolProxyService`. Takes `--config=<path>` pointing to a tmp file describing the tools to expose (built-ins flag + plugin_tools list). Starts an `mcpserver.Server`, registers the tool handlers, calls `Serve()` on stdin/stdout.

### 6. Provider injection extensions

Each provider's `buildExecuteArgs` (or HTTP request builder for OpenAI Compatible) is extended to inject the proxy when `mcp_proxy.enable: true`. See [Per-Provider Injection](#per-provider-injection) below for full flag tables.

### 7. OTel + Logging (cross-cutting)

Wired in `ToolRouter.CallTool`:

```go
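// Assumes the attribute and codes packages from go.opentelemetry.io/otel.
ctx, span := tracer.Start(ctx, "tool.call."+name)
defer span.End()
span.SetAttributes(
    attribute.String("tool.name", name),
    attribute.String("tool.source", source),
)

start := time.Now()
result, err := provider.CallTool(ctx, name, args)
duration := time.Since(start)

// One log line per call; zap.Error adds no field when err is nil.
logger.Info("tool call",
    zap.String("tool", name),
    zap.String("source", source),
    zap.Duration("duration", duration),
    zap.Error(err),
)

span.SetAttributes(attribute.Int64("tool.duration_ms", duration.Milliseconds()))
if err != nil {
    span.RecordError(err)
    // RecordError only adds an event; set the status so the span is marked failed.
    span.SetStatus(codes.Error, err.Error())
}
return result, err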
```

Zero cost when no telemetry exporter is configured (existing AWF behavior).

## YAML Schema

### Grammar

```yaml
mcp_proxy:
  enable: bool                    # default: false. Activates the proxy on this step.
  intercept_builtins: bool        # default: true. If false → native built-ins stay active,
                                  #                 proxy only adds plugin_tools.
  plugin_tools:                   # optional. Plugins to expose.
    - plugin: string              # name from .awf/plugins.yaml
      expose: [string, ...]       # operations to expose as MCP tools
```

### Examples

**Case 1 — proxy unused (default, backwards compatible)**

```yaml
states:
  refactor:
    type: step
    agent:
      provider: claude
      prompt: "Refactor src/foo.go"
    # no mcp_proxy: → identical behavior to today, native built-ins.
```

**Case 2 — full interception, built-ins only (pure observability)**

```yaml
states:
  refactor:
    type: step
    agent:
      provider: claude
      prompt: "Refactor src/foo.go"
    mcp_proxy:
      enable: true
      # Read/Write/Edit/Bash/Glob/Grep re-exposed by AWF.
      # No plugin_tools → only the 6 built-ins.
      # Use case: zap logging + OTel spans on every FS/shell op the agent performs.
```

**Case 3 — full interception + plugin tools**

```yaml
states:
  deploy:
    type: step
    agent:
      provider: claude
      prompt: "Apply the new k8s manifest"
    mcp_proxy:
      enable: true
      plugin_tools:
        - plugin: kubernetes
          expose: [kubectl_apply, kubectl_get]
      # Agent sees: Read, Write, Edit, Bash, Glob, Grep,
      #             kubernetes_kubectl_apply, kubernetes_kubectl_get
```

**Case 4 — additive proxy (native built-ins intact, plugin tools added)**

```yaml
states:
  deploy:
    type: step
    agent:
      provider: claude
      prompt: "Apply the new k8s manifest"
    mcp_proxy:
      enable: true
      intercept_builtins: false
      plugin_tools:
        - plugin: kubernetes
          expose: [kubectl_apply]
      # Agent sees: its NATIVE Read/Write/Edit/Bash/Glob/Grep +
      #             kubernetes_kubectl_apply (via AWF).
      # OTel/logging only on kubernetes_kubectl_apply.
```

### Validation rules (`awf validate`)

| Error code | Condition |
|---|---|
| `USER.MCP_PROXY.UNKNOWN_KEY` | Unknown key in the `mcp_proxy:` block (typo, future schema, unsupported sub-key) |
| `USER.MCP_PROXY.UNKNOWN_PLUGIN` | `plugin_tools[].plugin` does not match any plugin declared in `.awf/plugins.yaml` |
| `USER.MCP_PROXY.UNKNOWN_OPERATION` | `plugin_tools[].expose[]` references an operation the plugin does not expose |
| `USER.MCP_PROXY.NAME_COLLISION` | Two tools (built-in or plugin, after namespacing) resolve to the same name |
| `USER.MCP_PROXY.EMPTY_PROXY` | `enable: true` + `intercept_builtins: false` + empty/missing `plugin_tools` → effective no-op, explicit error to flag the dead config |
| `USER.MCP_PROXY.UNSUPPORTED_PROVIDER` | Warning only, not a validation failure: the step uses Codex or OpenCode with `intercept_builtins: true`, so a startup warning about coexistence mode is logged |

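A sketch of how most of these rules could be evaluated on an already-parsed block is shown below. `MCPProxy`, `PluginTool`, and `validate` are hypothetical names, `InterceptBuiltins` is assumed to have its default (`true`) resolved by the caller, and `UNKNOWN_KEY` is omitted because it needs access to the raw YAML keys.

```go
package main

import "sort"

// PluginTool and MCPProxy mirror the YAML grammar; the Go names are assumptions.
type PluginTool struct {
	Plugin string
	Expose []string
}

type MCPProxy struct {
	Enable            bool
	InterceptBuiltins bool // caller resolves the default (true) before validating
	PluginTools       []PluginTool
}

var builtins = []string{"Read", "Write", "Edit", "Bash", "Glob", "Grep"}

// validate returns the sorted, de-duplicated error codes for cfg.
// plugins maps each declared plugin name to the operations it offers.
func validate(cfg MCPProxy, plugins map[string][]string) []string {
	found := map[string]bool{}
	seen := map[string]bool{} // tool names visible after namespacing
	if cfg.InterceptBuiltins {
		for _, b := range builtins {
			seen[b] = true
		}
	}
	exposed := 0
	for _, pt := range cfg.PluginTools {
		exposed += len(pt.Expose)
		ops, ok := plugins[pt.Plugin]
		if !ok {
			found["USER.MCP_PROXY.UNKNOWN_PLUGIN"] = true
			continue
		}
		known := map[string]bool{}
		for _, op := range ops {
			known[op] = true
		}
		for _, op := range pt.Expose {
			if !known[op] {
				found["USER.MCP_PROXY.UNKNOWN_OPERATION"] = true
				continue
			}
			name := pt.Plugin + "_" + op
			if seen[name] {
				found["USER.MCP_PROXY.NAME_COLLISION"] = true
			}
			seen[name] = true
		}
	}
	if cfg.Enable && !cfg.InterceptBuiltins && exposed == 0 {
		found["USER.MCP_PROXY.EMPTY_PROXY"] = true
	}
	codes := make([]string, 0, len(found))
	for c := range found {
		codes = append(codes, c)
	}
	sort.Strings(codes)
	return codes
}
```

One collision worth testing explicitly: plugin `a` exposing `b_c` and plugin `a_b` exposing `c` both resolve to `a_b_c`, because the underscore separator is also legal inside names.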
## Per-Provider Injection

### Mode `intercept_builtins: true` (default)

| Provider | Flags / mechanism | MCP-only isolation |
|---|---|---|
| **Claude** | `--mcp-config <path>` + `--tools ""` + `--strict-mcp-config` | Guaranteed |
| **Gemini** | `--mcp-server awf-proxy=<cmd>` + `--allowed-mcp-server-names awf-proxy` + `-e ""` *(fallback `--policy <deny-all-path>` if `-e ""` does not fully disable extensions)* | Validation in Phase 3 |
| **Codex** | `-c 'mcp_servers.awf-proxy.command="<path>"'` + `-c 'mcp_servers.awf-proxy.args=[...]'` + `-s read-only` + system prompt mitigation ("Use only MCP tools, never built-in tools") | Coexistence — built-ins remain accessible. Startup warning emitted. |
| **OpenCode** | `opencode mcp add awf-proxy -- <cmd>` (persistent config, applied before exec) + system prompt mitigation | Coexistence — same as Codex. Startup warning emitted. |
| **OpenAI Compatible** | No CLI flags. Native mechanism: `chatCompletionsRequest.tools[]` carries the 6 built-ins + plugin tools, `tool_choice: "auto"`, `role: "tool"` messages with `tool_call_id`, multi-turn execution loop, SSE delta assembly for `tool_calls`. Loop guard: `len(tool_calls) == 0 && finish_reason == "tool_calls"` → structured error. | Guaranteed (AWF is the HTTP client) |

### Mode `intercept_builtins: false`

Native built-ins remain active; the MCP server is injected **alongside**, carrying only the plugin tools.

| Provider | Difference vs. full-interception mode |
|---|---|
| **Claude** | Drop `--tools ""` and `--strict-mcp-config`. Keep only `--mcp-config <path>`. |
| **Gemini** | Drop `-e ""` and `--allowed-mcp-server-names`. Keep only `--mcp-server awf-proxy=<cmd>`. |
| **Codex** | Identical to full-interception (no `--tools` flag to omit — full-mode was already coexistence). Drop system prompt mitigation. |
| **OpenCode** | Same as Codex. |
| **OpenAI Compatible** | `chatCompletionsRequest.tools[]` only carries plugin tools (no built-ins). |

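For Claude, the difference between the two modes reduces to a single branch. The sketch below uses the flag names exactly as listed in the tables above, without verifying them against the Claude CLI; `claudeProxyArgs` is a hypothetical helper, not the existing `buildExecuteArgs`.

```go
package main

// claudeProxyArgs returns the extra CLI arguments that inject the proxy.
// Flag names are copied from this spec and not verified against the CLI.
func claudeProxyArgs(mcpConfigPath string, interceptBuiltins bool) []string {
	args := []string{"--mcp-config", mcpConfigPath}
	if interceptBuiltins {
		// Full interception: disable native tools and ignore other MCP configs.
		args = append(args, "--tools", "", "--strict-mcp-config")
	}
	return args
}
```

Passing the empty string as its own argv element avoids any shell quoting concerns, since the provider is launched through an argument slice rather than a command line.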
### Subprocess lifecycle (Claude / Gemini / Codex / OpenCode)

```
ToolProxyService.Start(step) {
  1. Build config file (tmp): describes tools to expose
  2. Spawn `awf mcp-serve --config=<tmpfile>` as subprocess
  3. Generate provider-specific MCP config (.json for Claude, etc.)
  4. Return: (mcpConfigPath, cleanupFunc)
}

→ Agent CLI invoked with injected flags pointing to mcpConfigPath
→ Agent connects via stdio to awf mcp-serve subprocess
→ Agent issues tools/list and tools/call via JSON-RPC

ToolProxyService.Close(step) {
  1. Send shutdown to mcp-serve subprocess (SIGTERM)
  2. Wait max 5s for graceful exit
  3. SIGKILL if still alive
  4. Remove tmpfile
}
```

For OpenAI Compatible: no subprocess. The `ToolRouter` is invoked directly in-process by the HTTP provider during its multi-turn loop.

### Startup warning for Codex / OpenCode

When a step launches with `intercept_builtins: true` on Codex or OpenCode, log via zap at `WARN`:

```
WARN: mcp_proxy on provider=codex runs in coexistence mode.
      Built-in tools cannot be disabled and may bypass the proxy.
      Use 'claude' or 'openai-compatible' for guaranteed MCP-only isolation.
```

The user accepts this trade-off implicitly by choosing the provider — no additional opt-in.

## Phasing

| Phase | Deliverable | Effort estimate |
|---|---|---|
| **1 — Foundation + Claude** | `pkg/mcpserver`, `ToolProvider` port, `BuiltinToolProvider`, `ToolRouter`, `ToolProxyService`, `awf mcp-serve` command, Claude injection, YAML schema + validation (codes `UNKNOWN_KEY` and `EMPTY_PROXY`), `intercept_builtins` knob, OTel + logging, `.go-arch-lint.yml` update. End-to-end: a Claude step exercises the 6 built-ins via the proxy. | ~1-2 weeks |
| **2 — Plugin Bridge** | `PluginToolAdapter`, `OperationSchema → JSON Schema` mapping, namespacing `<plugin>_<op>`, collision detection, YAML `plugin_tools:` support, validation codes `UNKNOWN_PLUGIN`, `UNKNOWN_OPERATION`, `NAME_COLLISION`. End-to-end: a Claude step exposes a gRPC plugin's operation as an MCP tool. | ~3-5 days |
| **3 — Multi-provider stdio** | Gemini injection (with `-e ""` validation + `--policy` fallback), Codex injection (coexistence + prompt mitigation), OpenCode injection (`opencode mcp add` + prompt mitigation), startup warning for Codex/OpenCode, validation code `UNSUPPORTED_PROVIDER`. | ~1 week |
| **4 — OpenAI Compatible native tools[]** | `chatCompletionsRequest` extension (`tools[]`, `tool_choice`), `role: "tool"` message support, multi-turn execution loop, SSE delta assembly for `tool_calls`, infinite-loop guard. Reuses `ToolRouter` directly (no subprocess). | ~1 week |

**Total**: ~4-5 weeks for one full-time engineer. Size: **L**.

**Dependencies**: Phases 2, 3, and 4 all depend on Phase 1. Phases 2, 3, and 4 are independent of each other and may be parallelized.

**MVP**: Phases 1 + 2 deliver both priorities on Claude in ~2 weeks (size **M**). The full scope is committed but a partial cut is shippable.

## Acceptance Criteria

| ID | Criterion |
|---|---|
| AC-1 | A step with `mcp_proxy.enable: true` on Claude, Gemini, or OpenAI Compatible exercises `Read` and `Bash` exclusively through `awf mcp-serve` (or in-process `ToolRouter` for OpenAI Compatible). OTel span is emitted as child of the step span. Zap log line is written. |
| AC-2 | A step with `intercept_builtins: false` + `plugin_tools: [{plugin: P, expose: [op]}]` results in the agent seeing native built-ins + the namespaced plugin tool. The plugin tool call dispatches to `OperationProvider.Execute(op, args)`. |
| AC-3 | A name collision between two plugin tools, or between a plugin tool and a built-in, fails at step startup with `USER.MCP_PROXY.NAME_COLLISION`. Runtime collision is impossible. |
| AC-4 | `awf validate` rejects `mcp_proxy:` blocks with unknown keys (`UNKNOWN_KEY`), unknown plugins (`UNKNOWN_PLUGIN`), unknown operations (`UNKNOWN_OPERATION`), and dead configs (`EMPTY_PROXY`). |
| AC-5 | `Ctrl+C` during a step with the proxy active terminates the `awf mcp-serve` subprocess cleanly (no zombies, verified by `pgrep` integration test). |
| AC-6 | A Codex or OpenCode step with `intercept_builtins: true` logs the expected coexistence warning at startup. |
| AC-7 | An OpenAI Compatible step with `tools[]` returning zero tool calls and `finish_reason: "tool_calls"` errors out instead of looping. |
| AC-8 | `make build`, `make lint`, `make lint-arch`, `make test`, `make test-race` all pass with zero violations. |
| AC-9 | `pkg/mcpserver` has zero imports from `internal/` (verified by `make lint-arch`). |

## Risks

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Gemini `-e ""` does not disable extensions as documented | Medium | Medium | Fallback to `--policy <deny-all-path>` validated in Phase 3 before merge |
| Codex / OpenCode prompt mitigation insufficient to prevent native built-in use | High | Low | Accepted trade-off; documented; warning emitted; users in sensitive contexts use Claude or OpenAI Compatible |
| OpenAI Compatible SSE delta assembly subtly broken for multi-chunk `tool_calls` arguments | Medium | High | Integration test with a tool whose args span 2+ chunks; loop guard prevents infinite loop |
| `OperationSchema → JSON Schema` mapping loses information (e.g., complex types, defaults) | Medium | Medium | Phase 2 lands with a curated mapping; unsupported features error explicitly at registration rather than silently dropping |
| Subprocess `awf mcp-serve` orphaned after parent crash | Low | Medium | SIGTERM with 5s timeout then SIGKILL; integration test with `pgrep` verifies no orphans on `Ctrl+C` |

## Future Work (explicitly deferred)

Each item below can be added behind the existing `ToolProvider` port and YAML schema without breaking changes:

- Policy Engine (allow/deny, filesystem sandboxing) — adds a `ToolPolicy` port wrapping `ToolRouter`
- Human-in-the-loop approval — extension of Policy Engine
- Content-addressed result cache — adds a `ToolCache` decorator around `ToolProvider`
- Tool call recorder (JSONL) + `awf playback <id>` command
- Composable middleware chain — adds a `ToolMiddleware` port; chain composed at `ToolRouter` level
- MCP-as-Plugin (external MCP servers as plugins via `type: mcp` in plugin config)
- Bypass detection via NDJSON parsing for Codex/OpenCode
- EventBus events (`tool.call.start/end/denied/bypassed`)
- Snapshot isolation (CoW filesystem overlay) for parallel steps
- Virtual composite tools (pipelines with rollback)

## Metadata

- **Status**: backlog
- **Version**: v0.10.0
- **Priority**: high
- **Estimation**: L (was XL)

## Dependencies

- **Blocked by**: none (gRPC plugin system C066–C069 is a prerequisite for Phase 2 but is already implemented)
- **Unblocks**: future Policy/Cache/Recorder/Middleware features (all designed to plug behind the `ToolProvider` port without breaking changes)

