Skip to content

test(cockpit): aimock e2e — c-subagents (Phase 3)#364

Merged
blove merged 9 commits into
mainfrom
claude/cockpit-aimock-c-subagents
May 16, 2026
Merged

test(cockpit): aimock e2e — c-subagents (Phase 3)#364
blove merged 9 commits into
mainfrom
claude/cockpit-aimock-c-subagents

Conversation

@blove
Copy link
Copy Markdown
Contributor

@blove blove commented May 16, 2026

Summary

Adds a per-example aimock e2e for `c-subagents` (orchestrator LLM with a `task` tool that dispatches subagents). Third per-example spec under the harness library landed in Phase 2 (#356).

What changed

  • New per-example dir at `cockpit/chat/subagents/angular/e2e/` with config, fixture, capture script, spec, tsconfig, .gitignore.
  • Per-example langgraph port 8125 (streaming=8123, tool-calls=8124, subagents=8125). Proxy.conf.json target updated to match.
  • Capture via aimock --record mode — proxies the real langgraph dev against real OpenAI through aimock, captures EVERY LLM call at the HTTP boundary. 9 fixture entries: orchestrator's 3 `task` tool_calls, each subagent's tool round-trip (`get_airport_info`, `find_routes`), and continuation. The previous direct-LLM-invoke approach (Phase 2 c-tool-calls pattern) couldn't capture the subagent LLM calls dispatched inside the `task` tool.
  • CI loop updated to include `cockpit-chat-subagents-angular`.

Sits on Phase 2 (#356) + c-* aviation refactor PR 1 (#347).

New capture pattern (reusable)

Future cockpit examples with nested LLM flows (c-interrupts when refactored, c-generative-ui dashboard) use the same aimock --record approach. Pattern: start aimock with `--record --provider-openai`, start langgraph dev with `OPENAI_BASE_URL` pointed at aimock, submit a run via the LangGraph SDK HTTP API, poll until success, aimock writes the fixture file.

Note: the underlying aimock CLI binary is `llmock` (legacy alias), not `aimock` (primary, requires `--config `). Both ship in the same npm package.

Spec assertion shape

Same as chat-aimock Phase 2d (research-subagent): asserts on the durable tool-call chip ("Called task" button rendered by chat-tool-calls primitive) + a content phrase from the captured continuation. Avoided `` because that primitive only renders while a subagent is in RUNNING state — once subagents complete (which is the state `sendPromptAndWait` returns at), the cards are filtered out of the DOM.

Test plan

  • Pilot spec passes 3/3 stability runs locally (2/3 in one attempt — port-collision flake addressed by CI's `retries: 2`)
  • No harness library changes (the Phase 2 library handles this scenario as-is)
  • Working tree clean
  • CI green on this PR

Spec: `docs/superpowers/specs/2026-05-16-cockpit-aimock-c-subagents-design.md`
Plan: `docs/superpowers/plans/2026-05-16-cockpit-aimock-c-subagents.md`

blove added 9 commits May 16, 2026 07:08
…oundary

First implementer attempt (Task 3 direct LLM invocation) failed because the
c-subagents `task` tool dispatches to subagent functions that each run
their own LLM-driven agent loops. Aimock 404s on the un-captured subagent
LLM calls at replay time.

Replace the direct-LLM-invocation capture with aimock --record mode: run
real langgraph dev against aimock proxying to real OpenAI. Captures every
LLM call in the full graph (orchestrator + each subagent + nested tool
sub-rounds) at the HTTP boundary. Reusable pattern for future multi-LLM
examples (c-interrupts, c-generative-ui dashboard).
…proach

Switch from a direct-LLM Python script to a shell script that proxies the
real langgraph dev server through aimock in --record mode. Captures every
LLM call uniformly (orchestrator + each subagent's nested calls + tool
sub-rounds) at the HTTP layer.

Fixture: 9 entries covering the orchestrator's three task dispatches plus
each subagent's tool round-trip (research/booking/itinerary).
@vercel
Copy link
Copy Markdown

vercel Bot commented May 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
cacheplane Ready Ready Preview, Comment May 16, 2026 4:32pm

Request Review

@blove blove merged commit e297d49 into main May 16, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant