feat(orchestration): docs/12 spec/planner/executor pipeline + audit fixes by devteapot · Pull Request #1 · devteapot/sloppy

devteapot · 2026-04-30T11:22:11Z

Summary

Implement docs/12 orchestration design: spec/planner/executor agent pipeline with autonomous coordinator, plan lifecycle, gates, drift detection, digests, precedents, goals, messages, and verification.
Add session provider/runtime integration, dashboard control surface, and TUI session view.
Address docs/12 audit fixes across descriptors, repository, budget, classifiers, and policy rules.
Extensive test coverage: docs12 orchestration, autonomous goal/executor, agent session provider, planning policy, and transition tests.

Testing

Not run (not requested)

Phase A of docs/15-executor-routing.md. Replaces the parallel `model?: string` / `executionMode?: string` shortcuts on the sub-agent spawn path with a single typed `executor?: ExecutorBinding` (`{kind:"llm",profileId,modelOverride?} | {kind:"acp",adapterId,timeoutMs?}`) resolved through one `ExecutorResolver`. Specialists, tasks, role defaults, overlays, and gates land in later phases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
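The binding described above can be sketched as a discriminated union resolved in one place. Only the `ExecutorBinding` shape and the `ExecutorResolver` name come from the commit message; `ResolvedExecutor`, `ExecutorRegistry`, the fallback behaviour, and the sample profile and adapter ids are assumptions made for illustration, not the PR's actual code.

```typescript
// Binding shape as given in the commit message.
type ExecutorBinding =
  | { kind: "llm"; profileId: string; modelOverride?: string }
  | { kind: "acp"; adapterId: string; timeoutMs?: number };

// Hypothetical resolved form; the PR does not show the resolver's output type.
type ResolvedExecutor =
  | { kind: "llm"; profileId: string; model: string }
  | { kind: "acp"; adapterId: string; timeoutMs: number };

// Hypothetical registry of known LLM profiles and ACP adapters.
interface ExecutorRegistry {
  llmProfiles: Record<string, { defaultModel: string }>;
  acpAdapters: Record<string, { defaultTimeoutMs: number }>;
  fallback: ExecutorBinding; // assumed: used when a spawn carries no binding
}

class ExecutorResolver {
  private readonly registry: ExecutorRegistry;

  constructor(registry: ExecutorRegistry) {
    this.registry = registry;
  }

  // Single entry point: every spawn path resolves its binding here.
  resolve(binding?: ExecutorBinding): ResolvedExecutor {
    const b = binding ?? this.registry.fallback;
    if (b.kind === "llm") {
      const profile = this.registry.llmProfiles[b.profileId];
      if (!profile) throw new Error(`unknown LLM profile: ${b.profileId}`);
      return {
        kind: "llm",
        profileId: b.profileId,
        model: b.modelOverride ?? profile.defaultModel,
      };
    }
    const adapter = this.registry.acpAdapters[b.adapterId];
    if (!adapter) throw new Error(`unknown ACP adapter: ${b.adapterId}`);
    return {
      kind: "acp",
      adapterId: b.adapterId,
      timeoutMs: b.timeoutMs ?? adapter.defaultTimeoutMs,
    };
  }
}

// Usage with made-up ids.
const resolver = new ExecutorResolver({
  llmProfiles: { default: { defaultModel: "model-a" } },
  acpAdapters: { "local-acp": { defaultTimeoutMs: 30_000 } },
  fallback: { kind: "llm", profileId: "default" },
});

console.log(resolver.resolve());
console.log(resolver.resolve({ kind: "acp", adapterId: "local-acp", timeoutMs: 5_000 }));
```

Keeping the `kind` tag on both the binding and the resolved value lets callers narrow with a plain `if` or `switch`, which is the usual reason to prefer one typed union over parallel optional string fields.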

Tightens the docs/12 HITL substrate against an audit pass. Each fix targets a documented invariant the prior implementation did not enforce:

- Plan revision accept supersedes prior-revision non-terminal tasks so completion gating reflects the active "complete slice set."
- Plans can only reference accepted spec versions; drafts/active/archived are rejected at create_plan_revision and assertPlanSpecFresh.
- Evidence claims validate criterion_id membership against the slice's acceptance_criteria; replayable kind requires at least one replayable ref; failing replayable checks (exit != 0) cannot satisfy criteria.
- Gate acceptance now reverts to open if the resolution handler throws, closing the accepted-but-unapplied window.
- generateDigest decrements activeGenerationCount in finally, fixing a liveness leak that disabled triggered digests until restart.
- Tasks tagged with plan_revision_id; new listActiveRevisionTaskIds scopes final audit, /orchestration root, retry-budget, digest counts, and drift progress to the active revision (excluding superseded).
- create_task retries inherit docs/12 fields and acceptance criteria from the source slice.
- New observed_only_coverage warning drift event surfaces in digest near_misses; aggregates per-criterion across rows and consults prior claims; legacy record_verification path also runs through drift.
- Legacy record_verification no longer treats skipped as criterion-satisfying.

Tests: regression coverage added for each fix in tests/docs12-orchestration.test.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
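The three evidence-claim rules in the list above can be sketched as a single check. The field names `criterion_id` and `acceptance_criteria` come from the commit message; the types, the `checkEvidenceClaim` function, the `"observed"` kind name, and the choice to let non-replayable claims count as satisfying are assumptions for illustration only.

```typescript
// Hypothetical shapes; the PR does not show the real types.
interface EvidenceRef {
  replayable: boolean;
  exitCode?: number; // exit status of a replayable check, if it was run
}

interface EvidenceClaim {
  criterion_id: string;
  kind: "replayable" | "observed"; // "observed" is an assumed name
  refs: EvidenceRef[];
}

interface Slice {
  acceptance_criteria: { id: string }[];
}

type ClaimCheck =
  | { ok: true; satisfies: boolean } // claim is well-formed; may or may not satisfy
  | { ok: false; error: string };    // claim is rejected outright

function checkEvidenceClaim(slice: Slice, claim: EvidenceClaim): ClaimCheck {
  // Rule 1: criterion_id must belong to this slice's acceptance criteria.
  const known = slice.acceptance_criteria.some((c) => c.id === claim.criterion_id);
  if (!known) {
    return { ok: false, error: `criterion ${claim.criterion_id} is not on this slice` };
  }

  if (claim.kind === "replayable") {
    // Rule 2: a replayable claim needs at least one replayable ref.
    const replayableRefs = claim.refs.filter((r) => r.replayable);
    if (replayableRefs.length === 0) {
      return { ok: false, error: "replayable claim has no replayable ref" };
    }
    // Rule 3: any failing replayable check (exit != 0) cannot satisfy the criterion.
    const allPassed = replayableRefs.every((r) => r.exitCode === 0);
    return { ok: true, satisfies: allPassed };
  }

  // Assumption: other kinds are accepted as-is in this sketch.
  return { ok: true, satisfies: true };
}

// Usage with made-up data.
const exampleSlice: Slice = { acceptance_criteria: [{ id: "c1" }] };
console.log(
  checkEvidenceClaim(exampleSlice, {
    criterion_id: "c1",
    kind: "replayable",
    refs: [{ replayable: true, exitCode: 1 }],
  }),
);
```

Separating "rejected" from "accepted but not satisfying" keeps a failed check on record without letting it close the criterion.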

Wires LLM-driven role-scoped sub-agents on top of the docs/12 substrate.

Phase 1 (autonomous executor):

- executor_binding flows through PlanSliceInput / TaskDefinition / scheduler to delegation spawn, so plans can route each slice to an LLM or ACP executor.
- Scheduler tags every dispatched spawn with role: "executor" and the per-slice binding.
- roleId plumbed end-to-end: spawn_agent affordance → DelegationAgentSpawn → SubAgentRunner → SessionRuntime (sub-agents now actually carry a role instead of running with the default).
- EXECUTOR_PROMPT rewritten with the submit_evidence_claim contract, criterion-mapping rules, and hard rules (no spec/plan/goal authoring, no irreversibles without a gate).
- Hub-layer executorRoleRule denies /specs, /goals, plan-revision, and delegation.spawn_agent for the executor role.

Phase 2 (autonomous spec-agent and planner):

- Goal.autonomous flag; create_goal accepts autonomous: true.
- AutonomousGoalCoordinator watches /goals + /gates + /specs and spawns a spec-agent on autonomous goal creation; spawns the planner when the matching spec_accept gate is accepted. The plan_accept → executor flow is unchanged.
- SPEC_AGENT_PROMPT and PLANNER_PROMPT fleshed out with concrete contracts. specAgentRoleRule and plannerRoleRule enforce role boundaries (spec-agent can't author plans/evidence/goals; planner can't author specs/evidence/goals; neither mutates workspace files).

Tests:

- tests/orchestration-executor-autonomy.test.ts: scheduler dispatch with role=executor, mock executor submits evidence and slice gate auto-accepts; per-slice executor_binding propagation.
- tests/orchestration-autonomous-goal.test.ts: autonomous goal spawns spec-agent → opens spec gate → user accepts → planner spawned; non-autonomous goals don't spawn anything.

Out of scope (deferred): real LLM-driven sub-agent execution (tests use mock runners), cron/daily digest cadence, USD cost calculation, off-plan slice intent-drift detection, docs/13 specialist routing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
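A role rule of the kind described for the executor can be sketched as a function that returns a denial or abstains. The `executorRoleRule` name and its denied targets (/specs, /goals, plan revisions, delegation.spawn_agent) come from the commit message; the `Invocation` shape, the `Verdict` type, the `evaluate` helper, the default-allow behaviour, and the exact action strings are assumptions, since the PR page does not show the hub's policy interface.

```typescript
// Hypothetical view of one affordance call as a policy rule might see it.
interface Invocation {
  role?: string;  // e.g. "executor", "spec-agent", "planner"
  path: string;   // provider state path, e.g. "/specs/spec-1"
  action: string; // assumed qualified action name, e.g. "delegation.spawn_agent"
}

type Verdict = { allow: true } | { allow: false; reason: string };

// A rule returns a denial, or undefined when it has no opinion.
type RoleRule = (inv: Invocation) => Verdict | undefined;

// True when `path` is the prefix itself or nested under it.
const underPath = (path: string, prefix: string): boolean =>
  path === prefix || path.startsWith(prefix + "/");

const executorRoleRule: RoleRule = (inv) => {
  if (inv.role !== "executor") return undefined; // rule only applies to executors

  const deniedPaths = ["/specs", "/goals"];
  // Assumed action names standing in for "plan-revision" and spawn_agent.
  const deniedActions = ["create_plan_revision", "delegation.spawn_agent"];

  const hitPath = deniedPaths.find((p) => underPath(inv.path, p));
  if (hitPath) {
    return { allow: false, reason: `executor may not act on ${hitPath}` };
  }
  if (deniedActions.includes(inv.action)) {
    return { allow: false, reason: `executor may not invoke ${inv.action}` };
  }
  return undefined;
};

// Assumed evaluation order: first denial wins, otherwise allow.
function evaluate(rules: RoleRule[], inv: Invocation): Verdict {
  for (const rule of rules) {
    const verdict = rule(inv);
    if (verdict && !verdict.allow) return verdict;
  }
  return { allow: true };
}

// Usage with made-up paths.
console.log(evaluate([executorRoleRule], { role: "executor", path: "/specs/s1", action: "update" }));
console.log(evaluate([executorRoleRule], { role: "executor", path: "/tasks/t1", action: "submit_evidence_claim" }));
```

specAgentRoleRule and plannerRoleRule would follow the same shape with their own denied targets.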

devteapot and others added 16 commits April 25, 2026 19:54

docs: extend orchestration spec, add meta-runtime and agent identity

773cfd6

feat(config): add ACP adapter capability declarations

8567198

feat(orchestration): start autonomous coordinator at runtime attach

f9983e5

docs(orchestration): harden autonomous role prompts

d2193fd

feat: enforce acp adapter capabilities

4661d39

Add autonomous runtime smoke wiring

47edf7c

Harden autonomous coordinator eventing

269521c

Harden autonomous spawn idempotency

0471c09

Add autonomous lifecycle executor tracking

413b672

Move executor lifecycle tracking to scheduler

f064c61

Track autonomous executor lifecycle completion

3ed0cdc

Harden autonomous executor lifecycle

752daf7

fix: harden orchestration approval flow

f5a683c

devteapot wants to merge 16 commits into main from docs12-audit-fixes

2 participants
