This protocol standardizes how AutoLoop phases P1 through P13 are described, reviewed, implemented, and reported by AI agents.
- Applies to phase plans and status outputs for `P1..P13`.
- Applies to both design docs and execution reports.
- Uses neutral software language only.
Every AI output for a phase MUST follow this structure:
1. `phase_meta`
2. `objective`
3. `invariants`
4. `module_scope`
5. `data_models`
6. `flow`
7. `guardrails`
8. `tests`
9. `acceptance`
10. `rollback`
11. `dependency_graph`
12. `status`
{
"phase_meta": {
"phase_id": "P7",
"title": "Identity and Tenant Sovereignty",
"priority": "highest",
"version": "v1",
"owner": "autoloop-core"
},
"objective": {
"problem": "what gap is closed",
"outcome": "observable engineering outcome"
},
"invariants": [
"hard rule 1",
"hard rule 2"
],
"module_scope": {
"write_modules": ["src/security", "src/runtime"],
"read_modules": ["src/session", "src/contracts"],
"external_surfaces": ["CLI", "dashboard", "state_store adapter"]
},
"data_models": [
{
"name": "SessionLease",
"kind": "table|dto|event",
"keys": ["session_id", "tenant_id"],
"append_only": true
}
],
"flow": {
"trigger": "entry trigger",
"steps": [
"step 1",
"step 2",
"step 3"
],
"exit_conditions": ["condition A", "condition B"]
},
"guardrails": {
"runtime_gate": ["budget", "timeout", "sandbox", "token"],
"policy_gate": ["approval", "permission", "tenant boundary"],
"audit_trace": ["trace_id", "session_id", "task_id", "capability_id", "version"]
},
"tests": {
"unit": ["unit case 1"],
"integration": ["integration case 1"],
"e2e": ["e2e case 1"]
},
"acceptance": [
"measurable criterion 1",
"measurable criterion 2"
],
"rollback": {
"trigger": "when to rollback",
"action": "how to rollback",
"target_version": "v1"
},
"dependency_graph": {
"depends_on": ["P6"],
"unblocks": ["P8", "P9"]
},
"status": {
"stage": "planned|in_progress|done|blocked",
"completion": 0.0,
"risks": ["risk 1"],
"next_actions": ["next action 1"]
}
}

Use this prompt for any model:
You are producing an AutoLoop phase report under P1-P13 Unified Specification Protocol v1.
Output valid JSON only.
Follow the required keys exactly:
phase_meta, objective, invariants, module_scope, data_models, flow, guardrails, tests, acceptance, rollback, dependency_graph, status.
Use measurable acceptance criteria.
Use explicit module paths.
Use append-only traceability language where applicable.
Do not add political metaphors or non-technical labels.
| Phase | Core Target | Primary Modules | Core Acceptance |
|---|---|---|---|
| P1 | State machine lock | src/session, src/orchestration | legal transitions only + reject/revise loops |
| P2 | Runtime guard mandatory | src/runtime, src/tools, src/providers | all execution through unified gate |
| P3 | Traceable append-only events | src/observability, adapter, state_store | replayable chain with full IDs |
| P4 | 8-layer implementation alignment | src/lib.rs + layer modules | each layer has explicit interface mapping |
| P5 | Integration and escape tests | tests/ | no runtime bypass + stable matrix |
| P6 | Gray rollout | deploy/config, cli system | shadow/canary/full/rollback operable |
| P7 | Identity and tenant sovereignty | src/security, src/session, src/runtime | cross-tenant access denied |
| P8 | Cost and budget ledger | src/runtime, src/observability, adapter | task-level cost decomposition + reconciliation |
| P9 | Capability supply chain trust | src/tools, src/providers, src/security | only verified+trusted+active execution |
| P10 | Deterministic replay boundary | src/observability/event_stream, src/runtime | snapshot replay + explainable drift |
| P11 | Recovery/degrade/chaos | src/runtime, src/orchestration, deploy/ | MTTR + degrade success targets met |
| P12 | Governed learning loop | src/memory, src/rag, src/hooks, src/security | learning must pass gate before promotion |
| P13 | Business loop mapping | src/observability, dashboard-ui, src/lib.rs | revenue-cost-profit-risk traceable to task |
- Receive approval/rejection command.
- Bind `session_id` + reason + actor context.
- Emit policy decision event.
- Trigger retry/replan or continue.
- Load policy bindings and constraints.
- Evaluate intent, risk class, budget class.
- Produce `approved/rejected/revise`.
- Persist decision artifact.
- Convert approved intent into execution plan.
- Build dependency graph and task envelopes.
- Route to stable/adaptive pools.
- Push execution queue with trace IDs.
- Pick executable capability from catalog.
- Hand off to runtime kernel only.
- Record outcome and raw feedback.
- Return structured run receipt.
- Enforce identity/tenant and lease validity.
- Enforce CPU/memory/timeout/token/I/O budgets.
- Enforce circuit breaker state machine.
- Execute, settle ledger, emit runtime events.
- Validate output quality and policy compliance.
- Mark pass/iterate/reject and attach evidence.
- Emit audit events and blocker records.
- Route reject paths back to planning loop.
- Ingest verified evidence and witness logs.
- Build/refine entity-relation memory graph.
- Update skill/causal/session strategy memory.
- Publish learning deltas for next routing cycle.
- Aggregate session/task/capability metrics.
- Emit dashboard snapshots and replay streams.
- Publish cost-risk-margin-SLA reports.
- Support operator forensics and replay.
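
The runtime-kernel steps in the list above (identity/tenant and lease validity, budget enforcement, circuit breaker state, then execution) can be sketched as a single fail-closed gate. This is a minimal illustration under stated assumptions, not AutoLoop's implementation: `Lease`, `TaskEnvelope`, `runtime_gate`, the field names, and the reason strings are all hypothetical, and only a token budget is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    # Hypothetical lease record; field names are assumptions.
    session_id: str
    tenant_id: str
    expires_at: float  # epoch seconds


@dataclass(frozen=True)
class TaskEnvelope:
    # Hypothetical task envelope; only the fields the gate needs.
    trace_id: str
    session_id: str
    tenant_id: str
    token_budget: int      # tokens the task may spend
    tokens_requested: int  # tokens the task asks for


def runtime_gate(task: TaskEnvelope, lease: Lease, breaker_open: bool, now: float) -> tuple[bool, str]:
    """Return (allowed, reason). Checks run in a fixed order and fail closed."""
    if task.tenant_id != lease.tenant_id or task.session_id != lease.session_id:
        return False, "tenant_or_session_mismatch"
    if now >= lease.expires_at:
        return False, "lease_expired"
    if task.tokens_requested > task.token_budget:
        return False, "budget_exceeded"
    if breaker_open:
        return False, "circuit_open"
    return True, "ok"
```

Returning a reason alongside the decision keeps every rejection explainable, which fits the requirement to emit runtime events for each execution.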
- Any missing required key => invalid protocol output.
- Any metric without trace IDs => invalid.
- Any execution path bypassing runtime gate => invalid.
- Any cross-tenant access success => invalid.
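
The first two rules above can be checked statically against a parsed report. The sketch below is one possible validator, not a reference implementation: `validate_report` is a hypothetical name, and treating `guardrails.audit_trace` as the place where trace IDs are declared is an assumption based on the template. The runtime-bypass and cross-tenant rules cannot be verified from the JSON alone and need the tests described for P5 and P7.

```python
REQUIRED_KEYS = (
    "phase_meta", "objective", "invariants", "module_scope", "data_models", "flow",
    "guardrails", "tests", "acceptance", "rollback", "dependency_graph", "status",
)
# IDs listed under guardrails.audit_trace in the template.
AUDIT_TRACE_IDS = ("trace_id", "session_id", "task_id", "capability_id", "version")


def validate_report(report: dict) -> list[str]:
    """Return a list of violations; an empty list means the static checks passed."""
    errors = [f"missing required key: {key}" for key in REQUIRED_KEYS if key not in report]
    guardrails = report.get("guardrails")
    # Assumption: trace IDs are declared in guardrails.audit_trace.
    declared = guardrails.get("audit_trace", []) if isinstance(guardrails, dict) else []
    missing_ids = [trace_id for trace_id in AUDIT_TRACE_IDS if trace_id not in declared]
    if missing_ids:
        errors.append("audit_trace missing IDs: " + ", ".join(missing_ids))
    return errors
```

A caller would parse the model output with `json.loads` first and reject the report if the returned list is non-empty.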
- Pick a phase (`P1..P13`).
- Ask model to output JSON using Section 4 prompt.
- Validate keys against Section 2.
- Compare acceptance criteria against Section 5 and current tests.
- Store report as append-only artifact.
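
The last step, storing the report as an append-only artifact, could look like the following sketch. The JSON Lines layout, the `append_report` name, and the SHA-256 digest per entry are assumptions for illustration; the protocol itself only requires that stored reports are never rewritten.

```python
import hashlib
import json
from pathlib import Path


def append_report(log_path: Path, report: dict) -> str:
    """Append one report as a single JSON line and return its content digest."""
    # Canonical serialization so the same report always yields the same digest.
    canonical = json.dumps(report, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    entry = {"sha256": digest, "report": report}
    # Mode "a" only ever adds to the end of the file; earlier lines are untouched.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, sort_keys=True) + "\n")
    return digest
```

Storing the digest next to each report lets a later reader detect whether a line was altered after it was written.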
Each phase has a concise core flow for implementation and review:
- P1 State Machine Lock
  - Define states/signals -> enforce legal transitions only -> persist transition audit -> expose reject/revise loop.
- P2 Runtime Guard
  - Build task envelope -> enforce identity/budget/sandbox/token gate -> execute through single kernel path -> emit runtime decision.
- P3 Traceable Event Model
  - Normalize IDs (`trace/session/task/capability/version`) -> append event stream -> build read views -> verify replay chain integrity.
- P4 8-Layer Mapping
  - Map interfaces to layers -> wire layer boundaries in code -> enforce contracts in orchestration path -> validate no cross-layer bypass.
- P5 Integration and Escape Tests
  - Build transition matrix tests -> run bypass/escape guards -> run reject/revise e2e -> assert performance and stability baseline.
- P6 Gray Rollout
  - Shadow observe -> canary enforce (10%-30%) -> full enforce -> keep rollback switch/version target.
- P7 Identity and Tenant Sovereignty
  - Authenticate request -> bind tenant/principal/policy/lease -> runtime validates context every execution -> reject cross-tenant access.
- P8 Budget and Cost Ledger
  - Precharge budget -> execute -> settle ledger + attribution -> reconcile account and append cost reports.
- P9 Capability Supply Chain Trust
  - Register capability artifact -> verify signature/provenance/SBOM -> approve into active catalog -> runtime re-check before invoke.
- P10 Deterministic Replay Boundary
  - Snapshot inputs/routes/digests -> replay under fixed boundary -> detect deviation -> emit explainable replay report.
- P11 Recovery and Degrade Sovereignty
  - Detect fault -> switch degrade profile -> recover/circuit reset -> trigger manual takeover point when needed.
- P12 Governed Learning Loop
  - Propose learning delta -> verifier gate -> limited trial -> promote or rollback -> commit to long-term memory/graph.
- P13 Business Loop Mapping
  - Create work order -> aggregate cost + revenue -> produce margin/SLA/risk outputs -> trace every metric back to task.
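
As one worked instance of these flows, the P8 sequence (precharge, execute, settle, reconcile) can be sketched as a small in-memory ledger. Everything here is hypothetical: the `BudgetLedger` class, its method names, and the single numeric budget are assumptions made for illustration, and a real ledger would persist its entries and attribute costs in more detail.

```python
class BudgetLedger:
    """Append-only cost ledger for one budget account (illustrative only)."""

    def __init__(self, limit: float) -> None:
        self.limit = limit
        self.reserved: dict[str, float] = {}  # task_id -> precharged amount
        self.spent = 0.0
        # Entries are only ever appended, never edited: (kind, task_id, amount).
        self.entries: list[tuple[str, str, float]] = []

    def precharge(self, task_id: str, estimate: float) -> None:
        """Reserve budget before execution; refuse if the limit would be exceeded."""
        if self.spent + sum(self.reserved.values()) + estimate > self.limit:
            raise RuntimeError(f"budget exceeded for task {task_id}")
        self.reserved[task_id] = estimate
        self.entries.append(("precharge", task_id, estimate))

    def settle(self, task_id: str, actual: float) -> None:
        """Release the reservation and record the actual cost against the task."""
        self.reserved.pop(task_id)  # raises KeyError if the task was never precharged
        self.spent += actual
        self.entries.append(("settle", task_id, actual))

    def reconcile(self) -> bool:
        """True when no reservation is open and settled entries sum to the spent total."""
        settled = sum(amount for kind, _, amount in self.entries if kind == "settle")
        return not self.reserved and abs(settled - self.spent) < 1e-9
```

Because every entry carries a `task_id`, each cost figure can be traced back to the task that incurred it, which is what the P8 acceptance criterion asks for.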
The full system loop is:
Understand -> Plan -> Execute -> Verify -> Learn -> Evolve -> Repeat
Mapped to phases:
- Understand: P1, P7, P4
  - Intake state + identity binding + layer contract context.
- Plan: P1, P4, P6
  - Policy review, orchestration, rollout strategy.
- Execute: P2, P8, P9, P11
  - Guarded execution, cost controls, trusted capability path, degrade handling.
- Verify: P3, P5, P10
  - Append-only trace, integration checks, deterministic replay and drift analysis.
- Learn: P12
  - Evidence-gated learning proposals and promotion pipeline.
- Evolve: P6, P9, P12, P13
  - Controlled rollout, capability governance, skill evolution, business optimization.
- Repeat: P3 + P1 baseline
  - Persisted event history feeds the next cycle with traceable state and constraints.
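
The mapping above can be kept as data and checked so that no phase falls outside the loop. This is a small sketch; `LOOP_TO_PHASES` and `uncovered_phases` are hypothetical names, and the dictionary simply transcribes the mapping listed above.

```python
# Loop stage -> phases, transcribed from the mapping above.
LOOP_TO_PHASES = {
    "Understand": ["P1", "P7", "P4"],
    "Plan": ["P1", "P4", "P6"],
    "Execute": ["P2", "P8", "P9", "P11"],
    "Verify": ["P3", "P5", "P10"],
    "Learn": ["P12"],
    "Evolve": ["P6", "P9", "P12", "P13"],
    "Repeat": ["P3", "P1"],
}

ALL_PHASES = [f"P{n}" for n in range(1, 14)]  # P1..P13


def uncovered_phases(mapping: dict[str, list[str]]) -> list[str]:
    """Return the phases that no loop stage references, in phase order."""
    covered = {phase for phases in mapping.values() for phase in phases}
    return [phase for phase in ALL_PHASES if phase not in covered]


print(uncovered_phases(LOOP_TO_PHASES))  # prints [] because every phase is mapped
```

Running the check again whenever the mapping changes guards against a phase being dropped from the loop by accident.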
- Request enters with identity and constraints (`P1/P7`).
- Plan and route are generated under layer contracts (`P4`).
- Runtime kernel executes with hard gates (`P2/P8/P9`).
- Verifier/audit evaluates outcomes and replay evidence (`P3/P5/P10`).
- Recovery/degrade triggers when needed (`P11`).
- Learning gate decides promotion or rollback (`P12`).
- Business reports update margin/SLA/risk and close loop (`P13`).
- Next cycle starts with updated memory, capability posture, and rollout policy (`P6`).