Release Checklist

Gating criteria for releasing the dual-mode Agent Mail (MCP server + CLI).

Primary Beads: br-3vwi.12.1, br-3vwi.12.2 Track: br-3vwi.12 (Rollout governance, release gates, feedback loop) Last Updated: 2026-02-13

Staged Rollout Gate Matrix

Phase 0 packet complete: CI + local validation evidence attached
Phase 1 packet complete: 24-48h canary metrics + incident log review attached
Phase 2 packet complete: 25%/50%/100% ring promotions each signed separately
Phase 3 packet complete: GA sign-off + ongoing monitoring owners recorded
Kill-switch owner for each V2 surface is named and on-call reachable
Rollback communication template reviewed and ready for use

Measurable Promotion Criteria (All phases)

Gate family	Hard threshold (promotion blocker)	Machine-check source	Artifact evidence
Unit + integration correctness	Pass rate = `100%` (`fail=0`)	`cargo test --workspace` and CI gate entry `Unit + integration tests=status:pass`	CI logs + gate report JSON
Dual-mode E2E correctness	Pass rate = `100%` (`fail=0`) for `E2E dual-mode` and `E2E mode matrix`	`am e2e run --project . dual_mode`, `am e2e run --project . mode_matrix`, and CI gate report	`tests/artifacts/dual_mode/*/run_summary.json`
Security/privacy	Pass rate = `100%` (`fail=0`) for `E2E security/privacy`	`am e2e run --project . security_privacy` and CI gate report	`tests/artifacts/security_privacy//`
Accessibility	Pass rate = `100%` (`fail=0`) for `E2E TUI accessibility`	`am e2e run --project . tui_a11y` and CI gate report	`tests/artifacts/tui_a11y//`
Cross-platform native command portability	Pass rate = `100%` (`fail=0`) for native command matrix on Linux/macOS/Windows	CI job `native-command-matrix` in `.github/workflows/ci.yml`	`tests/artifacts/cli/native_command_matrix/<os>/summary.json`
Performance budgets	`perf_security_regressions=status:pass` + `perf_guardrails=status:pass` with no budget/delta violations	`cargo test -p mcp-agent-mail-cli --test perf_security_regressions -- --nocapture`, `cargo test -p mcp-agent-mail-cli --test perf_guardrails -- --nocapture`, and CI gate report	`tests/artifacts/cli/perf_security/`, `tests/artifacts/cli/perf_guardrails/`, benchmark artifacts
Determinism	Golden/export checks report zero mismatches	`am golden verify` and static export tests	`benches/golden/checksums.sha256`, `tests/artifacts/share//`
Automation/governance	CI report has `decision=\"go\"`, `release_eligible=true`, and sign-off row completed	`am ci --report tests/artifacts/ci/gate_report.json`	`tests/artifacts/ci/gate_report.json`, sign-off ledger row

Gate-to-Bead Evidence Map

Gate family	Required bead outputs	Evidence path
Unit/integration + harness coverage	`br-3vwi.10` track outputs (`mode_matrix_harness`, `semantic_conformance`)	CI logs + `tests/artifacts/dual_mode/*`
Security/privacy	`br-3vwi.10.14` security/privacy E2E suite	`tests/artifacts/security_privacy/*`
Accessibility	`br-3vwi.10.13` keyboard/focus/contrast suite	`tests/artifacts/tui_a11y/*`
Cross-platform portability	`br-3lc7f` native command matrix evidence	`tests/artifacts/cli/native_command_matrix/<os>/summary.json`
Performance	`br-3vwi.10.11` perf regression script pack	perf regression logs and trend artifacts
Deterministic replay/export	`br-3vwi.10.19` + `br-3vwi.10.22`	replay artifacts + share/export artifacts
Rollout governance + operator readiness	`br-3vwi.11.1` + `br-3vwi.12.1` + `br-3vwi.12.2`	`docs/ROLLOUT_PLAYBOOK.md`, `docs/RELEASE_CHECKLIST.md`, CI gate report

Functional Readiness

Dual-Mode Interface (ADR-001)

MCP binary (mcp-agent-mail) denies CLI-only commands with exit 2
CLI binary (am) accepts all 22+ command families
Denial message includes command name, allowed commands, and remediation hint
No env variable (INTERFACE_MODE, etc.) can bypass the denial gate
Case variants of allowed commands are denied (e.g., Serve, CONFIG)
mcp-agent-mail serve --help exits 0
mcp-agent-mail config exits 0
All CLI parity commands implemented (messaging, contacts, reservations, agents, tooling)

TUI Screens

Tests

Dual-Mode Test Suites

Mode matrix harness: 22 CLI-allow + 16 MCP-deny + 2 MCP-allow

cargo test -p mcp-agent-mail-cli --test mode_matrix_harness

Semantic conformance: 10 SC tests (DB parity, validation, drift report)
```
cargo test -p mcp-agent-mail-cli --test semantic_conformance
```
Perf/security regressions: 13 tests (latency budgets, bypass attempts)
```
cargo test -p mcp-agent-mail-cli --test perf_security_regressions
```
Perf migration guardrails: native-vs-legacy budgets + unavailable rationale capture
```
cargo test -p mcp-agent-mail-cli --test perf_guardrails
```

Help snapshots match golden fixtures

cargo test -p mcp-agent-mail-cli --test help_snapshots

E2E dual-mode: 84+ assertions (7 sections)
```
am e2e run --project . dual_mode
```
E2E mode matrix: 42+ assertions
```
am e2e run --project . mode_matrix
```

Performance

Startup probes complete in <2 seconds
Event ring buffer bounded (no memory leak under load)
Commit coalescer batches effectively under load
DB pool sized appropriately (25 + 75 overflow)
No sustained lock contention at steady state

Documentation

Artifact Sanity Checks

Before release, verify test artifacts are consistent and complete:

# 1. Dual-mode E2E artifacts exist and show 0 failures
ls tests/artifacts/dual_mode/*/run_summary.json
cat tests/artifacts/dual_mode/*/run_summary.json
# e2e_fail must be 0, e2e_pass >= 84

# 2. No failure bundles generated
ls tests/artifacts/dual_mode/*/failures/
# Should be empty (no fail_*.json files)

# 3. Per-step structured logs exist
ls tests/artifacts/dual_mode/*/steps/step_*.json | wc -l
# Should be >= 42 (one per test step)

# 4. Golden snapshot checksums are current
am golden verify
# All checksums must match

# 4b. Native check-inbox command is available
am check-inbox --help
# Compatibility shim fallback (only for native regressions):
# PATH="/data/tmp/cargo-target/release:$PATH" legacy/hooks/check_inbox.sh --help

# 5. Verify-live E2E artifacts exist (native path authoritative)
am e2e run --project . share_verify_live
ls tests/artifacts/share_verify_live/*/case_*/command.txt
ls tests/artifacts/share_verify_live/*/case_*/check_trace.jsonl
# If command unavailable in current binary, suite emits deterministic SKIP with reason

# 6. Golden denial fixtures exist
ls tests/fixtures/golden_snapshots/mcp_deny_*.txt
# At least 5 files (share, guard, doctor, archive, migrate)

# 7. Machine-readable gate report exists and is release-eligible
am ci --report tests/artifacts/ci/gate_report.json
jq '.decision, .release_eligible, .thresholds, (.gates | length)' tests/artifacts/ci/gate_report.json
# decision must be "go", release_eligible must be true, thresholds must have zero failed_gates

Rollout Validation

Before promoting

Run the full CI suite (includes all dual-mode gates):

am ci --report tests/artifacts/ci/gate_report.json

Confirm gate report decision:

jq '.decision' tests/artifacts/ci/gate_report.json
# "go" required for promotion

Or run individual gates:

cargo test --workspace
cargo test -p mcp-agent-mail-conformance
cargo test -p mcp-agent-mail-cli --test mode_matrix_harness
cargo test -p mcp-agent-mail-cli --test semantic_conformance
cargo test -p mcp-agent-mail-cli --test perf_security_regressions
cargo test -p mcp-agent-mail-cli --test perf_guardrails
cargo test -p mcp-agent-mail-cli --test help_snapshots
am e2e run --project . dual_mode
am e2e run --project . mode_matrix
am e2e run --project . security_privacy
am e2e run --project . tui_a11y

Manual smoke test:

# MCP denial gate works:
mcp-agent-mail share 2>&1        # Should exit 2 with denial message

# CLI accepts all commands:
am share --help                  # Should exit 0
am doctor check --json           # Should exit 0 with JSON output

# Native deployment validation path:
am share deploy verify-live https://example.github.io/agent-mail --bundle /tmp/agent-mail-bundle --json > /tmp/verify-live.json
jq '.verdict, .summary' /tmp/verify-live.json

# Cloudflare Pages tooling + validation path:
am share deploy tooling /tmp/agent-mail-bundle
test -f /tmp/agent-mail-bundle/.github/workflows/deploy-cf-pages.yml
test -f /tmp/agent-mail-bundle/wrangler.toml.template
am share deploy verify-live https://example.pages.dev --bundle /tmp/agent-mail-bundle --json > /tmp/verify-live-cf.json
jq '.verdict, .summary' /tmp/verify-live-cf.json

# MCP server starts:
mcp-agent-mail serve --help      # Should exit 0

Start am and verify TUI:
- TUI renders correctly (all 11 screens load)
- Keybindings respond (Tab, 1-8, ?, q)
- Command palette opens (Ctrl+P)
- System Health shows green status

Test headless mode:

mcp-agent-mail serve --no-tui &
curl -s http://127.0.0.1:8765/mcp/ \
  -H "Authorization: Bearer $HTTP_BEARER_TOKEN" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
# Should return 34 tools

Post-Release Monitoring

Signals to Watch (First 24-48 Hours)

Signal	How to check	Expected range	Action if abnormal
MCP denial rate	`grep "not an MCP server command" <logs>`	0 from agents	Investigate agent config
CLI exit codes	Operator workflow logs	Exit 0 for all commands	Check migration guide
DB lock contention	`resource://tooling/locks`	No increase from baseline	Check pool sizing
Tool latency p95	`resource://tooling/metrics`	Within baseline SLOs	Profile hot path
Disk usage growth	`du -sh ~/.mcp_agent_mail/`	Stable growth rate	Check archive retention
Error rate	Application logs	No new error classes	Triage by error type

Error Classes to Monitor

Error class	Severity	Likely cause	Runbook action
Exit code 2 from agent sessions	High	Agent invoking CLI command on MCP binary	Fix agent config
"not an MCP server command" in MCP logs	Medium	Misrouted command	Check binary path in config
"database is locked" spike	Medium	Pool exhaustion under new load	Increase pool size
Panic/backtrace in denial stderr	Critical	Bug in denial gate	Activate kill-switch
CLI command succeeds on MCP binary	Critical	Denial gate bypass	Activate kill-switch immediately

Escalation Path

Non-critical: File a bead, fix in next release
Medium: Fix within 24 hours, deploy hotfix
High/Critical: Execute kill-switch procedure

Monitoring Dashboard Queries

# Count denial gate hits in last hour
grep -c "not an MCP server command" /var/log/mcp-agent-mail/*.log

# Check for any panics
grep -i "panic\|backtrace" /var/log/mcp-agent-mail/*.log

# Tool latency percentiles (via MCP resource)
curl -s http://127.0.0.1:8765/mcp/ \
  -d '{"jsonrpc":"2.0","id":1,"method":"resources/read","params":{"uri":"resource://tooling/metrics"}}' \
  | jq '.result.contents[0].text | fromjson | .tools[] | {name, call_count, p95_ms}'

# Active locks
curl -s http://127.0.0.1:8765/mcp/ \
  -d '{"jsonrpc":"2.0","id":1,"method":"resources/read","params":{"uri":"resource://tooling/locks"}}' \
  | jq '.result.contents[0].text | fromjson | .summary'

V2 Surface Ownership and Kill-Switch Mapping

Surface	Activation boundary	Kill-switch action	Primary owner	Secondary owner
MCP interface mode	`AM_INTERFACE_MODE` policy + binary separation	Clear CLI mode env and redeploy MCP binary	Runtime owner	Release owner
CLI workflows (`am`)	CLI binary rollout ring	Roll back `am` to last-known-good release	CLI owner	Runtime owner
TUI console	`TUI_ENABLED=true` and launch profile	Restart with `--no-tui`	TUI owner	Runtime owner
Static export pipeline	publish workflow gates	Disable publish jobs and hold exports	Docs/release owner	CLI owner
Build slots/worktrees	`WORKTREES_ENABLED=true` only after canary	Set `WORKTREES_ENABLED=false` and restart	Runtime owner	Storage owner
Local auth posture	bearer/JWT policy	Re-enable strict auth (`HTTP_ALLOW_LOCALHOST_UNAUTHENTICATED=0`)	Security owner	Runtime owner

Incident Rollback Timelines and Communication Paths

Milestone	Target timeline	Channel	Required payload
Incident acknowledged	<= 5 minutes	on-call chat	incident ID, suspected surface, owner
Kill-switch decision	<= 10 minutes	incident bridge	go/no-go with rationale
Rollback executed	<= 15 minutes	deployment channel	command/runbook step + env scope
Operator notice	<= 20 minutes	operator channel + thread	user impact + workaround
Evidence bundle posted	<= 60 minutes	bead thread + incident doc	logs, artifacts, reproduction

Release Sign-Off Ledger (Required)

Fill one row per phase promotion decision.

Sign-Off Workflow (Required For Every Promotion)

Run full gates and emit report: am ci --report tests/artifacts/ci/gate_report.json.
Confirm report fields: decision == "go" and release_eligible == true.
Attach at least one artifact link per gate family (correctness, security/privacy, accessibility, performance, determinism).
Record owner, UTC timestamp, and rationale in the ledger row for the phase transition.
If any threshold fails, mark decision no-go, document blocker bead IDs, and do not promote.

Non-Quick Candidate Cadence (br-3vwi.12.3.3)

Run a non-quick gate report at least once every 24 hours during active rollout and within 12 hours before any promotion decision.

run_ts="$(date -u +%Y%m%d_%H%M%S)"
am ci --report "tests/artifacts/ci/${run_ts}/case_02_report.json"
jq '.decision, .release_eligible, .summary' "tests/artifacts/ci/${run_ts}/case_02_report.json"

Latest non-quick artifact snapshot:

tests/artifacts/ci/20260213_031050/case_02_report.json
decision="no-go", release_eligible=false, summary={total:13, pass:4, fail:9, skip:0}

Owner rotation (weekly, Monday 00:00 UTC handoff):

Primary owner	Backup owner	Responsibility
Release owner	CI maintainer	Run non-quick report, update docs with latest artifact path + verdict, record ledger evidence links
CI maintainer	Agent integration lead	Verify report schema/completeness and flag stale artifact age (>24h)
Agent integration lead	On-call operator	Confirm rollout thread + bead updates reference latest artifact before promotion

Phase	Decision (`go`/`no-go`)	Owner	UTC timestamp	Rationale	Evidence links
Phase 0 -> Phase 1
Phase 1 -> Phase 2
Phase 2 (25% -> 50%)
Phase 2 (50% -> 100%)
Phase 3 (GA confirmation)

Post-Launch Telemetry Feedback Loop (br-3vwi.12.3)

Latest release-candidate gate artifact is non-quick and stored at tests/artifacts/ci/20260213_031050/case_02_report.json
Current reference non-quick artifact reviewed: tests/artifacts/ci/20260213_031050/case_02_report.json (mode=full, reviewed 2026-02-13T03:17Z)
Gate decision is go and release_eligible is true
Projected-vs-observed summary is updated in docs/ROLLOUT_PLAYBOOK.md Section 9
Follow-up bead set is reviewed and triaged:
- br-3vwi.12.3.1 (SearchScope compile blockers)
- br-3vwi.12.3.2 (clippy rate-limiter lint blockers)
- br-3vwi.12.3.3 (non-quick gate cadence + artifact publication)
Owner and timestamp recorded in the sign-off ledger row for this phase

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release Checklist

Staged Rollout Gate Matrix

Measurable Promotion Criteria (All phases)

Gate-to-Bead Evidence Map

Functional Readiness

Dual-Mode Interface (ADR-001)

TUI Screens

Tests

Dual-Mode Test Suites

Performance

Documentation

Artifact Sanity Checks

Rollout Validation

Before promoting

Post-Release Monitoring

Signals to Watch (First 24-48 Hours)

Error Classes to Monitor

Escalation Path

Monitoring Dashboard Queries

V2 Surface Ownership and Kill-Switch Mapping

Incident Rollback Timelines and Communication Paths

Release Sign-Off Ledger (Required)

Sign-Off Workflow (Required For Every Promotion)

Non-Quick Candidate Cadence (br-3vwi.12.3.3)

Post-Launch Telemetry Feedback Loop (br-3vwi.12.3)

Reference

FilesExpand file tree

RELEASE_CHECKLIST.md

Latest commit

History

RELEASE_CHECKLIST.md

File metadata and controls

Release Checklist

Staged Rollout Gate Matrix

Measurable Promotion Criteria (All phases)

Gate-to-Bead Evidence Map

Functional Readiness

Dual-Mode Interface (ADR-001)

TUI Screens

Tests

Dual-Mode Test Suites

Performance

Documentation

Artifact Sanity Checks

Rollout Validation

Before promoting

Post-Release Monitoring

Signals to Watch (First 24-48 Hours)

Error Classes to Monitor

Escalation Path

Monitoring Dashboard Queries

V2 Surface Ownership and Kill-Switch Mapping

Incident Rollback Timelines and Communication Paths

Release Sign-Off Ledger (Required)

Sign-Off Workflow (Required For Every Promotion)

Non-Quick Candidate Cadence (br-3vwi.12.3.3)

Post-Launch Telemetry Feedback Loop (br-3vwi.12.3)

Reference