Merged

31 commits
e4cdd5b
Refresh architecture and operator docs
WellDunDun Mar 15, 2026
eedaa58
Merge remote-tracking branch 'origin/dev' into WellDunDun/winnipeg-v6
WellDunDun Mar 15, 2026
a1168f6
Merge remote-tracking branch 'origin/dev' into WellDunDun/winnipeg-v6
WellDunDun Mar 15, 2026
55c6339
docs: align docs and skill workflows with autonomy-first operator path
WellDunDun Mar 15, 2026
657d51a
feat: autoresearch-inspired UX improvements for demo readiness
WellDunDun Mar 15, 2026
3bea7f7
feat: show selftune resource usage instead of session-level metrics i…
WellDunDun Mar 15, 2026
41a4b9d
refactor: consolidate CLI from 28 flat commands to 21 grouped commands
WellDunDun Mar 15, 2026
43f2d70
fix: address CodeRabbit PR review comments
WellDunDun Mar 15, 2026
35f67a4
fix: address remaining review comments and stale replay references
WellDunDun Mar 15, 2026
3085cba
fix: resolve CI lint failures
WellDunDun Mar 15, 2026
4165c15
fix: escape backslash in SQL LIKE pattern to satisfy CodeQL
WellDunDun Mar 15, 2026
aababd8
feat: add system status page to dashboard with doctor diagnostics
WellDunDun Mar 15, 2026
1f41dca
fix: add ESCAPE clause to LIKE query and fix stale replay label
WellDunDun Mar 15, 2026
4b186b4
fix: address CodeRabbit review comments on status page PR
WellDunDun Mar 15, 2026
1e9f785
docs: establish agent-first architecture principle across repo
WellDunDun Mar 15, 2026
ea62c6d
feat: demo-ready P0 fixes from architecture audit
WellDunDun Mar 15, 2026
5191bf1
fix: defensive checks fallback and clarify reserved counters
WellDunDun Mar 15, 2026
c969b6d
refactor: prioritize Claude Code, unify cron/schedule, remove dead code
WellDunDun Mar 15, 2026
a1a2a85
docs: document two operating modes and data architecture
WellDunDun Mar 15, 2026
79ac970
docs: rewrite all 22 workflow docs for agent-first consistency
WellDunDun Mar 15, 2026
56ec3c9
docs: add autonomous mode + connect agents to workflows
WellDunDun Mar 15, 2026
f08b067
refactor: remove duplicate findRecentlyEvolvedSkills function
WellDunDun Mar 15, 2026
f02ebc5
fix: address 21 CodeRabbit review comments
WellDunDun Mar 15, 2026
57dc28e
feat: real-time improvement signal detection and reactive orchestration
WellDunDun Mar 15, 2026
ca52515
docs: document signal-reactive improvement across architecture docs
WellDunDun Mar 15, 2026
6a54d04
fix: address 14 CodeRabbit review comments (round 4)
WellDunDun Mar 15, 2026
885af5d
docs: dependency map, README refresh, dashboard signal exec plan
WellDunDun Mar 15, 2026
21cc615
fix: migrate repo URLs from WellDunDun to selftune-dev org
WellDunDun Mar 15, 2026
08e4dd9
docs: add repo org/name migration to change propagation map
WellDunDun Mar 15, 2026
c10bcdb
fix: address 13 CodeRabbit review comments (round 5)
WellDunDun Mar 15, 2026
c9f39ff
fix: llms.txt branch-agnostic links, README experimental clarity
WellDunDun Mar 15, 2026
30 changes: 20 additions & 10 deletions .claude/agents/diagnosis-analyst.md
@@ -11,12 +11,22 @@ Investigate why a specific skill is underperforming. Analyze telemetry logs,
grading results, and session transcripts to identify root causes and recommend
targeted fixes.

-**Activate when the user says:**
-- "diagnose skill issues"
-- "why is skill X underperforming"
-- "what's wrong with this skill"
-- "skill failure analysis"
-- "debug skill performance"
+**Activation policy:** This is a subagent-only role, spawned by the main agent.
+If a user asks for diagnosis directly, the main agent should route to this subagent.

## Connection to Workflows

This agent is spawned by the main agent as a subagent when deeper analysis is
needed — it is not called directly by the user.

**Connected workflows:**
- **Doctor** — when `selftune doctor` reveals persistent issues with a specific skill, spawn this agent for root cause analysis
- **Grade** — when grades are consistently low for a skill, spawn this agent to investigate why
- **Status** — when `selftune status` shows CRITICAL or WARNING flags on a skill, spawn this agent for a deep dive

The main agent decides when to escalate to this subagent based on severity
and persistence of the issue. One-off failures are handled inline; recurring
or unexplained failures warrant spawning this agent.
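The inline-vs-escalate rule above can be sketched as a simple frequency check. This is an illustrative sketch only — the log schema, skill name, and failure threshold are assumptions, not real selftune internals:

```shell
# Hypothetical escalation check: spawn the diagnosis subagent only when a
# skill's failures recur, mirroring the policy described above.
log=$(mktemp)
cat > "$log" <<'EOF'
{"skill": "summarize", "outcome": "failure"}
{"skill": "summarize", "outcome": "failure"}
{"skill": "summarize", "outcome": "failure"}
{"skill": "lint", "outcome": "failure"}
EOF
threshold=3   # illustrative cutoff for "recurring"
failures=$(grep -c '"summarize".*"failure"' "$log")
if [ "$failures" -ge "$threshold" ]; then
  decision="spawn diagnosis-analyst"
else
  decision="handle inline"
fi
echo "$decision"   # → spawn diagnosis-analyst
rm -f "$log"
```

In practice the main agent would make this call from telemetry it already has in context; the point is that one-off failures never cross the threshold.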

## Context

@@ -48,7 +58,7 @@ any warnings or regression flags.
### Step 3: Pull telemetry stats

```bash
-selftune evals --skill <name> --stats
+selftune eval generate --skill <name> --stats
```

Review aggregate metrics:
@@ -59,7 +69,7 @@ Review aggregate metrics:
### Step 4: Analyze trigger coverage

```bash
-selftune evals --skill <name> --max 50
+selftune eval generate --skill <name> --max 50
```

Review the generated eval set. Count entries by invocation type:
@@ -106,8 +116,8 @@ Compile findings into a structured report.
|---------|---------|
| `selftune status` | Overall health snapshot |
| `selftune last` | Most recent session details |
-| `selftune evals --skill <name> --stats` | Aggregate telemetry |
-| `selftune evals --skill <name> --max 50` | Generate eval set for coverage analysis |
+| `selftune eval generate --skill <name> --stats` | Aggregate telemetry |
+| `selftune eval generate --skill <name> --max 50` | Generate eval set for coverage analysis |
| `selftune doctor` | Check infrastructure health |

## Output
15 changes: 14 additions & 1 deletion .claude/agents/evolution-reviewer.md
@@ -18,6 +18,19 @@ vs. new descriptions, and provides an approve/reject verdict with reasoning.
- "review pending changes"
- "should I deploy this evolution"

## Connection to Workflows

This agent is spawned by the main agent as a subagent to provide a safety
review before deploying an evolution.

**Connected workflows:**
- **Evolve** — in the review-before-deploy step, spawn this agent to evaluate the proposal for regressions, scope creep, and eval set quality
- **EvolveBody** — same role for full-body and routing-table evolutions

**Mode behavior:**
- **Interactive mode** — spawn this agent before deploying an evolution to get a human-readable safety review with an approve/reject verdict
- **Autonomous mode** — the orchestrator handles validation internally using regression thresholds and auto-rollback; this agent is for interactive safety reviews only
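The autonomous-mode behavior above — validate against a regression threshold and roll back on a drop — might look like the following. The scores, threshold, and variable names are illustrative assumptions, not selftune's actual orchestrator logic:

```shell
# Hypothetical autonomous-mode check: compare the pre-evolution baseline
# score against the candidate's score and auto-rollback past a tolerated drop.
baseline=0.82    # illustrative pre-evolution eval score
candidate=0.74   # illustrative post-evolution eval score
threshold=0.05   # illustrative maximum tolerated regression
regressed=$(awk -v b="$baseline" -v c="$candidate" -v t="$threshold" \
  'BEGIN { r = (b - c > t) ? "yes" : "no"; print r }')
if [ "$regressed" = "yes" ]; then
  action="auto-rollback"
else
  action="keep deployment"
fi
echo "$action"   # → auto-rollback
```

Interactive mode replaces this numeric gate with the human-readable approve/reject review this agent produces.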

## Context

You need access to:
@@ -114,7 +127,7 @@ Issue an approve or reject decision with full reasoning.
| Command | Purpose |
|---------|---------|
| `selftune evolve --skill <name> --skill-path <path> --dry-run` | Generate proposal without deploying |
-| `selftune evals --skill <name>` | Check eval set used for validation |
+| Read eval file from evolve output or audit log | Inspect the exact eval set used for validation |
| `selftune watch --skill <name> --skill-path <path>` | Check current performance baseline |
| `selftune status` | Overall skill health context |

24 changes: 18 additions & 6 deletions .claude/agents/integration-guide.md
@@ -19,6 +19,18 @@ verify the setup is working end-to-end.
- "get selftune working"
- "selftune setup guide"

## Connection to Workflows

This agent is the deep-dive version of the Initialize workflow, spawned by
the main agent as a subagent when the project structure is complex.

**Connected workflows:**
- **Initialize** — for complex project structures (monorepos, multi-skill repos, mixed agent platforms), spawn this agent instead of running the basic init workflow

**When to spawn:** when the project has multiple SKILL.md files, multiple
packages or workspaces, mixed agent platforms (Claude + Codex), or any
structure where the standard `selftune init` needs project-specific guidance.
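The "multiple SKILL.md files" condition above is mechanically checkable. A minimal sketch, with a made-up directory layout standing in for a real monorepo:

```shell
# Hypothetical complexity check: route to the integration-guide subagent
# when more than one SKILL.md exists in the repo.
root=$(mktemp -d)
mkdir -p "$root/packages/a" "$root/packages/b"
touch "$root/packages/a/SKILL.md" "$root/packages/b/SKILL.md"
skill_count=$(find "$root" -name SKILL.md | wc -l | tr -d ' ')
if [ "$skill_count" -gt 1 ]; then
  route="spawn integration-guide subagent"
else
  route="run basic Initialize workflow"
fi
echo "$route"   # → spawn integration-guide subagent
rm -rf "$root"
```

The other conditions (multiple workspaces, mixed agent platforms) would extend the same pattern with additional checks.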

## Context

You need access to:
@@ -90,8 +102,8 @@ Parse the output to confirm `~/.selftune/config.json` was created. Note the
detected `agent_type` and `cli_path`.

If the user is on a non-Claude agent platform:
-- **Codex** — inform about `wrap-codex` and `ingest-codex` options
-- **OpenCode** — inform about `ingest-opencode` option
+- **Codex** — inform about `ingest wrap-codex` and `ingest codex` options
+- **OpenCode** — inform about `ingest opencode` option

### Step 5: Install hooks

@@ -106,8 +118,8 @@ into `~/.claude/settings.json`. Three hooks are required:

Derive script paths from `cli_path` in `~/.selftune/config.json`.

-For **Codex**: use `selftune wrap-codex` or `selftune ingest-codex`.
-For **OpenCode**: use `selftune ingest-opencode`.
+For **Codex**: use `selftune ingest wrap-codex` or `selftune ingest codex`.
+For **OpenCode**: use `selftune ingest opencode`.

### Step 6: Verify with doctor

@@ -159,7 +171,7 @@ from any package directory.
Tell the user what to do next based on their goals:

- **"I want to see how my skills are doing"** — run `selftune status`
- **"I want to improve a skill"** — run `selftune evals --skill <name>` then `selftune evolve`
- **"I want to improve a skill"** — run `selftune eval generate --skill <name>` then `selftune evolve --skill <name>`
- **"I want to grade a session"** — run `selftune grade --skill <name>`

## Commands
@@ -170,7 +182,7 @@ Tell the user what to do next based on their goals:
| `selftune doctor` | Verify installation health |
| `selftune status` | Post-setup health check |
| `selftune last` | Verify telemetry capture |
-| `selftune evals --list-skills` | Confirm skills are being tracked |
+| `selftune eval generate --list-skills` | Confirm skills are being tracked |

## Output

23 changes: 18 additions & 5 deletions .claude/agents/pattern-analyst.md
@@ -19,6 +19,19 @@ opportunities, and identify systemic issues affecting multiple skills.
- "skill trigger conflicts"
- "optimize my skills"

## Connection to Workflows

This agent is spawned by the main agent as a subagent for deep cross-skill
analysis.

**Connected workflows:**
- **Composability** — when `selftune eval composability` identifies conflict candidates, spawn this agent for deeper investigation of trigger overlaps and resolution strategies
- **Evals** — when analyzing cross-skill patterns or systemwide undertriggering, spawn this agent to find optimization opportunities

**When to spawn:** when the user asks about conflicts between skills,
cross-skill optimization, or when composability scores indicate moderate-to-severe
conflicts (score > 0.3).
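One way to picture the conflict score above is as set overlap between two skills' trigger phrases. The 0.3 cutoff comes from the text; the phrase lists and the Jaccard formulation are illustrative assumptions, not how selftune actually computes composability:

```shell
# Hypothetical overlap score: Jaccard similarity over trigger phrases.
# comm(1) requires lexically sorted input files.
a=$(mktemp); b=$(mktemp)
printf '%s\n' "diagnose skill issues" "debug skill performance" | sort > "$a"
printf '%s\n' "debug skill performance" "optimize my skills" | sort > "$b"
inter=$(comm -12 "$a" "$b" | wc -l)       # phrases shared by both skills
union=$(sort -u "$a" "$b" | wc -l)        # all distinct phrases
score=$(awk -v i="$inter" -v u="$union" 'BEGIN { printf "%.2f", i / u }')
echo "overlap score: $score"   # 1 shared / 3 distinct → 0.33, past the 0.3 cutoff
rm -f "$a" "$b"
```

A score like 0.33 would land in the moderate-conflict band that warrants spawning this agent.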

## Context

You need access to:
Expand All @@ -33,7 +46,7 @@ You need access to:
### Step 1: Inventory all skills

```bash
-selftune evals --list-skills
+selftune eval generate --list-skills
```

Parse the JSON output to get a complete list of skills with their query
@@ -77,7 +90,7 @@ Read `skill_usage_log.jsonl` and group by query text. Look for:
For each skill, pull stats:

```bash
-selftune evals --skill <name> --stats
+selftune eval generate --skill <name> --stats
```

Compare across skills:
@@ -100,10 +113,10 @@ Compile a cross-skill analysis report.

| Command | Purpose |
|---------|---------|
-| `selftune evals --list-skills` | Inventory all skills with query counts |
+| `selftune eval generate --list-skills` | Inventory all skills with query counts |
| `selftune status` | Health snapshot across all skills |
-| `selftune evals --skill <name> --stats` | Per-skill aggregate telemetry |
-| `selftune evals --skill <name> --max 50` | Generate eval set per skill |
+| `selftune eval generate --skill <name> --stats` | Per-skill aggregate telemetry |
+| `selftune eval generate --skill <name> --max 50` | Generate eval set per skill |

## Output
