Merged

31 commits
e4cdd5b
Refresh architecture and operator docs
WellDunDun Mar 15, 2026
eedaa58
Merge remote-tracking branch 'origin/dev' into WellDunDun/winnipeg-v6
WellDunDun Mar 15, 2026
a1168f6
Merge remote-tracking branch 'origin/dev' into WellDunDun/winnipeg-v6
WellDunDun Mar 15, 2026
55c6339
docs: align docs and skill workflows with autonomy-first operator path
WellDunDun Mar 15, 2026
657d51a
feat: autoresearch-inspired UX improvements for demo readiness
WellDunDun Mar 15, 2026
3bea7f7
feat: show selftune resource usage instead of session-level metrics i…
WellDunDun Mar 15, 2026
41a4b9d
refactor: consolidate CLI from 28 flat commands to 21 grouped commands
WellDunDun Mar 15, 2026
43f2d70
fix: address CodeRabbit PR review comments
WellDunDun Mar 15, 2026
35f67a4
fix: address remaining review comments and stale replay references
WellDunDun Mar 15, 2026
3085cba
fix: resolve CI lint failures
WellDunDun Mar 15, 2026
4165c15
fix: escape backslash in SQL LIKE pattern to satisfy CodeQL
WellDunDun Mar 15, 2026
aababd8
feat: add system status page to dashboard with doctor diagnostics
WellDunDun Mar 15, 2026
1f41dca
fix: add ESCAPE clause to LIKE query and fix stale replay label
WellDunDun Mar 15, 2026
4b186b4
fix: address CodeRabbit review comments on status page PR
WellDunDun Mar 15, 2026
1e9f785
docs: establish agent-first architecture principle across repo
WellDunDun Mar 15, 2026
ea62c6d
feat: demo-ready P0 fixes from architecture audit
WellDunDun Mar 15, 2026
5191bf1
fix: defensive checks fallback and clarify reserved counters
WellDunDun Mar 15, 2026
c969b6d
refactor: prioritize Claude Code, unify cron/schedule, remove dead code
WellDunDun Mar 15, 2026
a1a2a85
docs: document two operating modes and data architecture
WellDunDun Mar 15, 2026
79ac970
docs: rewrite all 22 workflow docs for agent-first consistency
WellDunDun Mar 15, 2026
56ec3c9
docs: add autonomous mode + connect agents to workflows
WellDunDun Mar 15, 2026
f08b067
refactor: remove duplicate findRecentlyEvolvedSkills function
WellDunDun Mar 15, 2026
f02ebc5
fix: address 21 CodeRabbit review comments
WellDunDun Mar 15, 2026
57dc28e
feat: real-time improvement signal detection and reactive orchestration
WellDunDun Mar 15, 2026
ca52515
docs: document signal-reactive improvement across architecture docs
WellDunDun Mar 15, 2026
6a54d04
fix: address 14 CodeRabbit review comments (round 4)
WellDunDun Mar 15, 2026
885af5d
docs: dependency map, README refresh, dashboard signal exec plan
WellDunDun Mar 15, 2026
21cc615
fix: migrate repo URLs from WellDunDun to selftune-dev org
WellDunDun Mar 15, 2026
08e4dd9
docs: add repo org/name migration to change propagation map
WellDunDun Mar 15, 2026
c10bcdb
fix: address 13 CodeRabbit review comments (round 5)
WellDunDun Mar 15, 2026
c9f39ff
fix: llms.txt branch-agnostic links, README experimental clarity
WellDunDun Mar 15, 2026
30 changes: 20 additions & 10 deletions .claude/agents/diagnosis-analyst.md
@@ -11,12 +11,22 @@ Investigate why a specific skill is underperforming. Analyze telemetry logs,
grading results, and session transcripts to identify root causes and recommend
targeted fixes.

-**Activate when the user says:**
-- "diagnose skill issues"
-- "why is skill X underperforming"
-- "what's wrong with this skill"
-- "skill failure analysis"
-- "debug skill performance"
+**Activation policy:** This is a subagent-only role, spawned by the main agent.
+If a user asks for diagnosis directly, the main agent should route to this subagent.

## Connection to Workflows

This agent is spawned by the main agent as a subagent when deeper analysis is
needed — it is not called directly by the user.

**Connected workflows:**
- **Doctor** — when `selftune doctor` reveals persistent issues with a specific skill, spawn this agent for root cause analysis
- **Grade** — when grades are consistently low for a skill, spawn this agent to investigate why
- **Status** — when `selftune status` shows CRITICAL or WARNING flags on a skill, spawn this agent for a deep dive

The main agent decides when to escalate to this subagent based on severity
and persistence of the issue. One-off failures are handled inline; recurring
or unexplained failures warrant spawning this agent.
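The inline-vs-escalate rule above can be sketched as a simple frequency check. This is an illustrative sketch only — the log schema, skill name, and failure threshold are assumptions, not real selftune internals:

```shell
# Hypothetical escalation check: spawn the diagnosis subagent only when a
# skill's failures recur, mirroring the policy described above.
log=$(mktemp)
cat > "$log" <<'EOF'
{"skill": "summarize", "outcome": "failure"}
{"skill": "summarize", "outcome": "failure"}
{"skill": "summarize", "outcome": "failure"}
{"skill": "lint", "outcome": "failure"}
EOF
threshold=3   # illustrative cutoff for "recurring"
failures=$(grep -c '"summarize".*"failure"' "$log")
if [ "$failures" -ge "$threshold" ]; then
  decision="spawn diagnosis-analyst"
else
  decision="handle inline"
fi
echo "$decision"   # → spawn diagnosis-analyst
rm -f "$log"
```

In practice the main agent would make this call from telemetry it already has in context; the point is that one-off failures never cross the threshold.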

## Context

@@ -48,7 +58,7 @@ any warnings or regression flags.
### Step 3: Pull telemetry stats

```bash
-selftune evals --skill <name> --stats
+selftune eval generate --skill <name> --stats
```

Review aggregate metrics:
@@ -59,7 +69,7 @@ Review aggregate metrics:
### Step 4: Analyze trigger coverage

```bash
-selftune evals --skill <name> --max 50
+selftune eval generate --skill <name> --max 50
```

Review the generated eval set. Count entries by invocation type:
@@ -106,8 +116,8 @@ Compile findings into a structured report.
|---------|---------|
| `selftune status` | Overall health snapshot |
| `selftune last` | Most recent session details |
-| `selftune evals --skill <name> --stats` | Aggregate telemetry |
-| `selftune evals --skill <name> --max 50` | Generate eval set for coverage analysis |
+| `selftune eval generate --skill <name> --stats` | Aggregate telemetry |
+| `selftune eval generate --skill <name> --max 50` | Generate eval set for coverage analysis |
| `selftune doctor` | Check infrastructure health |

## Output
15 changes: 14 additions & 1 deletion .claude/agents/evolution-reviewer.md
@@ -18,6 +18,19 @@ vs. new descriptions, and provides an approve/reject verdict with reasoning.
- "review pending changes"
- "should I deploy this evolution"

## Connection to Workflows

This agent is spawned by the main agent as a subagent to provide a safety
review before deploying an evolution.

**Connected workflows:**
- **Evolve** — in the review-before-deploy step, spawn this agent to evaluate the proposal for regressions, scope creep, and eval set quality
- **EvolveBody** — same role for full-body and routing-table evolutions

**Mode behavior:**
- **Interactive mode** — spawn this agent before deploying an evolution to get a human-readable safety review with an approve/reject verdict
- **Autonomous mode** — the orchestrator handles validation internally using regression thresholds and auto-rollback; this agent is for interactive safety reviews only
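The autonomous-mode behavior above — validate against a regression threshold and roll back on a drop — might look like the following. The scores, threshold, and variable names are illustrative assumptions, not selftune's actual orchestrator logic:

```shell
# Hypothetical autonomous-mode check: compare the pre-evolution baseline
# score against the candidate's score and auto-rollback past a tolerated drop.
baseline=0.82    # illustrative pre-evolution eval score
candidate=0.74   # illustrative post-evolution eval score
threshold=0.05   # illustrative maximum tolerated regression
regressed=$(awk -v b="$baseline" -v c="$candidate" -v t="$threshold" \
  'BEGIN { r = (b - c > t) ? "yes" : "no"; print r }')
if [ "$regressed" = "yes" ]; then
  action="auto-rollback"
else
  action="keep deployment"
fi
echo "$action"   # → auto-rollback
```

Interactive mode replaces this numeric gate with the human-readable approve/reject review this agent produces.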

## Context

You need access to:
@@ -114,7 +127,7 @@ Issue an approve or reject decision with full reasoning.
| Command | Purpose |
|---------|---------|
| `selftune evolve --skill <name> --skill-path <path> --dry-run` | Generate proposal without deploying |
-| `selftune evals --skill <name>` | Check eval set used for validation |
+| Read eval file from evolve output or audit log | Inspect the exact eval set used for validation |
| `selftune watch --skill <name> --skill-path <path>` | Check current performance baseline |
| `selftune status` | Overall skill health context |

24 changes: 18 additions & 6 deletions .claude/agents/integration-guide.md
@@ -19,6 +19,18 @@ verify the setup is working end-to-end.
- "get selftune working"
- "selftune setup guide"

## Connection to Workflows

This agent is the deep-dive version of the Initialize workflow, spawned by
the main agent as a subagent when the project structure is complex.

**Connected workflows:**
- **Initialize** — for complex project structures (monorepos, multi-skill repos, mixed agent platforms), spawn this agent instead of running the basic init workflow

**When to spawn:** when the project has multiple SKILL.md files, multiple
packages or workspaces, mixed agent platforms (Claude + Codex), or any
structure where the standard `selftune init` needs project-specific guidance.
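The "multiple SKILL.md files" condition above is mechanically checkable. A minimal sketch, with a made-up directory layout standing in for a real monorepo:

```shell
# Hypothetical complexity check: route to the integration-guide subagent
# when more than one SKILL.md exists in the repo.
root=$(mktemp -d)
mkdir -p "$root/packages/a" "$root/packages/b"
touch "$root/packages/a/SKILL.md" "$root/packages/b/SKILL.md"
skill_count=$(find "$root" -name SKILL.md | wc -l | tr -d ' ')
if [ "$skill_count" -gt 1 ]; then
  route="spawn integration-guide subagent"
else
  route="run basic Initialize workflow"
fi
echo "$route"   # → spawn integration-guide subagent
rm -rf "$root"
```

The other conditions (multiple workspaces, mixed agent platforms) would extend the same pattern with additional checks.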

## Context

You need access to:
@@ -90,8 +102,8 @@ Parse the output to confirm `~/.selftune/config.json` was created. Note the
detected `agent_type` and `cli_path`.

If the user is on a non-Claude agent platform:
-- **Codex** — inform about `wrap-codex` and `ingest-codex` options
-- **OpenCode** — inform about `ingest-opencode` option
+- **Codex** — inform about `ingest wrap-codex` and `ingest codex` options
+- **OpenCode** — inform about `ingest opencode` option

### Step 5: Install hooks

@@ -106,8 +118,8 @@ into `~/.claude/settings.json`. Three hooks are required:

Derive script paths from `cli_path` in `~/.selftune/config.json`.

-For **Codex**: use `selftune wrap-codex` or `selftune ingest-codex`.
-For **OpenCode**: use `selftune ingest-opencode`.
+For **Codex**: use `selftune ingest wrap-codex` or `selftune ingest codex`.
+For **OpenCode**: use `selftune ingest opencode`.

### Step 6: Verify with doctor

@@ -159,7 +171,7 @@ from any package directory.
Tell the user what to do next based on their goals:

- **"I want to see how my skills are doing"** — run `selftune status`
- **"I want to improve a skill"** — run `selftune evals --skill <name>` then `selftune evolve`
- **"I want to improve a skill"** — run `selftune eval generate --skill <name>` then `selftune evolve --skill <name>`
- **"I want to grade a session"** — run `selftune grade --skill <name>`

## Commands
@@ -170,7 +182,7 @@ Tell the user what to do next based on their goals:
| `selftune doctor` | Verify installation health |
| `selftune status` | Post-setup health check |
| `selftune last` | Verify telemetry capture |
-| `selftune evals --list-skills` | Confirm skills are being tracked |
+| `selftune eval generate --list-skills` | Confirm skills are being tracked |

## Output

23 changes: 18 additions & 5 deletions .claude/agents/pattern-analyst.md
@@ -19,6 +19,19 @@ opportunities, and identify systemic issues affecting multiple skills.
- "skill trigger conflicts"
- "optimize my skills"

## Connection to Workflows

This agent is spawned by the main agent as a subagent for deep cross-skill
analysis.

**Connected workflows:**
- **Composability** — when `selftune eval composability` identifies conflict candidates, spawn this agent for deeper investigation of trigger overlaps and resolution strategies
- **Evals** — when analyzing cross-skill patterns or systemwide undertriggering, spawn this agent to find optimization opportunities

**When to spawn:** when the user asks about conflicts between skills,
cross-skill optimization, or when composability scores indicate moderate-to-severe
conflicts (score > 0.3).
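One way to picture the conflict score above is as set overlap between two skills' trigger phrases. The 0.3 cutoff comes from the text; the phrase lists and the Jaccard formulation are illustrative assumptions, not how selftune actually computes composability:

```shell
# Hypothetical overlap score: Jaccard similarity over trigger phrases.
# comm(1) requires lexically sorted input files.
a=$(mktemp); b=$(mktemp)
printf '%s\n' "diagnose skill issues" "debug skill performance" | sort > "$a"
printf '%s\n' "debug skill performance" "optimize my skills" | sort > "$b"
inter=$(comm -12 "$a" "$b" | wc -l)       # phrases shared by both skills
union=$(sort -u "$a" "$b" | wc -l)        # all distinct phrases
score=$(awk -v i="$inter" -v u="$union" 'BEGIN { printf "%.2f", i / u }')
echo "overlap score: $score"   # 1 shared / 3 distinct → 0.33, past the 0.3 cutoff
rm -f "$a" "$b"
```

A score like 0.33 would land in the moderate-conflict band that warrants spawning this agent.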

## Context

You need access to:
Expand All @@ -33,7 +46,7 @@ You need access to:
### Step 1: Inventory all skills

```bash
-selftune evals --list-skills
+selftune eval generate --list-skills
```

Parse the JSON output to get a complete list of skills with their query
@@ -77,7 +90,7 @@ Read `skill_usage_log.jsonl` and group by query text. Look for:
For each skill, pull stats:

```bash
-selftune evals --skill <name> --stats
+selftune eval generate --skill <name> --stats
```

Compare across skills:
@@ -100,10 +113,10 @@ Compile a cross-skill analysis report.

| Command | Purpose |
|---------|---------|
-| `selftune evals --list-skills` | Inventory all skills with query counts |
+| `selftune eval generate --list-skills` | Inventory all skills with query counts |
| `selftune status` | Health snapshot across all skills |
-| `selftune evals --skill <name> --stats` | Per-skill aggregate telemetry |
-| `selftune evals --skill <name> --max 50` | Generate eval set per skill |
+| `selftune eval generate --skill <name> --stats` | Per-skill aggregate telemetry |
+| `selftune eval generate --skill <name> --max 50` | Generate eval set per skill |

## Output
