29 changes: 22 additions & 7 deletions CONTRIBUTING.md
@@ -74,14 +74,27 @@ Edit the relevant `SKILL.md` or data file. Test by running the skill locally wit

## Testing

There is no automated test harness for skills — they are instruction sets interpreted by Claude Code, not code with unit tests. The validation steps are:
The repo has a multi-tier automated test suite documented in [`tests/README.md`](tests/README.md). Run it before opening a PR:

1. **Load the plugin**: `claude --plugin-dir .` — confirm no startup errors.
2. **Run the skill manually**: invoke `/discover-workflows` or `/install-workflow` and walk through the flow.
3. **Validate lock files** (if you changed `.lock.yml` files): `gh aw validate` — safe, does not recompile.
4. **Check grep counts** (if you applied the OAuth tweak): see [skills/install-workflow/auth.md](skills/install-workflow/auth.md#step-4--verify-the-tweak-shape).
```bash
./tests/run-tests.sh # tier-1 + tier-2 (default, ~4-5 min)
./tests/run-tests.sh --verbose # show per-assertion output
```

| Tier | What it covers | When to run |
|---|---|---|
| 2 (fast) | Grep/filesystem invariants — no Claude invocation, <1s | Always |
| 1 (~4-5 min) | Skill sanity: loads, mentions required commands, documents both auth paths, hard rules intact | Every PR |
| 3 (slow, opt-in) | Full end-to-end pipeline on the playground repo; skill E2E (destructive — provisions a throwaway repo) | Before releases, manually |

CI runs tier-2 + tier-1 on every PR and push to `main`. Tier-3 tests are not run in CI; run them manually with `./tests/test-e2e.sh` (see `tests/README.md` for options).

Additional checks:

- **Validate lock files** (if you changed `.lock.yml` files): `gh aw validate` — safe, does not recompile.
- **Check grep counts** (if you applied the OAuth tweak): see [skills/install-workflow/auth.md](skills/install-workflow/auth.md#step-4--verify-the-tweak-shape).
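
As a rough illustration of a grep-count invariant like the ones described above, a check could look like the sketch below. `assert_grep_count` is a hypothetical helper, not a function from this repo's test suite; the real assertions live under `tests/` and in `auth.md` and may be shaped differently.

```bash
# Hypothetical helper, not part of the repo's test suite.
# Succeeds only if PATTERN matches exactly EXPECTED lines in FILE.
assert_grep_count() {
  local pattern expected file actual
  pattern=$1
  expected=$2
  file=$3
  # grep -c prints the number of matching lines. It exits non-zero on zero
  # matches or a missing file, so "|| true" keeps this usable under "set -e".
  actual=$(grep -c -- "$pattern" "$file" || true)
  if [ "$actual" != "$expected" ]; then
    echo "FAIL: $file: expected $expected lines matching '$pattern', found '$actual'" >&2
    return 1
  fi
  echo "ok: $file ($actual lines matching '$pattern')"
}
```

A call takes the form `assert_grep_count '<pattern>' <expected-count> <file>`. A missing file yields an empty count and therefore fails, which is usually the desired behavior for an invariant check.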

Never test by committing untested changes to `main`. The installed workflows run on push to `main`, so a broken install skill or a bad `.lock.yml` will trigger a live workflow run.
Never push untested changes directly to `main`. The installed workflows run on push to `main`, so a broken skill or a bad `.lock.yml` triggers a live workflow run.

## Workflow files

@@ -112,4 +125,6 @@ Branch naming conventions:

## Publishing (maintainers only)

See the [Publishing section of the README](README.md#publishing) for the steps to submit the plugin to the Claude plugin registry.
1. Bump the version in `.claude-plugin/plugin.json` and `.claude-plugin/marketplace.json` to match (semver).
2. Create a GitHub release tagged `v<version>` with a changelog entry.
3. Notify the registries — the plugin is listed on [claude-plugins.dev](https://claude-plugins.dev) and [ClaudePluginHub](https://claudepluginhub.com); submit the updated `marketplace.json` URL to each per their submission process.
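
Step 1 requires the two version fields to match, which is easy to get wrong by hand. A minimal sketch of a pre-release check follows. Both function names are hypothetical, and the sketch assumes each file writes the field as `"version": "x.y.z"` on a single line and that the first such field is the one that matters; the actual layout of `marketplace.json` may differ.

```bash
# Hypothetical helper: prints the first "version" value found in a JSON file.
# Assumes the field appears as "version": "x.y.z" on a single line.
json_version() {
  sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Succeeds only if both files carry the same, non-empty version string.
check_versions_match() {
  local a b
  a=$(json_version "$1")
  b=$(json_version "$2")
  if [ -z "$a" ] || [ "$a" != "$b" ]; then
    echo "version mismatch: $1 has '$a', $2 has '$b'" >&2
    return 1
  fi
  echo "versions match: $a"
}
```

It would be run as `check_versions_match .claude-plugin/plugin.json .claude-plugin/marketplace.json` before tagging the release. A real JSON parser is more robust than `sed` if one is available.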
28 changes: 27 additions & 1 deletion catalog/agent-team/README.md
@@ -104,7 +104,33 @@ Then apply the OAuth token tweak to each `.lock.yml` per [`skills/install-workfl
1. Open an issue describing what you want built.
2. Add the single label `agent-team`.
3. Watch the thread. Each role posts its contribution as a comment; the implementer opens a draft PR that closes the issue when merged.
4. Human override at any time: add `state:blocked` to halt, edit a comment to steer the next agent, or manually `gh workflow run` a specific role to retry a stuck stage. Manual dispatches must pass the required `workflow_dispatch` inputs, and the downstream workflow markdown must read them via `${{ github.event.inputs.* }}`.
4. Human override at any time: add `state:blocked` to halt, edit a comment to steer the next agent, or manually dispatch a specific role to retry a stuck stage. Agents fail loudly if required inputs are missing — always pass every required input explicitly:

```bash
# Re-run the planner (e.g. spec was updated, redo planning from iteration 2):
gh workflow run planner-agent.lock.yml \
-f issue_number=42 \
-f iteration=2

# Re-run the implementer with no existing PR (start fresh):
gh workflow run implementer-agent.lock.yml \
-f issue_number=42 \
-f iteration=2

# Re-run the implementer pushing to an existing PR:
gh workflow run implementer-agent.lock.yml \
-f issue_number=42 \
-f iteration=2 \
-f pr_number=7

# Re-run the reviewer:
gh workflow run reviewer-agent.lock.yml \
-f pr_number=7 \
-f issue_number=42 \
-f iteration=2
```

If an agent posts `🛑 agent-team: workflow_dispatch inputs were not propagated`, a dispatch reached it with missing inputs. Re-dispatch manually using the commands above with explicit values.
5. **Retrying a blocked task**: clear `state:blocked`, then re-add `agent-team`. Spec-agent treats it as a fresh dispatch because the `state:*` labels are gone. The spec markers are already satisfied, so to redo from scratch, also delete the prior spec comment.
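
Because a manual dispatch with a missing input only fails once the agent is already running, it can help to check the inputs locally first. The sketch below is one possible guard. `dispatch_cmd` is a hypothetical helper; the input names (`issue_number`, `iteration`, `pr_number`) and the `<role>-agent.lock.yml` file names are taken from the manual dispatch commands in step 4, and the rule that only the reviewer strictly needs `pr_number` is inferred from those examples rather than confirmed by the workflow definitions. It prints the command instead of running it.

```bash
# Hypothetical wrapper: validates inputs, then prints the gh command for a role.
# Which inputs each role requires is an assumption inferred from the examples.
dispatch_cmd() {
  local role issue iteration pr
  role=${1:-}
  issue=${2:-}
  iteration=${3:-}
  pr=${4:-}
  if [ -z "$role" ] || [ -z "$issue" ] || [ -z "$iteration" ]; then
    echo "usage: dispatch_cmd <role> <issue_number> <iteration> [pr_number]" >&2
    return 2
  fi
  if [ "$role" = "reviewer" ] && [ -z "$pr" ]; then
    echo "reviewer requires pr_number" >&2
    return 2
  fi
  # Build the argument list, appending pr_number only when one was given.
  set -- gh workflow run "${role}-agent.lock.yml" \
    -f "issue_number=$issue" -f "iteration=$iteration"
  if [ -n "$pr" ]; then
    set -- "$@" -f "pr_number=$pr"
  fi
  echo "$*"
}
```

For example, `dispatch_cmd implementer 42 2 7` prints `gh workflow run implementer-agent.lock.yml -f issue_number=42 -f iteration=2 -f pr_number=7`, which can be reviewed and then pasted into a shell.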

## Limits and gotchas