add PLAN_ROLLOUT proposal — PR-stack-aware planning#1192
Closed
mastermanas805 wants to merge 2 commits into
Proposes two new skills + a declarative schema to address the gap between plan approval and shipping:

- /plan-rollout: decomposes an approved plan into a reviewable PR stack and a rollout plan. Outputs decomposition.md + rollout.md consumed by /ship, /review, /spill-check, /land-and-deploy.
- /spill-check: detects scope creep mid-implementation by comparing the current diff against the declared PR unit.
- SYSTEM.md: repo-root declarative semantic contract graph (components, roles, role-level contracts with rollout-edge semantics). Reconciled against the LLM-discovered import graph at runtime.

Includes a CEO plan (full spec), SKILL.md drafts, schema documentation, usage guide, integration notes for /ship and /review, and a TypeScript parser stub.

The design was stress-tested end-to-end by simulating the workflow against honojs/hono issue #4633. 8 concrete design gaps surfaced by the dogfood are folded into v1 scope; documented in the CEO plan.

Filing as a proposal doc in docs/designs/ to get directional feedback before opening the 4-PR implementation stack; see the attached issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author
@garrytan Requesting your suggestion.
Owner
Thanks @mastermanas805, closing as deferred. PLAN_ROLLOUT is a 2164-line proposal warranting standalone focused review. Happy to revisit if scoped down to a focused slice.
mastermanas805 added a commit to mastermanas805/gstack that referenced this pull request (May 10, 2026)
… docs

First of the four PRs proposed in garrytan#1192. Pure foundation: library code and documentation, no skills added, no existing behavior changed.

What lands:

- lib/plan-rollout/types.ts: Shared types for the plan-rollout family: Component, Contract, SystemMap, ImportEdge, ReconcileFlag, SystemMapParseError. Stable public surface so the follow-up /plan-rollout and /spill-check skills can depend on it.
- lib/plan-rollout/system-map-parser.ts: Parses a SYSTEM.md file (YAML frontmatter + markdown body), validates the schema, normalizes field names (rollout-order → rolloutOrder, breaks-if → breaksIf, rollout-edge → rolloutEdge). Throws SystemMapParseError with a specific .reason field on every validation failure, suitable for surfacing in user prompts. Exports componentForFile (longest-prefix match) and rolloutOrder (group components into ship tiers).
- lib/plan-rollout/system-map-reconcile.ts: Takes a parsed SystemMap and a discovered ImportEdge[]; returns ReconcileFlag[] covering three categories: import-without-contract, contract-without-imports, rollout-order-inversion. Pure function, no I/O. Leaf-util and types-only components are excluded. note: runtime-only and note: legacy suppress the contract-without-imports flag.
- lib/plan-rollout/system-map-scaffolder.ts: Walks a repo's src/ subdirectories (or top-level dirs if no src/), proposes one component per source-containing directory, classifies utils/lib/helpers as leaf-util and types/typings as types-only, infers a starting role from package.json description or README first paragraph. Writes SYSTEM.md.draft; never overwrites SYSTEM.md itself (refuses or drops to .draft via force).
- docs/SYSTEM-MD.md: Full schema documentation with worked example, field reference, reconciliation rules, and relationship to CLAUDE.md / CODEOWNERS.
- test/plan-rollout/ (46 tests, all passing)
  - parser.test.ts: valid fixtures, invalid fixtures (missing role, duplicate names, bad rollout-edge, unknown with, self-referential), error paths (missing frontmatter, malformed YAML, wrong version, empty components, non-integer rollout-order, invalid kind), CRLF handling, narrative preservation, componentForFile longest-prefix and prefix-not-a-subdir cases, rolloutOrder bucketing.
  - reconcile.test.ts: all three flag categories, leaf-util exclusion, runtime-only suppression, bidirectional contract check, rollout-order inversion including equal-order no-flag, irrelevant-edge ignore.
  - scaffolder.test.ts: draft generation from a fixture repo, role inference from README, TODO fallback, SYSTEM.md existence refusal, force behavior, custom outputPath, ignore rules for node_modules/dist/dotfiles, fallback when no src/, draft-cannot-be-parsed-directly property.
- Adds yaml@^2.8.3 as a runtime dependency (first use in gstack; needed for SYSTEM.md frontmatter parsing).

Zero changes to existing code. All existing tests still pass; the pre-existing skill-validation.test.ts failure on main is unrelated to this change (it's about compiled binaries in git).

Refs: garrytan#1192
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
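For readers who want a feel for the two helpers the parser exports, here is a minimal sketch of longest-prefix matching and tier bucketing. It is not the code from the commit: the trimmed-down `Component` shape, the `path` field name, and both function signatures are assumptions made for illustration, and only the behaviors named above (longest prefix wins, a prefix that is not a subdirectory does not match, components are grouped by rollout order) are taken from the commit message.

```typescript
// Hypothetical, trimmed-down shape; the real type lives in lib/plan-rollout/types.ts.
interface Component {
  name: string;
  path: string;         // directory prefix, e.g. "src/auth" (field name assumed)
  rolloutOrder: number; // normalized from `rollout-order`
}

// Longest-prefix match: the most specific component directory wins.
// A prefix only matches on a path-segment boundary, so "src/auth"
// does not claim "src/authz/token.ts".
function componentForFile(components: Component[], file: string): Component | undefined {
  let best: Component | undefined;
  let bestLen = -1;
  for (const c of components) {
    const prefix = c.path.replace(/\/+$/, ""); // tolerate a trailing slash
    const matches = file === prefix || file.startsWith(prefix + "/");
    if (matches && prefix.length > bestLen) {
      best = c;
      bestLen = prefix.length;
    }
  }
  return best;
}

// Group components into ship tiers: one tier per distinct rolloutOrder,
// returned in ascending order. Assumes lower numbers ship first.
function rolloutOrder(components: Component[]): Component[][] {
  const tiers = new Map<number, Component[]>();
  for (const c of components) {
    const tier = tiers.get(c.rolloutOrder) ?? [];
    tier.push(c);
    tiers.set(c.rolloutOrder, tier);
  }
  return Array.from(tiers.entries())
    .sort((a, b) => a[0] - b[0])
    .map((entry) => entry[1]);
}
```

Matching on a segment boundary rather than a raw string prefix is what makes the "prefix-not-a-subdir" test case meaningful: sibling directories that share a leading substring stay separate.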
mastermanas805 added a commit to mastermanas805/gstack that referenced this pull request (May 10, 2026)
Schema spec for the SYSTEM.md primitive — human-declared role contracts between components, distinct from the package/import graph (which is discovered at runtime). This PR is doc-only: 214 lines, no code, no tests, no dependencies. Library code (parser, reconciler, scaffolder) and consuming skills follow in subsequent PRs gated on the primitive landing first. Refs: garrytan#1192 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
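To make the distinction between declared contracts and the discovered import graph concrete, here is a rough sketch of how the two could be reconciled into the three flag categories named in the earlier commit. The type shapes, the component-level `ImportEdge`, and the direction of the inversion check (an importer with a lower rollout order than the component it imports) are assumptions for illustration, not the schema or reconciler from the PR.

```typescript
type Kind = "component" | "leaf-util" | "types-only";

// Hypothetical shapes; field names beyond those in the commit message are assumed.
interface Contract { with: string; note?: string }
interface Component { name: string; kind: Kind; rolloutOrder: number; contracts: Contract[] }
interface ImportEdge { from: string; to: string } // `from` imports `to`
interface ReconcileFlag {
  category: "import-without-contract" | "contract-without-imports" | "rollout-order-inversion";
  from: string;
  to: string;
}

// Pure function, no I/O: compare declared contracts with discovered imports.
function reconcile(components: Component[], edges: ImportEdge[]): ReconcileFlag[] {
  const byName = new Map<string, Component>();
  for (const c of components) byName.set(c.name, c);

  // leaf-util and types-only components are excluded from every check.
  const tracked = (name: string): Component | undefined => {
    const c = byName.get(name);
    return c !== undefined && c.kind === "component" ? c : undefined;
  };
  // Bidirectional: a contract declared on either side covers the pair.
  const declared = (a: Component, b: Component): boolean =>
    a.contracts.some((k) => k.with === b.name) || b.contracts.some((k) => k.with === a.name);
  const imported = (a: string, b: string): boolean =>
    edges.some((e) => (e.from === a && e.to === b) || (e.from === b && e.to === a));

  const flags: ReconcileFlag[] = [];
  for (const e of edges) {
    const from = tracked(e.from);
    const to = tracked(e.to);
    if (!from || !to || from === to) continue; // unknown or excluded: ignore
    if (!declared(from, to)) {
      flags.push({ category: "import-without-contract", from: from.name, to: to.name });
    }
    // Assumption: lower order ships first, so an importer must not ship
    // before what it imports. Equal order raises no flag.
    if (from.rolloutOrder < to.rolloutOrder) {
      flags.push({ category: "rollout-order-inversion", from: from.name, to: to.name });
    }
  }
  for (const c of components) {
    if (!tracked(c.name)) continue;
    for (const k of c.contracts) {
      const peer = tracked(k.with);
      if (!peer) continue;
      if (k.note === "runtime-only" || k.note === "legacy") continue; // suppressed
      if (!imported(c.name, peer.name)) {
        flags.push({ category: "contract-without-imports", from: c.name, to: peer.name });
      }
    }
  }
  return flags;
}
```

This sketch does not deduplicate flags when the same edge appears twice or when both sides declare the same contract; a fuller version would likely key flags by component pair.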
Author
Child PR: #1424

Size vs the 2164-line proposal here (after the final compression pass):

Per OSS PR-size research (SmartBear/Cisco, Google internal), review effectiveness drops sharply beyond 400 changed lines. Substantive content here is well inside the healthy band.

Scope reductions from the original proposal:
What's the problem
gstack has great plan-time skills (/plan-eng-review, /plan-ceo-review) and great ship-time skills (/ship, /review). There's a gap between them. Nothing asks "is this one PR or three?" or "in what order should these units ship?"

The pain is LLM-specific. A single Claude Code session produces a 2,000-line diff across 15 files. Reviewers drown. Scope creep hides. Bugs ship under LGTM pressure. I feel this every time I use LLMs for non-trivial work. I suspect others do too.
What I'm proposing
- /plan-rollout runs after plan approval. Reads the plan plus SYSTEM.md (a new repo-root semantic contract graph) plus the discovered import graph. Produces decomposition.md (PR stack with reader guides, dep ordering, time-budget estimates) and rollout.md (rollout strategy with inverse rollback auto-generated per step).
- /spill-check runs during implementation. Compares the current diff against the declared PR unit. Flags undeclared files. Adaptive: strict for code, soft for infra/meta files like CLAUDE.md, package.json, bun.lock.
- SYSTEM.md is the interesting primitive. Human-declared role contracts (auth mints session tokens middleware enforces; breaks if format changes without middleware redeploy; rollout-edge hard). Separate from the package/import graph, which the LLM discovers at runtime via AST and grep. Reconciled jointly: declared contracts give the why, discovered imports give the what, disagreements surface for human resolution.

Does it actually work
Dogfooded the design end to end against honojs/hono#4633 (405 Method Not Allowed). Authored SYSTEM.md for Hono's 8 components. Decomposed the issue into a 3-PR stack with graceful dep relaxation (PR-3 can merge without PR-2 via feature detection on an optional interface method).
Implemented what would be PR-1 locally. 171 LOC, 3 files, 86/86 tests pass, zero regressions across the 4 router implementations not touched.
8 design gaps surfaced during the dogfood. All folded into v1 scope. Highlights:
- kind field (component | leaf-util | types-only) so shared utility dirs don't force awkward fits.
- package-type field because library rollouts (npm publish + revert) differ materially from service rollouts (coordinated deploy + state restore).

The 4 PRs if this lands
1. lib/plan-rollout/system-map-*.ts + tests + docs/SYSTEM-MD.md. Standalone, no skills modified.
2. /plan-rollout skill + the helpers the SKILL.md calls. Depends on PR-1.
3. /spill-check skill + spill classifier. Independent of PR-2.
4. /ship, /review, /plan-ceo-review, /plan-eng-review. Zero-regression gated on decomposition.md existence.

~75 min cumulative review time. PR-1 is low-risk standalone and should land first.
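As a rough illustration of the adaptive /spill-check behavior described earlier (strict for code, soft for infra/meta files), a spill classifier might look something like the sketch below. Everything here is hypothetical: the function names, the verdict labels, and the idea that a PR unit is represented as a flat list of declared file paths are assumptions, and the soft list contains only the three files the proposal names as examples.

```typescript
type SpillVerdict = "declared" | "soft-spill" | "hard-spill";

// Infra/meta files treated softly. Only these three are named in the proposal;
// a real list would presumably be longer or configurable.
const SOFT_FILES = new Set<string>(["CLAUDE.md", "package.json", "bun.lock"]);

function basename(path: string): string {
  const i = path.lastIndexOf("/");
  return i === -1 ? path : path.slice(i + 1);
}

// Classify one changed file against the PR unit's declared files.
function classifySpill(file: string, declaredFiles: Set<string>): SpillVerdict {
  if (declaredFiles.has(file)) return "declared";
  return SOFT_FILES.has(basename(file)) ? "soft-spill" : "hard-spill";
}

// Compare a whole diff (list of changed paths) against the declared PR unit.
function spillCheck(changedFiles: string[], declared: string[]): { soft: string[]; hard: string[] } {
  const declaredSet = new Set<string>(declared);
  const result: { soft: string[]; hard: string[] } = { soft: [], hard: [] };
  for (const file of changedFiles) {
    const verdict = classifySpill(file, declaredSet);
    if (verdict === "soft-spill") result.soft.push(file);
    if (verdict === "hard-spill") result.hard.push(file);
  }
  return result;
}
```

Under these assumptions, a hard spill would be the signal to stop and split the PR or amend the declared unit, while a soft spill would only warrant a note to the reviewer.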
What I want from you
Does this shape fit gstack? In-tree or separate plugin? Any expansions I've got wrong? Convention checks: artifacts in ~/.gstack/projects/ vs .gstack/ in-repo? SYSTEM.md format?

If this is a nope, tell me, saves us both time. If it's yes-but-shape-it, I'll rework. If yes, I'll open the 4-PR stack.
No rush. I know the queue is deep.