docs(gates): pivot Phase 3 from YAML emitter to structural validator#142
Merged
githubrobbi merged 1 commit intomainfrom May 6, 2026
Merged
docs(gates): pivot Phase 3 from YAML emitter to structural validator#142githubrobbi merged 1 commit intomainfrom
githubrobbi merged 1 commit intomainfrom
Conversation
Updates `docs/architecture/gates-manifest-plan.md` to reflect a design-time decision made during Phase 3 prep. No code changes; the implementation PR follows once this lands. ## Pivot rationale The original §4.2 specified that `gen-workflow` would own the per-gate job blocks in `.github/workflows/pr-fast.yml` between `# >>> generated:gates-manifest <<<` markers, with the hand-written `classify` / `required` / `notify-failure` jobs staying outside. Investigation during Phase 3 prep (after Phase 2 landed in PR #141) showed every per-gate job in `pr-fast.yml` is bespoke: * gates-drift, file-size: bash only, no rust-cache, ~5 lines * hooks-drift: cargo run, cache shared with sanity * fmt: tiny, no cache, `if: rust` * sanity: free-disk + cache + fetch + check + conditional cargo-vet install AND run * clippy, docs: shared-cache from sanity, no free-disk * test-build: distinct cache key, free-disk * tests: shared-cache from test-build * security: distinct cache key, deny + vet * windows-lint: windows-latest runner That's eleven distinct shapes for ~thirteen pr-fast-tier gates. Encoding all of that in TOML so the generator can emit it back is a YAML-in-TOML translation problem with no real upside, AND it stakes branch protection on a hand-rolled YAML emitter. ## Revised design — structural validator (`--check` only) `gen-workflow` becomes a read-only validator that enforces five structural properties: 1. Job presence — every manifest gate with `tier="pr-fast"` has a job whose key matches the gate id (or per-tier `consumer_names` override). 2. `if:` predicate alignment — `gate_when=rust_changed` → `needs.classify.outputs.rust == 'true'`, etc. 3. Aggregator coverage — every gate id is in `required.needs:`, the `declare -A R=(...)` aggregator, AND `notify-failure.needs:`. This is the exact rename-bookkeeping failure mode that motivated the whole plan. 4. Branch-protection guard — `required.name:` is exactly `PR Fast CI / required` (the literal string in the repo's branch-protection ruleset). 5. Naming convention — every per-gate job's `name:` matches the manifest's `label`. Per-step contents (run commands, runner selection, cache strategy) are explicitly NOT validated — they're per-job craft. Phase 1's `gates-drift` covers gate-set mismatch; structural validator covers rename-bookkeeping; per-step correctness stays in code review's hands. The structural-validator design retains every drift-protection guarantee the original codegen design promised, while reducing the blast radius to "same as Phase 1's gates-drift" — the tool only reads files; it cannot break the workflow. ## What this PR contains * §4.2 rewritten — pivot rationale + 5 enforced properties + explicit list of what's NOT validated. * §4.3 (gen-docs) marked deferred — investigation showed every existing gate-matrix table in the repo is prose-laden (cells like `✅ if xwin (advisory; W5.6 upgraded from check to clippy)`); encoding the prose in TOML is strictly worse than leaving it in markdown. * §Phase 3 migration section rewritten — narrowed scope to just the workflow-drift validator, explicitly carved out the deferred sub-phases (3a `_lint_fast.sh` codegen and 3c `gen-docs`). * §Status table updated — Phase 2 ✅ landed, Phase 3 🟡 plan revision in flight, 3a/3c deferred. * §9 action log appended with the pivot entry and the rationale. ## What this PR does NOT contain No code changes. The actual `gen-workflow` Rust crate, the new `workflow-drift` manifest gate, the `pr-fast.yml::workflow-drift` job, and the `just workflow-drift` recipe all land in the follow-up implementation PR (Phase 3 proper). ## Verification * `reuse lint` — exit 0 * `taplo fmt --check` — n/a (markdown-only PR) * `actionlint` — n/a (markdown-only PR) * `bash -n` syntax checks — n/a (markdown-only PR)
githubrobbi
added a commit
that referenced
this pull request
May 7, 2026
…+ workflow-drift gate (#143) Phase 3 of `docs/architecture/gates-manifest-plan.md`. Pivoted from the originally-planned YAML emitter design (PR #142 landed the plan revision) to a `--check`-only structural validator that catches every drift class the codegen design promised at the same risk profile as Phase 1's `gates-drift` — the tool only reads files; it cannot break the workflow. ## What lands NEW workspace member `scripts/ci/gen-workflow/`: * `Cargo.toml` — workspace member, no `[lints] workspace = true` (matches the operational-tool convention from `scripts/ci-pipeline` + `scripts/ci/gen-hooks`) * `src/main.rs` — CLI (`--check`, `--verbose`, hidden `--manifest` / `--workflow` escape hatches) * `src/manifest.rs` — minimal subset of the gate-manifest schema (only the fields gen-workflow needs: `id`, `label`, `tiers`, `gate_when` → `when` via serde rename, `consumer_names`) * `src/workflow.rs` — hand-rolled minimal YAML extractor (~120 lines) covering the four fields the validator inspects: `jobs:` keys, per-job `name:`, `if:`, `needs:` (all three YAML shapes — single string, flow-style list, block-style list) * `src/validate.rs` — four structural property checks per plan §4.2 with detailed drift reports ## Why hand-roll YAML parsing The pre-existing crate options all carry material costs: * `serde_yaml` — archived 2024, unmaintained * `serde_yml` — archived 2024, active RustSec advisory in `Serializer.emitter` * `serde_yaml_ng` — active fork; depends on unmaintained `unsafe-libyaml` (C) * `serde_norway` — active fork; depends on `unsafe-libyaml-norway` (C) The validator only needs to extract a handful of specific string fields from a 600-line tame YAML file (no anchors, no complex flow style, no embedded JSON). Pulling in a heavyweight C-backed dependency, paying the cargo-vet exemption + cargo-deny advisory tax, and shipping the libyaml binary blob to every CI run is a superficial workaround for a problem that's solvable with ~120 lines of focused string-matching Rust. The hand-rolled extractor is fully fail-closed: malformed YAML, a missing `jobs:` key, a flow list without a closing bracket, or any construct outside the documented grammar yields a `Result::Err` with helpful context, and the validator surfaces that as drift rather than silently mis-classifying. ## Properties enforced (per plan §4.2) 1. Job presence — every manifest gate with `tier="pr-fast"` has a corresponding job in `pr-fast.yml` (resolved via `consumer_names["pr-fast"]` if present, else gate id). Multiple gates may fold into one job (e.g. `rustdoc` + `doc-tests` → `docs`); the validator handles many-to-one. 2. `if:` predicate alignment — for each pr-fast job, the `if:` predicate must accept every change class the folded gates' `gate_when` values require. Implemented as a `PermissiveSet` u8 bitset lattice (rust / dep / infra / always) with `union` and `contains` operations. Wider predicates pass (over-runs are fine); narrower ones fail (drift would block a gate from running on its trigger). 3. Aggregator coverage — every gate's resolved job-id must appear in `required.needs:`, the bash `declare -A R=(...)` aggregator inside the `required` job, AND `notify-failure.needs:`. This is the exact rename- bookkeeping failure mode (PR #138's windows-check → windows-lint rename touching 6 files) that motivated the whole gate-manifest plan. 4. Branch-protection guard — the `required` job's `name:` field is exactly `PR Fast CI / required` — the literal string in the repo's branch-protection rule. A future refactor that renamed it would silently break merge for every PR; the validator now hard-fails before that lands. Property 5 ("naming convention") from the plan is intentionally deferred — see plan §4.2 for the rationale (multiple gates fold into one job, so display names like `Clippy` are not 1:1 derivable from manifest fields without a schema extension). ## What is explicitly NOT validated * Job step contents — the actual `run:` commands, runner selection, cache key strategy, conditional steps. These are per-job craft; pinning them would freeze legitimate evolution (e.g. adding a `df -h /` debug step during a disk-pressure investigation). Phase 1's `gates-drift` covers gate-set mismatch; the structural validator covers rename-bookkeeping; per-step correctness stays in code review's hands. * YAML formatting — indentation, blank-line placement, comment positioning. * Job ordering — the file lists jobs in roughly cost order; the validator doesn't enforce any ordering invariant. ## Wire-up NEW manifest gate `workflow-drift`: * `tiers = ["pre-push", "pr-fast"]`, `bucket = "bg"`, `gate_when = "always"`, `hard = true`, order 27 (next to `hooks-drift`'s 26 in Bucket 1). * Self-referential — runs `gen-workflow --check` to verify the workflow against the manifest including the workflow-drift job itself. Pairs with Phase 1's `gates-drift` (gate-set drift) and Phase 2's `hooks-drift` (hook-content drift) to cover three orthogonal drift axes. NEW `pr-fast.yml::workflow-drift` job: * Same shape as `hooks-drift` (cache shared with `sanity` so the gen-workflow binary build piggybacks on the existing rust-cache). * Always-on (no classify gating). * Added to `required.needs:`, the bash R=() aggregator, AND `notify-failure.needs:` (otherwise it would itself fail Property 3 on first run — a satisfying recursive consistency check). NEW `just` recipes: * `just gen-workflow` — manual run of the validator * `just workflow-drift` — alias of `gen-workflow --check`, named for parity with `gates-drift` and `hooks-drift` MODIFIED `scripts/hooks/_lint_pre_push.sh`: * Regenerated by `gen-hooks` to pick up the new `workflow-drift` gate (Bucket 1, `cargo run -q --release -p uffs-gen-workflow -- --check`). MODIFIED `Cargo.toml`: * `scripts/ci/gen-workflow` added as a workspace member. MODIFIED `docs/architecture/gates-manifest-plan.md`: * Status table updated (Phase 3 ✅ landed). * §9 action log entry appended. ## Lint policy (audited per the no-suppression-hacks rule) * Zero `#[allow(...)]` directives in non-test code. * `expect()` / `unwrap()` only in `#[cfg(test)]` blocks (allowed by `clippy.toml`'s `allow-*-in-tests = true`). * `PermissiveSet` is a `u8` bitset — chosen at design time over a `struct { rust: bool, dep: bool, infra: bool, always: bool }` shape that would have triggered `clippy::struct_excessive_bools`. Root-cause refactor, not per-item suppression. * Regex compilation in `aggregator_table_regex` returns `Result<Regex>` and propagates via `?` rather than `unwrap` or `expect`. * `Gate::gate_when` → `Gate::when` rename eliminates `clippy::struct_field_names` at the root cause. ## Tests (33 / 33 pass) * Manifest module (7) — minimal-subset parsing, `pr_fast_gates` filtering, `consumer_names` override resolution, `gate_when` ⇄ `when` rename round-trip, unknown-fields-ignored design choice, missing-required-field-fails-noisily. * Workflow extractor (12) — all three `needs:` shapes, quoted/unquoted field values, nested `with:`/`env:`/`run: |` blocks ignored cleanly, missing `jobs:` key fails with context, empty `jobs:` block fails, malformed flow list fails with helpful message, block-list with blank-line terminator, `job_key_at` rejects inline values + accepts trailing comments, `unquote` handles single/double/bare/whitespace/ comment cases. * Validator (14) — `PermissiveSet` lattice unit tests (gate_when→set, if_expr→set, mixed-class union semantics), happy-path consistent-fixture assertion, plus seven mutation tests (one per property × failure mode): - P1 missing job - P2 too narrow predicate - P2 wider predicates pass (regression-guard) - P3 missing-from-required-needs - P3 missing-from-aggregator - P3 missing-from-notify-needs - P4 renamed required job - P4 missing required job Aggregator-extraction helper tested on a realistic-shape table and confirmed to ignore unrelated brackets (`${{ matrix.os }}` substitutions, mid-line `[other-bracket]=` lines outside the table). Every fixture is a literal string; no randomness, no time, no filesystem dependency in any test. ## Verification * `cargo test -p uffs-gen-workflow` — 33 / 33 pass * `cargo clippy -p uffs-gen-workflow -- -D pedantic -D nursery -D cargo -W unwrap_used -W expect_used -W missing_docs_in_private_items -D warnings` — exit 0, zero per-item suppressions in non-test code * `cargo run -q --release -p uffs-gen-workflow -- --check` — exit 0 * `bash scripts/ci/check_gates_drift.sh` — 22 gates matched * `cargo run -q --release -p uffs-gen-hooks -- --check` — exit 0 (hook regenerated in lockstep) * `just lint-pre-push` — full 22-gate sweep green in 53 s warm * `actionlint .github/workflows/pr-fast.yml` — exit 0 * `cargo deny check` — exit 0 (no advisories) ## What this PR does NOT do * No mutation of `pr-fast.yml` job step contents — only the additive `workflow-drift` job (5 lines of YAML matching the `hooks-drift` pattern) plus three additive list entries (`required.needs:`, R=() aggregator, `notify-failure.needs:`). * No new dependencies — the hand-rolled YAML extractor avoids the `serde_yml` advisory + the `unsafe-libyaml` C transitive. * No supply-chain churn — zero new cargo-vet exemptions added. ## Sequencing Branched off `main` post-#142. No dependency on any other in-flight PR. Closes Phase 3 of the gates manifest plan; Phases 3a (`_lint_fast.sh` codegen) and 3c (gen-docs) remain deferred per plan §Phase 3 scope notes.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Documentation-only PR. Updates
docs/architecture/gates-manifest-plan.mdto reflect a design-time decision made during Phase 3 prep (after PR #141 landed Phase 2). The implementation PR follows once this lands.Why pivot
The original §4.2 specified that
gen-workflowwould own the per-gate job blocks in.github/workflows/pr-fast.ymlbetween# >>> generated:gates-manifest <<<markers, with the hand-writtenclassify/required/notify-failurejobs staying outside.Investigation during Phase 3 prep showed every per-gate job in
pr-fast.ymlis bespoke:gates-drift,file-sizehooks-driftsanityfmtif: rustsanityclippy,docssanity, no free-disktest-buildteststest-build, no free-disksecuritywindows-lintwindows-latestrunnerEleven distinct shapes for ~thirteen pr-fast-tier gates. Encoding all of that in TOML so the generator can emit it back is a YAML-in-TOML translation problem with no real upside, AND it stakes branch protection on a hand-rolled YAML emitter.
Revised design — structural validator
gen-workflowbecomes a read-only validator (--checkonly, no--write). Five structural properties enforced:tier="pr-fast"has a job whose key matches the gate id.if:predicate alignment —gate_when=rust_changed→needs.classify.outputs.rust == 'true', etc.required.needs:, thedeclare -A R=(...)aggregator, ANDnotify-failure.needs:. This is the exact rename-bookkeeping failure mode that motivated the whole plan.required.name:is exactlyPR Fast CI / required(the literal string in the repo's branch-protection ruleset).name:matches the manifest'slabel.Per-step contents (
runcommands, runner selection, cache strategy) are explicitly NOT validated — they're per-job craft. Phase 1'sgates-driftcovers gate-set mismatch; structural validator covers rename-bookkeeping; per-step correctness stays in code review's hands.The structural-validator design retains every drift-protection guarantee the original codegen design promised, at the same risk profile as Phase 1's
gates-drift— the tool only reads files; it cannot break the workflow.What this PR contains
✅ if xwin (advisory; W5.6 upgraded from check to clippy)); encoding the prose in TOML is strictly worse than leaving it in markdown._lint_fast.shcodegen and 3cgen-docs).What this PR does NOT contain
No code changes. The actual
gen-workflowRust crate, the newworkflow-driftmanifest gate, thepr-fast.yml::workflow-driftjob, and thejust workflow-driftrecipe all land in the follow-up implementation PR (Phase 3 proper).Diff size
Sequencing
Branched off
mainpost-#141. Implementation PR branches off this once it lands.