feat(ci): LS-N verification gate (spar-pattern port) #161
PR-time gate that enforces meld's STPA test-naming contract: every `status: approved` entry in `safety/stpa/loss-scenarios.yaml` must have at least one `#[test] fn ls_<letter>_<num>_*` regression test in `meld-core` (e.g. LS-A-11 -> `ls_a_11_*`).

Adapted from spar's rivet-driven verification gate (pulseengine/spar@ba329f3d). meld has no rivet-style executable artifact, but loss-scenarios pair with regression tests by the established naming convention; this gate makes that pairing a verifiable contract.

Three files:

- tools/run_ls_verification.py: Python (stdlib + PyYAML). Iterates approved LS IDs, runs `cargo test --lib --no-fail-fast <prefix>` per ID, buckets results as passed / failed / missing, writes verification-results.json.
- tools/post_verification_comment.py: Marker-tagged sticky PR comment upsert via GitHub REST API. Pure stdlib (urllib). First run creates the comment, subsequent runs PATCH the body. Marker: `<!-- meld-ls-verification-gate -->`.
- .github/workflows/verification-gate.yml: PR + workflow_dispatch trigger. Fail-on-failure but advisory-on-missing so the 10 older approved entries with ad-hoc test names (e.g. PR #114's `test_canonical_abi_size_fixed_size_list_saturates_on_overflow` for LS-P-4) can be migrated incrementally rather than blocking every PR.

Smoke-tested locally against current main: 19 approved LS, 10 passed (LS-A-7/11/15/17/18/20/12/13/14/16), 9 missing (the older v0.7.0-era and PR-#114-era entries). No failures.

Same script runs locally:

    python3 tools/run_ls_verification.py

Inputs are integer/metadata only (PR number via env, head_ref in concurrency); no untrusted free-form text from PR titles/bodies/comments is read in run: blocks.

AGENTS.md gains a "LS-N verification gate" section under "Mythos Bug-Hunt Pipeline".

Refs: pulseengine/spar@ba329f3d
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
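The runner's per-ID loop described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `tools/run_ls_verification.py`: the function names, the `id`/`status` keys of a YAML entry, and the rule for reading `cargo test` output are all assumptions. It relies only on the fact that libtest prints `running N tests` and that a filter matching nothing still exits 0.

```python
import re


def ls_id_to_prefix(ls_id: str) -> str:
    """Map 'LS-A-11' to the test-name prefix 'ls_a_11_'.

    The trailing underscore matters: cargo's test filter is a substring
    match, so 'ls_a_1' alone would also select 'ls_a_11_*' tests.
    """
    if not re.fullmatch(r"LS-[A-Z]+-\d+", ls_id):
        raise ValueError(f"unexpected loss-scenario ID: {ls_id!r}")
    return ls_id.lower().replace("-", "_") + "_"


def approved_ids(entries):
    """Return IDs of approved entries (assumed keys: 'id' and 'status')."""
    return [e["id"] for e in entries if e.get("status") == "approved"]


def classify(returncode: int, stdout: str) -> str:
    """Bucket one filtered `cargo test` run as passed / failed / missing."""
    if returncode != 0:
        # Covers failing tests and build errors alike.
        return "failed"
    ran = sum(
        int(n)
        for n in re.findall(r"^running (\d+) tests?$", stdout, re.MULTILINE)
    )
    return "passed" if ran > 0 else "missing"
```

In a real runner, `classify` would receive the exit code and captured stdout of `cargo test --lib --no-fail-fast <prefix>`, for example via `subprocess.run(..., capture_output=True, text=True)`.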
Self-hosted runners (Debian/Ubuntu Python 3.12) enforce PEP 668 and reject `pip install --user pyyaml` with "externally-managed-environment". `--break-system-packages` is the documented PEP 668 opt-out for CI environments where the runner's Python install is disposable per workflow run. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
LS-N verification gate
Approved
Failed LS entries: (none)
Missing regression tests
Updated automatically by
The LS-N verification gate (this PR) discovered 9 approved loss-scenarios without a matching `ls_<letter>_<num>_*` regression test. Five of those already had regression tests pinning the fix under historical names; this commit adds thin convention aliases so the gate's discovery query finds them. The original tests stay in place (single source of truth, preserves git blame / grep continuity); each alias is a `#[test] fn` that delegates to the original test body.

| LS | Original test | Alias |
|-----|---------------|-------|
| LS-P-4 | test_canonical_abi_size_fixed_size_list_saturates_on_overflow | ls_p_4_canonical_abi_size_saturates_on_overflow |
| LS-P-5 | test_parser_rejects_truncated_module_section_issue_118 | ls_p_5_parser_rejects_truncated_module_section |
| LS-R-10 | test_issue112_item5_intra_adapter_preserves_from_import_module | ls_r_10_intra_adapter_preserves_from_import_module |
| LS-CP-3 | test_issue112_item4_sort_adapter_sites_is_canonical | ls_cp_3_sort_adapter_sites_is_canonical |
| LS-A-10 | cabi_alignment_stackful_retptr_writes_i64_at_offset_8 | ls_a_10_cabi_align_retptr_writeback |

Gate result drops from 10 passed / 9 missing to 15 passed / 4 missing. The remaining four (LS-CP-4, LS-A-8, LS-A-9, LS-A-19) genuinely lack regression tests and land in follow-up PRs:

- LS-CP-4: DWARF passthrough emits address-incorrect debug info
- LS-A-8: Inner-list rep_func selected by HashMap iteration order
- LS-A-9: Async callback POLL falls through to YIELD path
- LS-A-19: Resource import dedup uses ends_with() suffix match

The LS-CP-3 alias only covers the adapter_sites-order half of the scenario; the caller_encoding_fallback half also still needs a dedicated regression test (tracked alongside LS-A-8/9/19/CP-4).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
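To see why the historical names were invisible to the gate while the aliases are not, the naming convention can be written as a regular expression. The pattern and the helper name `ls_id_for_test` below are illustrative assumptions, not code from this PR.

```python
import re

# Convention from this PR: ls_<letter(s)>_<num>_<description>
ALIAS_RE = re.compile(r"^ls_([a-z]+)_(\d+)_\w+$")


def ls_id_for_test(test_name: str):
    """Return the LS ID a test name maps to, or None if it is off-convention."""
    m = ALIAS_RE.match(test_name)
    if m is None:
        return None
    return f"LS-{m.group(1).upper()}-{m.group(2)}"
```

Under this sketch `ls_cp_3_sort_adapter_sites_is_canonical` maps to LS-CP-3, while `cabi_alignment_stackful_retptr_writes_i64_at_offset_8` maps to nothing, which is the situation the aliases resolve.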
Mythos delta-pass required
This PR modifies one or more Tier-5 source files (per
Before merge, run the Mythos discover protocol on the
Why this gate exists: LS-A-10
The gate check on this PR will pass once the label is
Mythos delta-pass: NO FINDINGS
The latest commit (

    #[test]
    fn ls_p_4_canonical_abi_size_saturates_on_overflow() {
        test_canonical_abi_size_fixed_size_list_saturates_on_overflow();
    }

No production code path is modified. No new logic to scan. The
Adding
Admin-merge per #139 (smithy capacity)
8 of 11 checks green; the 3 remaining
This is the documented #139
Same handling as PR #159 earlier today (cap-starved fuzz queue, 50+
Admin-merge counter for #139:
Tracking the reset back into the issue separately.
Summary
Adapts spar's rivet-driven verification gate
(pulseengine/spar@ba329f3d)
to meld's STPA loss-scenario artifacts.
PR-time gate that enforces meld's test-naming contract: every `status: approved` entry in `safety/stpa/loss-scenarios.yaml` must have at least one `#[test] fn ls_<letter>_<num>_*` in `meld-core` (e.g. `LS-A-11` → `ls_a_11_*`). Posts a single sticky PR comment with passed / failed / missing counts.
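The sticky-comment behaviour (create on first run, PATCH afterwards) can be sketched with the standard library alone. The marker string comes from this PR and the endpoints are GitHub's REST routes for issue comments; the function names, argument shapes, and the choice to only build the request rather than send it are assumptions made for illustration.

```python
import json
import urllib.request

MARKER = "<!-- meld-ls-verification-gate -->"
API = "https://api.github.com"


def find_sticky(comments):
    """Return the id of the first comment carrying the marker, else None."""
    for comment in comments:
        if MARKER in (comment.get("body") or ""):
            return comment["id"]
    return None


def build_upsert_request(repo, pr_number, comments, body, token):
    """Build (but do not send) the POST-or-PATCH request for the comment.

    `comments` is the already-fetched list of existing PR comments.
    """
    comment_id = find_sticky(comments)
    if comment_id is None:
        url = f"{API}/repos/{repo}/issues/{pr_number}/comments"
        method = "POST"
    else:
        url = f"{API}/repos/{repo}/issues/comments/{comment_id}"
        method = "PATCH"
    payload = json.dumps({"body": f"{MARKER}\n{body}"}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=payload, method=method, headers=headers)
```

Passing the returned object to `urllib.request.urlopen` would perform the call. Keeping the marker inside the comment body is what lets later runs find and update the same comment instead of posting a new one.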
Bucket semantics
`ls_<>_<n>_*` tests
Missing is advisory (warning, not block) so older approved entries with ad-hoc test names can be migrated incrementally rather than blocking every PR.
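The fail-on-failure, advisory-on-missing policy reduces to a small decision over the three buckets. The sketch below assumes a results mapping with `passed` / `failed` / `missing` lists; the real layout of `verification-results.json` is not shown in this PR, and `gate_exit_code` is a hypothetical name. `::warning::` and `::error::` are standard GitHub Actions workflow commands.

```python
def gate_exit_code(results) -> int:
    """Return the process exit code for the gate.

    Missing tests only produce warnings; any failed test blocks the PR.
    """
    failed = results.get("failed", [])
    missing = results.get("missing", [])

    for ls_id in missing:
        # Advisory: surfaces in the Actions UI without failing the job.
        print(f"::warning::{ls_id} has no ls_* regression test")
    for ls_id in failed:
        print(f"::error::{ls_id} regression test failed")

    return 1 if failed else 0
```

With the state reported in this PR (15 passed, 0 failed, 4 missing) this sketch would print four warnings and return 0.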
Gate state after this PR
19 approved LS entries, 15 passed / 0 failed / 4 missing.
The 5 newly-passing entries got thin convention aliases (last commit) so their pre-existing regression tests are discoverable:

| LS | Original test | Alias |
|-----|---------------|-------|
| LS-P-4 | `test_canonical_abi_size_fixed_size_list_saturates_on_overflow` | `ls_p_4_canonical_abi_size_saturates_on_overflow` |
| LS-P-5 | `test_parser_rejects_truncated_module_section_issue_118` | `ls_p_5_parser_rejects_truncated_module_section` |
| LS-R-10 | `test_issue112_item5_intra_adapter_preserves_from_import_module` | `ls_r_10_intra_adapter_preserves_from_import_module` |
| LS-CP-3 | `test_issue112_item4_sort_adapter_sites_is_canonical` | `ls_cp_3_sort_adapter_sites_is_canonical` (adapter-sites half only) |
| LS-A-10 | `cabi_alignment_stackful_retptr_writes_i64_at_offset_8` | `ls_a_10_cabi_align_retptr_writeback` |

The 4 still-missing genuinely lack regression tests and will be addressed in follow-up PRs (one per subsystem):

- LS-CP-4: DWARF passthrough emits address-incorrect debug info
- LS-A-8: Inner-list `rep_func` selected by HashMap iteration order
- LS-A-9: Async callback POLL falls through to YIELD path
- LS-A-19: Resource import dedup uses `ends_with()` suffix match
- (plus the LS-CP-3 `caller_encoding_fallback` half, same family)

Files

- `tools/run_ls_verification.py`: runner (stdlib + PyYAML); local-runnable
- `tools/post_verification_comment.py`: sticky comment upsert (pure stdlib urllib)
- `.github/workflows/verification-gate.yml`: workflow (PR + workflow_dispatch)
- `meld-core/src/{parser,resolver,adapter/fact}.rs`: 5 convention aliases
- `AGENTS.md`: new "LS-N verification gate" section under Mythos pipeline
- `CHANGELOG.md`: Unreleased / Added entry
- `.gitignore`: ignore local `verification-results.json`

Local run
Test plan
🤖 Generated with Claude Code