feat(bench): criterion-based corpus baseline + wasm-opt version pinning (v1.0.0 Track E) #116
Merged
…ng (v1.0.0 PR-S)
Adds a cargo-bench equivalent to scripts/measure_corpus.sh so the
corpus comparison matrix (LOOM vs wasm-opt -O3 vs meld+wasm-opt vs
meld+LOOM) can be produced via `cargo bench -p loom-testing --bench
corpus_baseline`. The shell script remains as the manual fallback.
What the bench does
- For every fixture in the same workload list the shell harness uses,
shells out to LOOM, wasm-opt, wasm-tools, and (for components) meld.
- Sums code-section bytes via `wasm-tools objdump` (matches the shell
harness's awk logic exactly).
- Renders a markdown report identical in shape to
docs/measurements/v0.9.0-corpus-baseline.md, to:
- stdout (so `cargo bench` log is grep-able by CI), and
- docs/measurements/v<workspace-version>-corpus-baseline.md
(so each bench run produces a versioned baseline artefact).
- Wraps each LOOM pass in `criterion::bench_function`, so timings land
in target/criterion/ alongside the markdown table.
- Times out individual tool invocations after PER_RUN_TIMEOUT (default
300s) to keep developer laptops responsive on large components.
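The per-invocation timeout described above could look roughly like the sketch below. It uses only the standard library and polls the child process; `run_with_timeout` and `RunOutcome` are hypothetical names for illustration, not identifiers from the bench itself.

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Outcome of a tool run bounded by a wall-clock limit (hypothetical type).
#[derive(Debug)]
pub enum RunOutcome {
    Finished(ExitStatus),
    TimedOut,
}

/// Spawn `cmd`, poll until it exits or `limit` elapses, and kill it on timeout.
pub fn run_with_timeout(cmd: &mut Command, limit: Duration) -> io::Result<RunOutcome> {
    let mut child = cmd.stdout(Stdio::null()).stderr(Stdio::null()).spawn()?;
    let deadline = Instant::now() + limit;
    loop {
        // `try_wait` returns Ok(None) while the child is still running.
        if let Some(status) = child.try_wait()? {
            return Ok(RunOutcome::Finished(status));
        }
        if Instant::now() >= deadline {
            // Ignore the kill error: the child may have exited in the meantime.
            let _ = child.kill();
            child.wait()?; // reap the process so it does not linger as a zombie
            return Ok(RunOutcome::TimedOut);
        }
        thread::sleep(Duration::from_millis(20));
    }
}
```

A caller would pass something like `Duration::from_secs(300)` for the default limit and treat `TimedOut` as a skipped measurement rather than a hard failure.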
Output rendering runs at process exit via libc `atexit`, since
criterion_main!() returns cleanly and Rust never runs destructors for
`static` items, so a Drop-based hook would not fire.
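A minimal sketch of the exit-hook approach, assuming a Unix-like target where the C runtime's `atexit` is linked in. The `atexit` declaration is written by hand here to keep the sketch dependency-free (the `libc` crate exposes the same function); `ROWS`, `record_row`, and `register_report_hook` are hypothetical names.

```rust
use std::io::Write;
use std::os::raw::c_int;
use std::sync::Mutex;

// Hand-written binding to the C runtime's atexit.
unsafe extern "C" {
    fn atexit(callback: extern "C" fn()) -> c_int;
}

// Rows collected during the run; statics are never dropped, hence the exit hook.
static ROWS: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Queue one report row for rendering at process exit.
pub fn record_row(row: impl Into<String>) {
    ROWS.lock().unwrap().push(row.into());
}

// Runs when the process exits normally. It must not panic across the FFI boundary,
// so lock and write errors are ignored.
extern "C" fn render_report() {
    if let Ok(rows) = ROWS.lock() {
        let mut out = std::io::stdout();
        for row in rows.iter() {
            let _ = writeln!(out, "{row}");
        }
    }
}

/// Register the hook once at startup; returns true on success.
pub fn register_report_hook() -> bool {
    // SAFETY: `render_report` is a valid `extern "C"` function with static lifetime.
    unsafe { atexit(render_report) == 0 }
}
```

Registering the hook before the first benchmark runs means the rows are printed after `criterion_main!()` has returned.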
wasm-opt version pinning
- scripts/wasm-opt.pinned holds the pinned version_NNN token (initial
value: version_116, matching the wasm-opt that produced the v0.9.0
baseline). Comments explain the bump workflow.
- scripts/check_wasm_opt_version.sh is the standalone shell wrapper
that CI and developers can call to verify the pin pre-flight; it
parses both `(version_NNN)` and `version N` output forms and prints
upgrade guidance on mismatch.
- The criterion bench performs the same check in-process at startup
and surfaces the result in the report header (`pin: ... (match)`,
`pin: **MISMATCH** ...`, `pin: ... (wasm-opt not installed)`).
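The two output forms mentioned above can be normalised to a single `version_NNN` token along these lines. This is a sketch only: `parse_wasm_opt_version` and `read_pin` are hypothetical helpers, and the `#` comment syntax assumed for the pin file is a guess, not something the PR states.

```rust
/// Normalise `wasm-opt --version` output to a `version_NNN` token.
/// Accepts both the `(version_NNN)` form and the bare `version N` form.
pub fn parse_wasm_opt_version(output: &str) -> Option<String> {
    // Preferred form: an explicit `version_NNN` tag anywhere in the output.
    if let Some(idx) = output.find("version_") {
        let digits: String = output[idx + "version_".len()..]
            .chars()
            .take_while(|c| c.is_ascii_digit())
            .collect();
        if !digits.is_empty() {
            return Some(format!("version_{digits}"));
        }
    }
    // Fallback form: the word `version` followed by a number.
    let mut words = output.split_whitespace();
    while let Some(word) = words.next() {
        if word == "version" {
            if let Some(next) = words.next() {
                let digits: String =
                    next.chars().take_while(|c| c.is_ascii_digit()).collect();
                if !digits.is_empty() {
                    return Some(format!("version_{digits}"));
                }
            }
        }
    }
    None
}

/// Return the first non-blank, non-comment line of the pin file.
/// Assumption: comment lines start with `#`.
pub fn read_pin(contents: &str) -> Option<&str> {
    contents
        .lines()
        .map(str::trim)
        .find(|line| !line.is_empty() && !line.starts_with('#'))
}
```

Comparing `read_pin(...)` against `parse_wasm_opt_version(...)` then gives the match / mismatch decision without caring which output form the installed binary uses.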
Pin policy
- Auto-bumping is intentionally NOT performed. We want every wasm-opt
version change in our baselines to be a deliberate, reviewable
commit so the docs/measurements/ history stays comparable.
- A mismatch is non-fatal: the bench / harness still runs, but the
generated report flags the mismatch so reviewers notice.
- If wasm-opt is missing entirely, the bench proceeds with wasm-opt
columns marked n/a -- matches scripts/measure_corpus.sh behaviour.
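The policy above (mismatch is flagged but non-fatal, missing wasm-opt yields n/a columns) maps naturally onto a three-state enum. The sketch below is illustrative: `PinStatus`, `check_pin`, `header_line`, and `wasm_opt_cell` are hypothetical names, and the exact header wording beyond the `(match)`, `**MISMATCH**`, and `(wasm-opt not installed)` markers is an assumption.

```rust
/// Result of comparing the installed wasm-opt against the pin (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum PinStatus {
    Match,
    Mismatch { installed: String },
    NotInstalled,
}

/// Compare the pinned token with the installed one, if any. Never fails.
pub fn check_pin(pinned: &str, installed: Option<&str>) -> PinStatus {
    match installed {
        None => PinStatus::NotInstalled,
        Some(v) if v == pinned => PinStatus::Match,
        Some(v) => PinStatus::Mismatch { installed: v.to_string() },
    }
}

/// Render the report-header line; a mismatch is flagged, not treated as an error.
pub fn header_line(pinned: &str, status: &PinStatus) -> String {
    match status {
        PinStatus::Match => format!("pin: {pinned} (match)"),
        PinStatus::Mismatch { installed } => {
            format!("pin: **MISMATCH** pinned {pinned}, installed {installed}")
        }
        PinStatus::NotInstalled => format!("pin: {pinned} (wasm-opt not installed)"),
    }
}

/// Render one wasm-opt table cell; `n/a` when the tool is absent or gave no result.
pub fn wasm_opt_cell(status: &PinStatus, code_bytes: Option<u64>) -> String {
    match (status, code_bytes) {
        (PinStatus::NotInstalled, _) | (_, None) => "n/a".to_string(),
        (_, Some(n)) => n.to_string(),
    }
}
```

Because `check_pin` returns a value instead of an error, the bench can keep running in all three states and leave the decision to whoever reviews the report.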
Reuse of existing infra
- WORKLOADS catalogue is kept in lock-step with the shell harness.
- Output naming under /tmp/loom-measure-corpus matches the shell
script so forensic outputs are discoverable from the same place.
…lobbering existing report
The previous version wrote to docs/measurements/v<workspace>-corpus-baseline.md, which would overwrite the existing shell-harness baseline file. Switch to a "-criterion.md" suffix so the criterion bench and the shell script can coexist and the docs/measurements history stays comparable.
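As an illustration of the naming fix, the two report paths can be derived from one helper so they cannot collide. `ReportKind` and `report_path` are hypothetical names; only the directory and file-name patterns come from the commit message.

```rust
use std::path::PathBuf;

/// Which harness produced the report (hypothetical type).
pub enum ReportKind {
    ShellHarness,
    Criterion,
}

/// Build the versioned report path; the criterion bench gets its own suffix
/// so it never overwrites the shell-harness baseline.
pub fn report_path(workspace_version: &str, kind: ReportKind) -> PathBuf {
    let suffix = match kind {
        ReportKind::ShellHarness => "",
        ReportKind::Criterion => "-criterion",
    };
    PathBuf::from("docs/measurements")
        .join(format!("v{workspace_version}-corpus-baseline{suffix}.md"))
}
```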
Summary
Wires the corpus measurement harness as a cargo bench plus adds wasm-opt version pinning.
`cargo bench -p loom-testing --bench corpus_baseline` now produces the same comparison matrix that `scripts/measure_corpus.sh` does.
What lands
- `loom-testing/benches/corpus_baseline.rs` (~870 LOC): criterion-driven harness that:
  - writes the markdown report to `docs/measurements/v<workspace-version>-corpus-baseline-criterion.md`.
  - sums code-section bytes via `wasm-tools objdump` (the LEB128-correct parser from PR-R).
- `loom-testing/Cargo.toml`: adds `[[bench]] name = "corpus_baseline" harness = false`.
- `scripts/wasm-opt.pinned`: pin file with current version `version_116` + workflow comments.
- `scripts/check_wasm_opt_version.sh`: standalone shell pin-checker (also invoked in-process by the bench at startup).
wasm-opt version pinning workflow
- `check_wasm_opt_pin()` compares `wasm-opt --version` against `scripts/wasm-opt.pinned`. Match → silent OK. Mismatch → non-fatal warning with upgrade guidance. Missing → wasm-opt columns marked `n/a`.
- To bump the pin, set `scripts/wasm-opt.pinned` to the new version string. CI / human reviewers see the diff and can re-run the baseline.
Run
`cargo bench -p loom-testing --bench corpus_baseline`
Note
The shell `scripts/measure_corpus.sh` is unchanged; it remains the manual fallback for environments without cargo.
🤖 Generated with Claude Code