feat(measure): component-aware harness with meld + code-section deltas (v0.9.1 PR-R)#114

Merged
avrabe merged 1 commit into main from release/v0.9.1-pr-r-component-harness
May 15, 2026

@avrabe avrabe commented May 15, 2026

Summary

User-requested fairness pass on the v0.9.0 measurement harness:

  1. Component-aware comparisons — wasm-opt can't process components directly; use meld fuse to produce a comparable core form, then run wasm-opt + LOOM on that.
  2. Code-section deltas — the right metric for optimizer effectiveness. File bytes include attestation / debug / custom sections that aren't the optimization target.
  3. Per-run timeout — concurrent harness invocations were hanging on the 2.3 MB calculator melded core for >60 min each; bounded.

Key finding

The original code_section_bytes returned 0 for every fixture. It parsed wasm-tools dump output, where the code-section size is LEB128-encoded (0a ab 06 = section id 0x0a followed by a size of 811 bytes), but it grabbed the first integer from the address column (the leading 0 of 0x3ab). Every prior code-section number in the report was therefore silently zero. The new parser uses wasm-tools objdump, which prints decimal byte counts in a clean pipe-separated table.
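The 811 figure can be checked by hand. The following is a minimal sketch of unsigned LEB128 decoding; `decode_uleb128` is a hypothetical helper written for illustration and is not claimed to exist in scripts/measure_corpus.sh.

```bash
# Decode an unsigned LEB128 value given as hex bytes, one per argument.
# decode_uleb128 is a hypothetical helper, not part of the harness.
decode_uleb128() {
  local result=0 shift_bits=0 byte value
  for byte in "$@"; do
    value=$((16#$byte))
    # The low 7 bits carry payload, least-significant group first.
    result=$(( result | ((value & 0x7f) << shift_bits) ))
    shift_bits=$(( shift_bits + 7 ))
    # A clear high bit marks the final byte of the number.
    if (( (value & 0x80) == 0 )); then
      break
    fi
  done
  echo "$result"
}

# After the section id 0x0a, the size bytes are ab 06:
# (0xab & 0x7f) + (0x06 << 7) = 43 + 768
decode_uleb128 ab 06   # prints 811
```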

Honest results matrix

File size (total bytes):

| Workload | Baseline | LOOM | wasm-opt -O3 | LOOM Δ% | wasm-opt Δ% |
|---|---|---|---|---|---|
| gale | 1,941 | 1,846 | 1,925 | −4.9% | −0.8% |
| calculator_root | 2,337,724 | 2,327,794 | (errors on components) | −0.4% | n/a |
| simple_component | 261 | 212 | (errors) | −18.8% | n/a |
| calc_component | 442 | 392 | (errors) | −11.3% | n/a |

Code section (optimizer-relevant):

| Workload | Baseline | LOOM | wasm-opt | LOOM Δ% | wasm-opt Δ% |
|---|---|---|---|---|---|
| gale | 811 | 795 | 795 | −2.0% | −2.0% |
| calculator_root | 106,017 | 96,515 | n/a | −9.0% | n/a |
| simple_component | 9 | 9 | n/a | 0% | n/a |
| calc_component | 33 | 33 | n/a | 0% | n/a |

Components via meld (fused-core baseline):

| Workload | meld baseline | wasm-opt | LOOM | wasm-opt Δ% | LOOM Δ% |
|---|---|---|---|---|---|
| calculator_root | 2,294,685 | 1,927,848 | (timeout) | −16.0% | n/a |
| simple_component | 90 | 90 | 41 | 0% | −54.4% |
| calc_component | 135 | 135 | 86 | 0% | −36.3% |

What the new data shows

  1. On gale's code section LOOM and wasm-opt are tied at −2.0%. LOOM's roughly 4-point file-size advantage (−4.9% vs −0.8%) came from removing non-code custom sections, not from better code optimization. The report can now state this honestly.
  2. calculator_root's 106 KB code section dilutes to 0.4% at file level (2.3 MB). The code-only view shows LOOM does −9.0% there — significant.
  3. Components have tiny code sections (9 B, 33 B); after meld fuses adapter glue, the post-meld core is real code (90 B, 135 B) and LOOM gets −54.4% / −36.3%.
  4. wasm-opt wins at scale on the calculator core (−16.0% post-meld, where LOOM hit the timeout and produced no result), but LOOM dominates on small adapter-heavy modules. The crossover is somewhere between 135 B and 2.3 MB.
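The Δ% columns are plain relative change against the baseline. As a sketch of that arithmetic, the hypothetical helper below (`pct_delta` is a name chosen here, not one taken from the script) reproduces several of the reported figures. It prints an ASCII minus sign where the tables use −.

```bash
# Relative change of an optimized size against a baseline, in percent.
# pct_delta is a hypothetical helper used only for illustration.
pct_delta() {
  awk -v base="$1" -v opt="$2" 'BEGIN {
    # A zero baseline has no meaningful relative change.
    if (base == 0) { print "n/a"; exit }
    printf "%+.1f%%\n", (opt - base) * 100 / base
  }'
}

pct_delta 1941 1846        # gale, file size, LOOM      -> -4.9%
pct_delta 811 795          # gale, code section         -> -2.0%
pct_delta 90 41            # simple_component via meld  -> -54.4%
pct_delta 2294685 1927848  # calculator_root, wasm-opt  -> -16.0%
```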

Code changes

  • scripts/measure_corpus.sh: with_timeout wrapper, component magic-byte detection, meld fuse integration, fixed code_section_bytes (LEB128 bug), 15-field row schema with code-section columns, two report tables.
  • docs/measurements/v0.9.0-corpus-baseline.md: regenerated with real numbers.

🤖 Generated with Claude Code

…s (v0.9.1 PR-R)

Three additions and one bugfix, all surfaced during the 'as fair as
possible' comparison the user requested.

## 1. Component support via meld fusion

wasm-opt cannot process Component-Model components (errors on byte 0:
component magic differs from core wasm). `meld fuse` produces a single
core module from a component; that fused core is wasm-opt-ingestible.
Each component fixture now gets a meld-baseline row showing wasm-opt
and LOOM deltas relative to the meld output.
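
One way to tell the two formats apart before choosing a pipeline is to read the 8-byte preamble. The sketch below assumes the current binary encodings, in which core modules and components share the `\0asm` magic and differ in the four bytes that follow (`01 00 00 00` for a core module, `0d 00 01 00` for a component). `wasm_kind` is a hypothetical name; the actual detection in measure_corpus.sh may differ.

```bash
# Classify a file as a core module or a component from its first 8 bytes.
# wasm_kind is a hypothetical helper; the real script may check differently.
wasm_kind() {
  local header
  # First 8 bytes as lowercase hex with no separators.
  header=$(od -An -tx1 -N8 "$1" | tr -d ' \n')
  case "$header" in
    0061736d01000000) echo core ;;       # \0asm + core version 1
    0061736d0d000100) echo component ;;  # \0asm + component version/layer
    *)                echo unknown ;;
  esac
}
```

A harness could branch on the result, sending `component` inputs through meld before wasm-opt and passing `core` inputs straight through.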

Findings:
  simple_component  90 B → wasm-opt 90 (+0%) / LOOM 41 (-54.4%)
  calc_component   135 B → wasm-opt 135 (+0%) / LOOM 86 (-36.3%)
  calculator_root  2.29 MB → wasm-opt 1.93 MB (-16%) / LOOM timeout

LOOM dominates on the small adapter-heavy core forms; wasm-opt dominates
on the large calculator_root core (where LOOM hits the timeout).

## 2. Code-section measurement (the optimizer-relevant metric)

The pre-fix code_section_bytes returned 0 for every fixture because it
parsed 'wasm-tools dump' output where the code-section size is
LEB128-encoded ('0a ab 06' = 811 bytes), but the parser was grabbing
the first integer from the address column ('0x3ab' → '0'). Replaced
with awk over 'wasm-tools objdump' which prints decimal byte counts in
a clean pipe-separated table; sums across multi-module components.
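
A minimal sketch of the awk approach described above. The sample rows are made up, and the column layout (name, offset range, byte count, item count) is an assumption about the objdump table; `sum_code_bytes` is a hypothetical name, not the script's own function.

```bash
# Sum the byte counts of every "code" row in a pipe-separated section table.
# sum_code_bytes is a hypothetical helper; the column layout is assumed.
sum_code_bytes() {
  awk -F'|' '
    # Keep rows whose first column is exactly "code".
    $1 ~ /^[[:space:]]*code[[:space:]]*$/ {
      # Third column looks like "  811 bytes"; adding 0 keeps the number.
      sum += $3 + 0
    }
    END { print sum + 0 }
  '
}

# Illustrative input only: two code sections, as in a multi-module component.
sum_code_bytes <<'EOF'
  types | 0xa   - 0x11  |   7 bytes | 2 count
  code  | 0x3ae - 0x6d9 | 811 bytes | 3 count
  code  | 0x700 - 0x709 |   9 bytes | 1 count
EOF
# prints 820
```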

Now the report shows TWO tables:
- File size — total bytes incl. type/import/export/custom sections
- Code section — bytes the optimizer actually changes

The code view reveals what file size hid:
  gale:            811 → LOOM 795, wopt 795 (tied at -2.0%)
  calculator_root: 106017 → LOOM 96515 (-9.0%) wopt errors
  small components have 9-33 byte code sections (mostly adapter)

## 3. Per-run timeout (safety net)

with_timeout wrapper bounds each tool invocation to PER_RUN_TIMEOUT
(default 300s; CI can lower it). Built because three concurrent
'meld + LOOM optimize' invocations on the 2.3MB calculator melded
core hung for >60 minutes each before being killed by hand. Uses
gtimeout (coreutils on macOS) or timeout; runs without a bound if
neither is on PATH.
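
The wrapper might look roughly like this. It is a sketch of the behaviour described above, not the script's actual body; only the names with_timeout and PER_RUN_TIMEOUT come from this commit.

```bash
# Default bound in seconds; the environment (for example CI) can lower it.
PER_RUN_TIMEOUT="${PER_RUN_TIMEOUT:-300}"

# Run a command under a time bound when a timeout binary is available.
with_timeout() {
  if command -v gtimeout >/dev/null 2>&1; then
    gtimeout "$PER_RUN_TIMEOUT" "$@"
  elif command -v timeout >/dev/null 2>&1; then
    timeout "$PER_RUN_TIMEOUT" "$@"
  else
    # Neither binary found: run unbounded rather than fail the harness.
    "$@"
  fi
}

with_timeout echo "bounded run"
```

GNU timeout reports an expired bound with exit status 124, so a caller can tell a timeout apart from an ordinary tool failure and record it as such in the report.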

## 4. Bugfix: missing-fixture rows now match the new column schema

The two ROWS+= sites that insert n/a rows for missing fixtures were
updated to emit 15 fields (was 10) to match the new code-section
columns. Without this the n/a rows printed the fixture description in
the 'Baseline (code)' column.
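
A cheap guard against this kind of schema drift is to check the field count of every row before rendering. The sketch below assumes rows are single lines with a one-character separator; both the separator and the `check_row_fields` name are hypothetical, since the row encoding is not shown here.

```bash
# Fail if any input row does not have the expected number of fields.
# check_row_fields and the separator are assumptions for illustration.
check_row_fields() {
  local expected="$1" sep="$2"
  awk -F"$sep" -v n="$expected" '
    NF != n {
      printf "row %d has %d fields, expected %d\n", NR, NF, n
      bad = 1
    }
    END { exit bad }
  '
}

# A row with the right number of fields passes silently.
printf 'a|b|c\n' | check_row_fields 3 '|'
```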

Trace: REQ-3, REQ-14
@avrabe avrabe merged commit 0424b7e into main May 15, 2026
9 of 18 checks passed
@avrabe avrabe deleted the release/v0.9.1-pr-r-component-harness branch May 15, 2026 05:26