besu: opt-in Project Leyden AOT-cache image for benchmarking #387
Adds besu/aot/ which, when BESU_BUILD_AOT=true, generates a Leyden AOT cache from the freshly-built besu image and bakes it into a derivative ethpandaops/besu:<tag>-aot image. The JVM then starts pre-warmed, so Besu isn't penalised for JIT/C2 warmup in benchmarkoor runs. The derivative FROMs the exact image the cache was recorded against and only COPYs in a data file, so the jar is byte-identical and the cache stays valid -- this resolves the version chicken-and-egg the Besu team flagged. Default builds are unchanged (opt-in). Training corpus is supplied via BESU_AOT_BLOCKS/BESU_AOT_GENESIS (or a full BESU_AOT_TRAIN_CMD override); the script fails fast without it.
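The opt-in gate and the `-aot` tag derivation described above can be sketched as follows. This is a dry-run illustration, not the PR's actual script: the function names `aot_tag`, `aot_enabled`, and `aot_plan` are hypothetical, and the plan only prints the two steps instead of invoking docker.

```shell
# Sketch only: prints the intended steps instead of running docker.
# Function names are hypothetical and not taken from the PR.

# Derive the derivative tag: <image>:<tag> -> <image>:<tag>-aot
aot_tag() {
  printf '%s-aot\n' "$1"
}

# Opt-in gate: only the literal value "true" enables the AOT build.
aot_enabled() {
  [ "${BESU_BUILD_AOT:-false}" = "true" ]
}

# Print the plan for a given base image reference.
aot_plan() {
  base="$1"
  if ! aot_enabled; then
    echo "skip: BESU_BUILD_AOT is not true; default build unchanged"
    return 0
  fi
  echo "record: run ${base} with BESU_OPTS=-XX:AOTCacheOutput=/aot/besu.aot against the training workload"
  echo "bake: build besu/aot/Dockerfile FROM ${base}, tag $(aot_tag "$base")"
}
```

Calling `aot_plan ethpandaops/besu:bal-devnet-7` with `BESU_BUILD_AOT` unset prints only the skip line, which mirrors the "default builds are unchanged" guarantee.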
🤖 qu0b-reviewer

The code is clean. The trap correctly accumulates cleanup (both the temp dir and the transient […]

Summary

PR #387 adds opt-in Project Leyden AOT-cache support for Besu, targeting benchmarking scenarios where JVM warmup penalizes Besu vs. native clients. It introduces: […]

The implementation is correct: the Dockerfile […]

One pre-integration concern: the manifest job silently skips […]

Suggestions
1. `besu/build.sh` builds and pushes the normal image as today.
2. When `BESU_BUILD_AOT=true`, it then calls `besu/aot/generate-aot.sh`, which:
   - runs a container from the **just-built image** with
     `BESU_OPTS=-XX:AOTCacheOutput=/aot/besu.aot` against a finite training

The Besu team's concern was that baking the cache into a new image creates a new
"version" that invalidates the cache. It does not: the Leyden cache is validated
against the **besu classpath/jar**, not against docker layers. Because the
Building Besu to produce a Docker image also generates new jars for some of the project's dependencies, e.g. evmTool: `besu-evmTool:26.5-develop-1612ec5`. So the version of the jar that Besu referenced during training will differ from the one used during execution.
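One way to test this concern empirically is to compare jar checksums between the image used for training and the image used at runtime. The sketch below assumes the jar directories of both images have already been copied to local directories and that GNU `sha256sum` is available; the function names `jar_digests` and `jars_identical` are hypothetical.

```shell
# Sketch only: compares the jars under two local directories by checksum.
# Assumes GNU coreutils `sha256sum`; directory layout is an assumption.

# Print "<sha256>  <relative path>" for every jar under $1, sorted by path.
jar_digests() {
  (
    cd "$1" || exit 1
    find . -type f -name '*.jar' | LC_ALL=C sort | while IFS= read -r f; do
      sha256sum "$f"
    done
  )
}

# Succeed only if both directories contain the same jars with the same bytes.
# Note: two directories with no jars at all also compare as identical.
jars_identical() {
  [ "$(jar_digests "$1")" = "$(jar_digests "$2")" ]
}
```

If `jars_identical training-lib runtime-lib` fails, diffing the two `jar_digests` outputs shows which jars changed between training and execution.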
What

Adds `besu/aot/` which, when `BESU_BUILD_AOT=true`, generates a Project Leyden AOT cache from the freshly-built besu image and bakes it into a derivative `ethpandaops/besu:<tag>-aot` image. The JVM then starts pre-warmed, so Besu isn't penalised for JIT/C2 warmup in benchmarkoor runs.

- `besu/aot/generate-aot.sh`: runs a container from the just-built image with `BESU_OPTS=-XX:AOTCacheOutput=…` against a finite training workload (default `besu blocks import`), then builds + pushes the `-aot` image.
- `besu/aot/Dockerfile`: `FROM` that exact image, `COPY`s the cache to `/opt/besu/aot/besu.aot`, defaults `BESU_OPTS=-XX:AOTCache=…`.
- `besu/build.sh`: calls it only when `BESU_BUILD_AOT=true`. Default builds are unchanged (opt-in).

Why this resolves the version chicken-and-egg
The Besu team's concern was that baking the cache into a new image creates a new "version" that invalidates it. It doesn't: the Leyden cache is validated against the besu classpath/jar, not docker layers. The derivative `FROM`s the precise image the cache was recorded against and only adds a data file, so the jar is byte-identical and the cache stays valid.

Measured by the Besu team on bal-devnet-7: first block 32.9 → 159.5 Mgas/s, warm block 154.7 → 233.8 Mgas/s. Benchmarking only, not a mainnet recommendation.

Validation
Tested end-to-end on amd64 (Temurin 25.0.3):

- `-XX:AOTCacheOutput` records a cache from the published `ethpandaops/besu:bal-devnet-7`.
- `COPY --chown=besu:besu` OK.
- `Opened AOT cache /opt/besu/aot/besu.aot` → `Using AOT-linked classes: true`.
- `generate-aot.sh` run end-to-end with `BESU_AOT_PUSH=false` (build-only) → rc 0.

This validates the plumbing; warmup quality depends on a representative training corpus.
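The runtime check in the list above could be automated roughly like this. It is a minimal sketch that greps a captured JVM log for the two markers quoted in the validation notes. How the log is captured (logging flags, file location) is left as an assumption, and `aot_cache_used` is a hypothetical helper name.

```shell
# Sketch only: checks a captured JVM log file for the two markers quoted in
# the validation notes. Producing the log is outside this sketch.
aot_cache_used() {
  log="$1"
  grep -q 'Opened AOT cache' "$log" &&
    grep -q 'Using AOT-linked classes: true' "$log"
}
```

A smoke test could call `aot_cache_used besu-startup.log` (file name hypothetical) and fail the job when it returns non-zero, which would catch a cache that was silently rejected at startup.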
Known follow-ups (intentionally out of scope)

- […] `BESU_BUILD_AOT` or pass arbitrary env, so the `-aot` image is only produced by a manual build or a later CI change. Opt-in/default-off keeps CI safe today.
- Training corpus is supplied via `BESU_AOT_BLOCKS` + `BESU_AOT_GENESIS`, or `BESU_AOT_TRAIN_CMD`. Script fails fast without it.
- The `manifest` job doesn't stitch an `-aot` manifest; treat `<tag>-aot` as amd64 (benchmarks run amd64).

Consumed in benchmarkoor-tests via a `besu-bal-*-aot` instance (separate PR).
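The fail-fast corpus selection could look roughly like the sketch below. It assumes that a full `BESU_AOT_TRAIN_CMD` override takes precedence over the blocks/genesis pair, and the exact `besu blocks import` arguments shown are an assumption rather than the command used by `generate-aot.sh`. The function name `aot_train_cmd` is hypothetical.

```shell
# Sketch only: chooses the training command from the environment and fails
# fast when no corpus is configured. The besu arguments are an assumption.
aot_train_cmd() {
  if [ -n "${BESU_AOT_TRAIN_CMD:-}" ]; then
    # A full override wins over the blocks/genesis pair.
    printf '%s\n' "$BESU_AOT_TRAIN_CMD"
  elif [ -n "${BESU_AOT_BLOCKS:-}" ] && [ -n "${BESU_AOT_GENESIS:-}" ]; then
    printf 'besu --genesis-file=%s blocks import --from=%s\n' \
      "$BESU_AOT_GENESIS" "$BESU_AOT_BLOCKS"
  else
    echo "error: set BESU_AOT_BLOCKS and BESU_AOT_GENESIS, or BESU_AOT_TRAIN_CMD" >&2
    return 1
  fi
}
```

Returning non-zero before any container is started keeps a misconfigured build from producing an `-aot` image with an empty or unrepresentative cache.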