besu: opt-in Project Leyden AOT-cache image for benchmarking #387
Open: qu0b wants to merge 1 commit into master from qu0b/besu-aot-cache
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| # syntax=docker/dockerfile:1 | ||
| # | ||
| # Besu + Project Leyden AOT cache (benchmarking only). | ||
| # | ||
| # This is a thin derivative of one specific ethpandaops/besu build. It bakes in | ||
| # an AOT cache (besu.aot) that was generated FROM THIS EXACT image, so the JVM | ||
| # starts already warmed up instead of paying JIT/C2 warmup during a benchmark. | ||
| # | ||
| # Why this resolves the "chicken-and-egg": | ||
| # The Leyden AOT cache is validated against the besu classpath/jar, NOT against | ||
| # docker image layers. Because we FROM the precise image the cache was recorded | ||
| # against and only COPY in a data file, the jar is byte-identical and the cache | ||
| # stays valid. Adding the file does not create a "new code version" that would | ||
| # invalidate it. | ||
| # | ||
| # Built by besu/aot/generate-aot.sh — not intended to be built standalone. | ||
| ARG BASE_IMAGE | ||
| FROM ${BASE_IMAGE} | ||
|
|
||
| # Upstream besu runs as uid 1000 ("besu", WORKDIR /opt/besu). | ||
| COPY --chown=besu:besu besu.aot /opt/besu/aot/besu.aot | ||
|
|
||
| # Load the cache by default. Consumers may override BESU_OPTS. | ||
| # -Xlog:aot=info prints whether the cache loaded — handy in benchmark logs. | ||
| ENV BESU_OPTS="-XX:AOTCache=/opt/besu/aot/besu.aot -Xlog:aot=info" | ||
|
|
||
| LABEL io.ethpandaops.besu.aot="true" |
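
The Dockerfile comment notes that consumers may override `BESU_OPTS`. Doing so replaces the default value, so the `-XX:AOTCache=` flag is dropped unless the consumer adds it back. A minimal sketch of a helper that preserves it follows; `with_aot_cache` and `AOT_CACHE_PATH` are hypothetical names and not part of this PR.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): append the AOT cache flag to a
# caller-supplied BESU_OPTS value unless an -XX:AOTCache= flag is already set.

# Path the Dockerfile above copies the cache to.
AOT_CACHE_PATH="/opt/besu/aot/besu.aot"

with_aot_cache() {
  local opts="${1:-}"
  case " ${opts} " in
    # A cache flag is already present: leave the options untouched.
    *" -XX:AOTCache="*) printf '%s\n' "${opts}" ;;
    # Empty input: the cache flag is the only option.
    "  ")               printf '%s\n' "-XX:AOTCache=${AOT_CACHE_PATH}" ;;
    # Otherwise append the cache flag to the caller's options.
    *)                  printf '%s\n' "${opts} -XX:AOTCache=${AOT_CACHE_PATH}" ;;
  esac
}
```

A consumer could then pass `BESU_OPTS="$(with_aot_cache "-Xmx8g")"` to `docker run -e` instead of a bare `-Xmx8g`.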
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,70 @@ | ||
| # Besu AOT-cache image (benchmarking) | ||
|
|
||
| Produces `ethpandaops/besu:<tag>-aot`: the normal besu build plus a baked-in | ||
| [Project Leyden](https://openjdk.org/projects/leyden/) AOT cache, so the JVM | ||
| starts pre-warmed instead of paying JIT/C2 warmup during a benchmark run. | ||
|
|
||
| This exists because in benchmarkoor Besu is otherwise penalised for JVM warmup | ||
| versus native clients (reth/geth). The goal is to put Besu, at startup, at the | ||
| same warmup point a long-running mainnet node would be at. **This is for | ||
| benchmarking only — not a mainnet recommendation.** | ||
|
|
||
| Measured by the Besu team on bal-devnet-7 (a 109 MiB cache from a short run): | ||
| first block `32.9 → 159.5 Mgas/s`, warm block `154.7 → 233.8 Mgas/s`. | ||
|
|
||
| ## How it works | ||
|
|
||
| 1. `besu/build.sh` builds and pushes the normal image as today. | ||
| 2. When `BESU_BUILD_AOT=true`, it then calls `besu/aot/generate-aot.sh`, which: | ||
| - runs a container from the **just-built image** with | ||
| `BESU_OPTS=-XX:AOTCacheOutput=/aot/besu.aot` against a finite training | ||
| workload (default: `besu blocks import`). On normal JVM exit the cache is | ||
| written. | ||
| - builds `besu/aot/Dockerfile` — `FROM` that exact image — copying the cache | ||
| to `/opt/besu/aot/besu.aot` and defaulting | ||
| `BESU_OPTS=-XX:AOTCache=/opt/besu/aot/besu.aot`. | ||
| - pushes `<tag>-aot` (and the commit-pinned `<tag>-<sha>-aot`). | ||
|
|
||
| ### The chicken-and-egg, resolved | ||
|
|
||
| The Besu team's concern was that baking the cache into a new image creates a new | ||
| "version" that invalidates the cache. It does not: the Leyden cache is validated | ||
| against the **besu classpath/jar**, not against docker layers. Because the | ||
|
Review comment: Building Besu to generate a Docker image will generate new jars for some of the project's dependencies, e.g. evmTool (`besu-evmTool:26.5-develop-1612ec5`). So the version of the jar that Besu referenced during training will differ from the one used during execution. |
||
| derivative `FROM`s the precise image the cache was recorded against and only adds | ||
| a data file, the jar is byte-identical and the cache stays valid. Generation and | ||
| shipping use the same jar, so there is no version skew. | ||
|
|
||
| ## Caveats | ||
|
|
||
| - **Arch- and JDK-specific.** A cache is valid only for the CPU arch and JDK it | ||
| was recorded on. `generate-aot.sh` runs on the same per-platform CI runner as | ||
| the base build, so the arch matches. The multi-arch `manifest` job does **not** | ||
| currently stitch an `-aot` manifest — for now treat `<tag>-aot` as | ||
| per-platform (benchmarks run on amd64). Stitching can be added later if needed. | ||
| - **Training corpus is a real choice.** The cache only warms paths the workload | ||
| exercises. For bal-devnet-7 benchmarking, train on blocks representative of the | ||
| benchmark suites. Supply them via `BESU_AOT_BLOCKS` + `BESU_AOT_GENESIS`, or | ||
| take full control with `BESU_AOT_TRAIN_CMD`. With no training input the script | ||
| fails fast rather than shipping a useless cache. | ||
|
|
||
| ## Local usage | ||
|
|
||
| ```bash | ||
| export target_repository=ethpandaops/besu | ||
| export target_tag=bal-devnet-7 | ||
| export BESU_AOT_BLOCKS=/path/to/bal-devnet-7-blocks.rlp | ||
| export BESU_AOT_GENESIS=/path/to/bal-devnet-7-genesis.json | ||
| BESU_BUILD_AOT=true ./besu/build.sh # or call besu/aot/generate-aot.sh directly | ||
| ``` | ||
|
|
||
| To validate without publishing, build against an already-pulled base image and | ||
| skip the push: | ||
|
|
||
| ```bash | ||
| target_repository=ethpandaops/besu target_tag=bal-devnet-7 \ | ||
| BESU_AOT_PUSH=false BESU_AOT_TRAIN_CMD="--version" \ | ||
| ./besu/aot/generate-aot.sh | ||
| ``` | ||
|
|
||
| Consumed in `benchmarkoor-tests` via a `besu-bal-*-aot` instance that points | ||
| `image:` at `ethpandaops/besu:bal-devnet-7-aot`. | ||
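
The README's "chicken-and-egg" argument and the review comment on it both hinge on whether the jars seen at training time are identical to the jars in the shipped image. One way to check this empirically is to compare checksums of every jar in two extracted image filesystems. The sketch below assumes both trees have already been copied to local directories; `jar_manifest` and `jars_match` are hypothetical helpers, not part of this PR.

```bash
#!/usr/bin/env bash
# Hypothetical check (not part of this PR): compare every .jar under two
# directory trees by checksum, size and relative path.

# Print one "checksum size ./relative/path" line per jar, sorted by path.
# cksum is a POSIX CRC; swap in sha256sum if a cryptographic hash is wanted.
jar_manifest() {
  local dir="$1"
  ( cd -- "${dir}" && find . -type f -name '*.jar' -exec cksum {} + | sort -k3 )
}

# Succeed only when both trees contain the same jars with the same contents.
# Two trees with no jars at all also compare equal.
jars_match() {
  [ "$(jar_manifest "$1")" = "$(jar_manifest "$2")" ]
}
```

The two directories could be obtained, for example, with `docker create` followed by `docker cp` of `/opt/besu` from the base image and from the `-aot` image. Then `jars_match ./base ./aot && echo identical` reports whether the derivative image really left the jars untouched.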
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,111 @@ | ||
| #!/usr/bin/env bash | ||
| # | ||
| # Generate a Project Leyden AOT cache for a freshly-built besu image and bake it | ||
| # into a derivative `<tag>-aot` image. | ||
| # | ||
| # Opt-in: called from besu/build.sh only when BESU_BUILD_AOT=true. | ||
| # | ||
| # The cache is recorded with the JDK 25 single-step flow (JEP 514/515): | ||
| # BESU_OPTS=-XX:AOTCacheOutput=/aot/besu.aot | ||
| # Besu runs a finite, representative workload (default: `besu blocks import`) | ||
| # and the JVM writes the cache when it exits normally. Block import is finite | ||
| # and exits 0, so no graceful-stop dance is needed. | ||
| # | ||
| # IMPORTANT: an AOT cache is specific to BOTH the besu jar AND the CPU arch + | ||
| # JDK. The base image we FROM guarantees the jar; running on the same-platform | ||
| # CI runner guarantees the arch. Do not reuse a cache across platforms. | ||
| # | ||
| # Inputs (env), most provided by build.sh / CI: | ||
| # target_repository e.g. ethpandaops/besu | ||
| # target_tag e.g. bal-devnet-7-amd64 (per-platform tag) | ||
| # source_git_commit_hash commit of the besu source (for the pinned tag) | ||
| # BESU_AOT_BLOCKS REQUIRED unless BESU_AOT_TRAIN_CMD is set: host path | ||
| # to an RLP block file, mounted at /training/blocks.rlp. | ||
| # BESU_AOT_GENESIS REQUIRED with BESU_AOT_BLOCKS: genesis.json matching | ||
| # those blocks, mounted at /training/genesis.json. | ||
| # BESU_AOT_TRAIN_CMD Override the besu args that drive training. When set, | ||
| # BESU_AOT_BLOCKS/GENESIS are not required and no | ||
| # training volumes are mounted (you manage data). | ||
| # BESU_AOT_TIMEOUT Hard cap (seconds) on the training run. Default 1800. | ||
| # BESU_AOT_PUSH Push the resulting image(s). Default true. Set to | ||
| # false for local validation (build only, no push). | ||
| set -euo pipefail | ||
|
|
||
| SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) | ||
|
|
||
| : "${target_repository:?target_repository must be set (see besu/build.sh)}" | ||
| : "${target_tag:?target_tag must be set (see besu/build.sh)}" | ||
| BESU_AOT_TIMEOUT="${BESU_AOT_TIMEOUT:-1800}" | ||
|
|
||
| base="${target_repository}:${target_tag}" | ||
| aot_dir="$(mktemp -d)" | ||
| trap 'rm -rf "${aot_dir}"' EXIT | ||
|
|
||
| echo "==> Generating AOT cache from ${base}" | ||
|
|
||
| # Assemble the training run. -XX:AOTCacheOutput makes the JVM dump the cache to | ||
| # /aot/besu.aot on normal exit. | ||
| train_env=( -e "BESU_OPTS=-XX:AOTCacheOutput=/aot/besu.aot -Xlog:aot=info" ) | ||
| train_mounts=( -v "${aot_dir}:/aot" ) | ||
|
|
||
| if [ -n "${BESU_AOT_TRAIN_CMD:-}" ]; then | ||
| # Caller fully controls the besu invocation. | ||
| read -r -a train_cmd <<< "${BESU_AOT_TRAIN_CMD}" | ||
| else | ||
| : "${BESU_AOT_BLOCKS:?set BESU_AOT_BLOCKS (RLP block file) or BESU_AOT_TRAIN_CMD}" | ||
| : "${BESU_AOT_GENESIS:?set BESU_AOT_GENESIS (genesis.json) or BESU_AOT_TRAIN_CMD}" | ||
| [ -f "${BESU_AOT_BLOCKS}" ] || { echo "BESU_AOT_BLOCKS not found: ${BESU_AOT_BLOCKS}" >&2; exit 1; } | ||
| [ -f "${BESU_AOT_GENESIS}" ] || { echo "BESU_AOT_GENESIS not found: ${BESU_AOT_GENESIS}" >&2; exit 1; } | ||
| train_mounts+=( | ||
| -v "${BESU_AOT_BLOCKS}:/training/blocks.rlp:ro" | ||
| -v "${BESU_AOT_GENESIS}:/training/genesis.json:ro" | ||
| ) | ||
| train_cmd=( | ||
| --data-path=/tmp/besu-aot | ||
| --genesis-file=/training/genesis.json | ||
| blocks import --from=/training/blocks.rlp | ||
| ) | ||
| fi | ||
|
|
||
| echo "==> Training: besu ${train_cmd[*]}" | ||
| timeout "${BESU_AOT_TIMEOUT}" docker run --rm \ | ||
| "${train_env[@]}" \ | ||
| "${train_mounts[@]}" \ | ||
| --entrypoint besu \ | ||
| "${base}" \ | ||
| "${train_cmd[@]}" | ||
|
|
||
| if [ ! -s "${aot_dir}/besu.aot" ]; then | ||
| echo "Error: AOT cache was not produced at ${aot_dir}/besu.aot" >&2 | ||
| echo " Check the -Xlog:aot=info output above; the JVM must exit normally." >&2 | ||
| exit 1 | ||
| fi | ||
| echo "==> AOT cache: $(du -h "${aot_dir}/besu.aot" | cut -f1)" | ||
|
|
||
| # Bake it into the derivative image (build context = besu/aot). | ||
| # Register cleanup before copying so a partially copied file is removed too. | ||
| trap 'rm -rf "${aot_dir}"; rm -f "${SCRIPT_DIR}/besu.aot"' EXIT | ||
| cp "${aot_dir}/besu.aot" "${SCRIPT_DIR}/besu.aot" | ||
|
|
||
| aot_tag="${base}-aot" | ||
| echo "==> Building ${aot_tag}" | ||
| docker build \ | ||
| --build-arg "BASE_IMAGE=${base}" \ | ||
| -t "${aot_tag}" \ | ||
| -f "${SCRIPT_DIR}/Dockerfile" \ | ||
| "${SCRIPT_DIR}" | ||
|
|
||
| if [ "${BESU_AOT_PUSH:-true}" != "true" ]; then | ||
| echo "==> BESU_AOT_PUSH=${BESU_AOT_PUSH}; built ${aot_tag} but skipping push" | ||
| exit 0 | ||
| fi | ||
|
|
||
| docker push "${aot_tag}" | ||
|
|
||
| # Commit-pinned tag, mirroring build.sh's convention. | ||
| if [ -n "${source_git_commit_hash:-}" ]; then | ||
| pinned="${target_repository}:${target_tag}-${source_git_commit_hash}-aot" | ||
| docker tag "${aot_tag}" "${pinned}" | ||
| docker push "${pinned}" | ||
| fi | ||
|
|
||
| echo "==> Done: ${aot_tag}" |
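
The script relies on the training command terminating on its own and caps it with `timeout`. Under `set -e`, a run that hits the cap aborts the script without saying why. A minimal sketch of a wrapper that distinguishes the cases follows, assuming GNU `timeout` (which exits with 124 when the time limit is reached); `run_training` is a hypothetical name and not part of this PR.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not part of this PR): run a command under a hard time
# cap and report whether it finished on its own, hit the cap, or failed.
run_training() {
  local cap="$1"; shift
  local rc=0
  # GNU timeout exits with 124 when the command is stopped at the cap.
  timeout "${cap}" "$@" || rc=$?
  case "${rc}" in
    0)   echo "training exited normally" ;;
    124) echo "training hit the ${cap}s cap and was terminated" >&2 ;;
    *)   echo "training failed with exit code ${rc}" >&2 ;;
  esac
  return "${rc}"
}
```

In the script this could wrap the existing `docker run` invocation, e.g. `run_training "${BESU_AOT_TIMEOUT}" docker run --rm ...`, leaving the later check for a non-empty cache file as the final guard.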
Review comment: What is a finite training workload?