27 changes: 27 additions & 0 deletions besu/aot/Dockerfile
@@ -0,0 +1,27 @@
# syntax=docker/dockerfile:1
#
# Besu + Project Leyden AOT cache (benchmarking only).
#
# This is a thin derivative of one specific ethpandaops/besu build. It bakes in
# an AOT cache (besu.aot) that was generated FROM THIS EXACT image, so the JVM
# starts already warmed up instead of paying JIT/C2 warmup during a benchmark.
#
# Why this resolves the "chicken-and-egg":
# The Leyden AOT cache is validated against the besu classpath/jar, NOT against
# docker image layers. Because we FROM the precise image the cache was recorded
# against and only COPY in a data file, the jar is byte-identical and the cache
# stays valid. Adding the file does not create a "new code version" that would
# invalidate it.
#
# Built by besu/aot/generate-aot.sh — not intended to be built standalone.
ARG BASE_IMAGE
FROM ${BASE_IMAGE}

# Upstream besu runs as uid 1000 ("besu", WORKDIR /opt/besu).
COPY --chown=besu:besu besu.aot /opt/besu/aot/besu.aot

# Load the cache by default. Consumers may override BESU_OPTS.
# -Xlog:aot=info prints whether the cache loaded — handy in benchmark logs.
ENV BESU_OPTS="-XX:AOTCache=/opt/besu/aot/besu.aot -Xlog:aot=info"

LABEL io.ethpandaops.besu.aot="true"
70 changes: 70 additions & 0 deletions besu/aot/README.md
@@ -0,0 +1,70 @@
# Besu AOT-cache image (benchmarking)

Produces `ethpandaops/besu:<tag>-aot`: the normal besu build plus a baked-in
[Project Leyden](https://openjdk.org/projects/leyden/) AOT cache, so the JVM
starts pre-warmed instead of paying JIT/C2 warmup during a benchmark run.

This exists because in benchmarkoor Besu is otherwise penalised for JVM warmup
versus native clients (reth/geth). The goal is to put Besu, at startup, at the
same warmup point a long-running mainnet node would be at. **This is for
benchmarking only — not a mainnet recommendation.**

Measured by the Besu team on bal-devnet-7 (a 109 MiB cache from a short run):
first block `32.9 → 159.5 Mgas/s`, warm block `154.7 → 233.8 Mgas/s`.

## How it works

1. `besu/build.sh` builds and pushes the normal image as today.
2. When `BESU_BUILD_AOT=true`, it then calls `besu/aot/generate-aot.sh`, which:
   - runs a container from the **just-built image** with
     `BESU_OPTS=-XX:AOTCacheOutput=/aot/besu.aot` against a finite training

Review comment: What is a finite training workload?

     workload, i.e. one that terminates on its own (default:
     `besu blocks import`). On normal JVM exit the cache is written.
   - builds `besu/aot/Dockerfile` (`FROM` that exact image), copying the cache
     to `/opt/besu/aot/besu.aot` and defaulting
     `BESU_OPTS=-XX:AOTCache=/opt/besu/aot/besu.aot`.
   - pushes `<tag>-aot` (and the commit-pinned `<tag>-<sha>-aot`).
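
The tag naming in the last step can be sketched as a small helper. This is an illustrative sketch only: `aot_tags` is a hypothetical function, not part of `generate-aot.sh`, and the commit hash in the example call is a placeholder.

```bash
#!/usr/bin/env bash
set -euo pipefail

# aot_tags REPO TAG [SHA]
# Print the image references the AOT step is described as pushing:
#   REPO:TAG-aot        always
#   REPO:TAG-SHA-aot    only when a commit hash is supplied
# (hypothetical helper; mirrors the naming described above)
aot_tags() {
  local repo="$1" tag="$2" sha="${3:-}"
  printf '%s\n' "${repo}:${tag}-aot"
  if [ -n "${sha}" ]; then
    printf '%s\n' "${repo}:${tag}-${sha}-aot"
  fi
}

# Example call with a placeholder commit hash.
aot_tags ethpandaops/besu bal-devnet-7 1612ec5
```

The example call prints `ethpandaops/besu:bal-devnet-7-aot` followed by `ethpandaops/besu:bal-devnet-7-1612ec5-aot`.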

### The chicken-and-egg, resolved

The Besu team's concern was that baking the cache into a new image creates a new
"version" that invalidates the cache. It does not: the Leyden cache is validated
against the **besu classpath/jar**, not against docker layers. Because the

Review comment: Building Besu to produce a Docker image generates new jars for some of the project's dependencies, e.g. evmTool: besu-evmTool:26.5-develop-1612ec5. The jar version that Besu referenced during training will therefore differ from the one used during execution.

derivative `FROM`s the precise image the cache was recorded against and only adds
a data file, the jar is byte-identical and the cache stays valid. Generation and
shipping use the same jar, so there is no version skew.
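
The byte-identical claim can be checked empirically by comparing jar checksums between two unpacked copies of `/opt/besu`. The sketch below is an assumption-laden illustration, not part of this change: `jar_manifest` and `same_jars` are hypothetical helpers, and it assumes GNU `find`, `sort`, `xargs` and `sha256sum` are available.

```bash
#!/usr/bin/env bash
set -euo pipefail

# jar_manifest DIR
# Print "<sha256>  <relative path>" for every *.jar under DIR, sorted by path.
# (hypothetical helper)
jar_manifest() {
  local dir="$1"
  ( cd "${dir}" && find . -type f -name '*.jar' -print0 | sort -z | xargs -0 -r sha256sum )
}

# same_jars DIR_A DIR_B
# Exit 0 only when both trees contain at least one jar and every jar matches
# by path and checksum. Exit 2 when a directory cannot be read.
# (hypothetical helper)
same_jars() {
  local left right
  left="$(jar_manifest "$1")" || return 2
  right="$(jar_manifest "$2")" || return 2
  [ -n "${left}" ] && [ "${left}" = "${right}" ]
}
```

One possible workflow, again an assumption: copy `/opt/besu` out of the base image and out of the `-aot` image with `docker create` plus `docker cp`, then run `same_jars ./base ./aot`. A non-zero exit would indicate the jar skew the review comment describes.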

## Caveats

- **Arch- and JDK-specific.** A cache is valid only for the CPU arch and JDK it
was recorded on. `generate-aot.sh` runs on the same per-platform CI runner as
the base build, so the arch matches. The multi-arch `manifest` job does **not**
currently stitch an `-aot` manifest — for now treat `<tag>-aot` as
per-platform (benchmarks run on amd64). Stitching can be added later if needed.
- **Training corpus is a real choice.** The cache only warms paths the workload
exercises. For bal-devnet-7 benchmarking, train on blocks representative of the
benchmark suites. Supply them via `BESU_AOT_BLOCKS` + `BESU_AOT_GENESIS`, or
take full control with `BESU_AOT_TRAIN_CMD`. With no training input the script
fails fast rather than shipping a useless cache.
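
The choice between the two training inputs, and the fail-fast behaviour, can be sketched as follows. `training_mode` is a hypothetical helper written for illustration; the actual script performs the equivalent checks inline and also verifies that the files exist.

```bash
#!/usr/bin/env bash
set -euo pipefail

# training_mode
# Print which training input would be used, based on the environment:
#   custom  BESU_AOT_TRAIN_CMD is set (caller controls the besu args)
#   blocks  BESU_AOT_BLOCKS and BESU_AOT_GENESIS are both set
# Return 1 with a message on stderr when neither applies.
# (hypothetical helper)
training_mode() {
  if [ -n "${BESU_AOT_TRAIN_CMD:-}" ]; then
    echo "custom"
  elif [ -n "${BESU_AOT_BLOCKS:-}" ] && [ -n "${BESU_AOT_GENESIS:-}" ]; then
    echo "blocks"
  else
    echo "no training input: set BESU_AOT_BLOCKS and BESU_AOT_GENESIS, or BESU_AOT_TRAIN_CMD" >&2
    return 1
  fi
}
```

`BESU_AOT_TRAIN_CMD` takes precedence in this sketch, matching the order in which the script checks its inputs.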

## Local usage

```bash
export target_repository=ethpandaops/besu
export target_tag=bal-devnet-7
export BESU_AOT_BLOCKS=/path/to/bal-devnet-7-blocks.rlp
export BESU_AOT_GENESIS=/path/to/bal-devnet-7-genesis.json
BESU_BUILD_AOT=true ./besu/build.sh # or call besu/aot/generate-aot.sh directly
```

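To validate the pipeline without publishing, build against an already-pulled
base image and skip the push. `--version` is only a smoke-test workload: the
resulting cache covers little beyond startup and is not meant for benchmarking.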

```bash
target_repository=ethpandaops/besu target_tag=bal-devnet-7 \
BESU_AOT_PUSH=false BESU_AOT_TRAIN_CMD="--version" \
./besu/aot/generate-aot.sh
```

Consumed in `benchmarkoor-tests` via a `besu-bal-*-aot` instance that points
`image:` at `ethpandaops/besu:bal-devnet-7-aot`.
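
Because the image sets `BESU_OPTS` as a default, a consumer that overrides `BESU_OPTS` replaces the `-XX:AOTCache` flag along with it. A minimal sketch of a guard against that, assuming the cache path from the Dockerfile in this change; `ensure_aot_opts` is a hypothetical helper and `-Xmx8g` is just an example of an extra JVM option:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Cache location baked in by besu/aot/Dockerfile.
AOT_FLAG="-XX:AOTCache=/opt/besu/aot/besu.aot"

# ensure_aot_opts "EXTRA JVM OPTS"
# Print a BESU_OPTS value that keeps the AOT cache enabled: if the caller's
# options already carry an -XX:AOTCache=... flag they are returned unchanged,
# otherwise the default flag is prepended.
# (hypothetical helper)
ensure_aot_opts() {
  local opts="${1:-}"
  case " ${opts} " in
    *" -XX:AOTCache="*) printf '%s\n' "${opts}" ;;
    *) printf '%s\n' "${AOT_FLAG}${opts:+ ${opts}}" ;;
  esac
}

# Example: add a heap setting without losing the cache flag.
ensure_aot_opts "-Xmx8g"
```

The example call prints `-XX:AOTCache=/opt/besu/aot/besu.aot -Xmx8g`.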
111 changes: 111 additions & 0 deletions besu/aot/generate-aot.sh
@@ -0,0 +1,111 @@
#!/usr/bin/env bash
#
# Generate a Project Leyden AOT cache for a freshly-built besu image and bake it
# into a derivative `<tag>-aot` image.
#
# Opt-in: called from besu/build.sh only when BESU_BUILD_AOT=true.
#
# The cache is recorded with the JDK 25 single-step flow (JEP 514/515):
# BESU_OPTS=-XX:AOTCacheOutput=/aot/besu.aot
# Besu runs a finite, representative workload (default: `besu blocks import`)
# and the JVM writes the cache when it exits normally. Block import is finite
# and exits 0, so no graceful-stop dance is needed.
#
# IMPORTANT: an AOT cache is specific to BOTH the besu jar AND the CPU arch +
# JDK. The base image we FROM guarantees the jar; running on the same-platform
# CI runner guarantees the arch. Do not reuse a cache across platforms.
#
# Inputs (env), most provided by build.sh / CI:
# target_repository e.g. ethpandaops/besu
# target_tag e.g. bal-devnet-7-amd64 (per-platform tag)
# source_git_commit_hash commit of the besu source (for the pinned tag)
# BESU_AOT_BLOCKS REQUIRED unless BESU_AOT_TRAIN_CMD is set: host path
# to an RLP block file, mounted at /training/blocks.rlp.
# BESU_AOT_GENESIS REQUIRED with BESU_AOT_BLOCKS: genesis.json matching
# those blocks, mounted at /training/genesis.json.
# BESU_AOT_TRAIN_CMD Override the besu args that drive training. When set,
# BESU_AOT_BLOCKS/GENESIS are not required and no
# training volumes are mounted (you manage data).
# BESU_AOT_TIMEOUT Hard cap (seconds) on the training run. Default 1800.
# BESU_AOT_PUSH Push the resulting image(s). Default true. Set to
# false for local validation (build only, no push).
set -euo pipefail

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )

: "${target_repository:?target_repository must be set (see besu/build.sh)}"
: "${target_tag:?target_tag must be set (see besu/build.sh)}"
BESU_AOT_TIMEOUT="${BESU_AOT_TIMEOUT:-1800}"

base="${target_repository}:${target_tag}"
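aot_dir="$(mktemp -d)"
# mktemp -d creates the directory with mode 0700, owned by the invoking user.
# The training container runs as the image's besu user (uid 1000), so open the
# directory up; otherwise the JVM may be unable to write /aot/besu.aot.
chmod 0777 "${aot_dir}"
trap 'rm -rf "${aot_dir}"' EXIT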

echo "==> Generating AOT cache from ${base}"

# Assemble the training run. -XX:AOTCacheOutput makes the JVM dump the cache to
# /aot/besu.aot on normal exit.
train_env=( -e "BESU_OPTS=-XX:AOTCacheOutput=/aot/besu.aot -Xlog:aot=info" )
train_mounts=( -v "${aot_dir}:/aot" )

if [ -n "${BESU_AOT_TRAIN_CMD:-}" ]; then
  # Caller fully controls the besu invocation. The string is split on
  # whitespace only; quoting inside BESU_AOT_TRAIN_CMD is not interpreted.
  read -r -a train_cmd <<< "${BESU_AOT_TRAIN_CMD}"
else
  : "${BESU_AOT_BLOCKS:?set BESU_AOT_BLOCKS (RLP block file) or BESU_AOT_TRAIN_CMD}"
  : "${BESU_AOT_GENESIS:?set BESU_AOT_GENESIS (genesis.json) or BESU_AOT_TRAIN_CMD}"
  [ -f "${BESU_AOT_BLOCKS}" ] || { echo "BESU_AOT_BLOCKS not found: ${BESU_AOT_BLOCKS}" >&2; exit 1; }
  [ -f "${BESU_AOT_GENESIS}" ] || { echo "BESU_AOT_GENESIS not found: ${BESU_AOT_GENESIS}" >&2; exit 1; }
  train_mounts+=(
    -v "${BESU_AOT_BLOCKS}:/training/blocks.rlp:ro"
    -v "${BESU_AOT_GENESIS}:/training/genesis.json:ro"
  )
  train_cmd=(
    --data-path=/tmp/besu-aot
    --genesis-file=/training/genesis.json
    blocks import --from=/training/blocks.rlp
  )
fi

echo "==> Training: besu ${train_cmd[*]}"
timeout "${BESU_AOT_TIMEOUT}" docker run --rm \
  "${train_env[@]}" \
  "${train_mounts[@]}" \
  --entrypoint besu \
  "${base}" \
  "${train_cmd[@]}"

if [ ! -s "${aot_dir}/besu.aot" ]; then
  echo "Error: AOT cache was not produced at ${aot_dir}/besu.aot" >&2
  echo "       Check the -Xlog:aot=info output above; the JVM must exit normally." >&2
  exit 1
fi
echo "==> AOT cache: $(du -h "${aot_dir}/besu.aot" | cut -f1)"

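# Bake it into the derivative image (build context = besu/aot).
# Install the cleanup trap before copying so an interrupted cp leaves no stray
# besu.aot in the build context.
trap 'rm -rf "${aot_dir}"; rm -f "${SCRIPT_DIR}/besu.aot"' EXIT
cp "${aot_dir}/besu.aot" "${SCRIPT_DIR}/besu.aot"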

aot_tag="${base}-aot"
echo "==> Building ${aot_tag}"
docker build \
  --build-arg "BASE_IMAGE=${base}" \
  -t "${aot_tag}" \
  -f "${SCRIPT_DIR}/Dockerfile" \
  "${SCRIPT_DIR}"

if [ "${BESU_AOT_PUSH:-true}" != "true" ]; then
  echo "==> BESU_AOT_PUSH=${BESU_AOT_PUSH}; built ${aot_tag} but skipping push"
  exit 0
fi

docker push "${aot_tag}"

# Commit-pinned tag, mirroring build.sh's convention.
if [ -n "${source_git_commit_hash:-}" ]; then
  pinned="${target_repository}:${target_tag}-${source_git_commit_hash}-aot"
  docker tag "${aot_tag}" "${pinned}"
  docker push "${pinned}"
fi

echo "==> Done: ${aot_tag}"
7 changes: 7 additions & 0 deletions besu/build.sh
@@ -71,3 +71,10 @@ docker tag "${tag}" "${target_repository}:${target_tag}"
docker push "${target_repository}:${target_tag}"
docker tag "${tag}" "${target_repository}:${target_tag}-${source_git_commit_hash}"
docker push "${target_repository}:${target_tag}-${source_git_commit_hash}"

# Optionally build a Project Leyden AOT-cache variant (benchmarking only).
# Opt-in via BESU_BUILD_AOT=true. See besu/aot/README.md.
if [ "${BESU_BUILD_AOT:-false}" = "true" ]; then
  echo "BESU_BUILD_AOT=true: generating AOT-cache image"
  "${SCRIPT_DIR}/aot/generate-aot.sh"
fi