diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000..3812ec1
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,286 @@
+# Changelog
+
+All notable user-facing and developer-facing changes to iris.
+
+## [Unreleased] — 2026-05-03
+
+The headline of this release is a complete snapshot/rollback stack: capture
+the full machine state to disk, restore it, roll back inside a session, ship
+snapshots between machines over HTTP, and validate that any of the above
+produces deterministic results.
+
+### Added
+
+#### Snapshot system
+
+- **Save/restore/rollback** (`save_snapshot` / `load_snapshot` /
+  `ci_restore` / `ci_rollback` on `Machine`). Captures CPU, MC, IOC, HPC3,
+  REX3, RTC, EEPROM, SCSI, Seeq, and all RAM banks plus the COW disk
+  overlay. Snapshots live under `saves/<name>/`.
+- **In-memory rollback checkpoint** (Phase 2.1): `ci_rollback` skips disk
+  by replaying a cached `RollbackCheckpoint` taken at the last `ci_restore`.
+  Measured ~42 ms per rollback on M2 vs 145–213 ms for the disk path.
+- **Reflink overlay capture** (Phase 1.3): on APFS / btrfs / xfs, snapshot
+  copies of multi-GB COW overlays use `clonefile(2)` / `FICLONE` and consume
+  ~18 MB actual disk for a 4 GB apparent overlay.
+- **Auto-fork-on-restore** (Phase 2.3): `ci_restore` captures the overlay's
+  dirty-sector set so the running session can mutate the disk without
+  poisoning the parent snapshot.
+- **Scratch SCSI volume** (Phase 2.4): a host-controlled raw block device
+  for file injection/extraction without networking. Configure with
+  `scratch = true` in `iris.toml`; iris pre-formats it with a minimal SGI
+  Volume Header so IRIX surfaces it as `/dev/rdsk/dks0dNs0`. CI commands
+  `scratch-write` / `scratch-read` / `scratch-clear` / `scratch-info`. New
+  module `src/sgi_vh.rs`.
+- **Content-addressable chunked RAM** (Phase 3.1): each RAM bank and
+  framebuffer is split into 64 KB chunks, BLAKE3-hashed, stored once under
+  `saves/.cas/`.
Snapshots reference chunks by hash; identical chunks
+  across snapshots share storage. A second snapshot of an unchanged
+  machine adds **zero bytes** to disk. New module `src/chunk_store.rs`.
+- **Snapshot determinism validator** (Phase 3.3): `validate <snapshot>
+  [<n_instructions>]` loads the snapshot twice with peripheral threads
+  stopped, steps each pass `n_instructions` times in-line, and diffs the
+  resulting CPU register digests. 1M instructions in 265 ms. Surfaces
+  `load_state` field omissions, host-wallclock leakage at load time, and
+  unrestored TLB/cache structures. New module `src/validate.rs`.
+- **Snapshot library commands** (Phase 3.2):
+  - `tree` — render snapshot parent-chain hierarchy
+  - `diff <a> <b>` — per-device, per-RAM-chunk, per-COW-sector delta
+  - `gc` — sweep CAS chunks not referenced by any kept snapshot
+- **HTTP snapshot registry** (Phase 3.4): `pull <url> <name>` and `push
+  <url> <name>` ship snapshots between machines. URL layout mirrors disk
+  layout, so any static HTTP server (`python3 -m http.server` against
+  `saves/`) works as a read-only pull source. Pull validates each chunk's
+  BLAKE3 hash; push uploads chunks first and the manifest last so an
+  interrupted push never publishes an incomplete snapshot. Hand-rolled
+  HTTP/1.1 client over `std::net` — no new dependency. New module
+  `src/registry.rs`. Demonstrated 138× speedup on warm pulls (21 ms vs
+  2.9 s) thanks to local-CAS dedup.
+
+#### CI control socket
+
+`--ci` enables a Unix-domain control plane at `/tmp/iris.sock`.
New +newline-delimited JSON commands beyond the existing `start` / `quit` / +`serial-{send,read}` / `wait-serial` / `screenshot`: + +- `save` / `restore` / `rollback` / `list` / `info` / `delete` +- `validate` +- `tree` / `diff` / `gc` +- `scratch-write` / `scratch-read` / `scratch-clear` / `scratch-info` +- `pull` / `push` + +#### Snapshot manifest + +A `snapshot.toml` at the top of every snapshot directory records: +- `schema_version` (currently 3) +- `host_arch` (cross-arch loads are refused — FPU bit-layout differs) +- `iris_git_rev` (warns on mismatch) +- `created_at_unix` +- `parent` (snapshot name this was restored from, if any) +- `description` +- `installed_bundles` + +`tree` walks `parent` to render snapshot lineage; `diff` uses it to +report what changed between two related snapshots; `gc` uses it to +compute the live chunk set. + +#### Tests and validation + +- **Per-device round-trip property tests** (Phase 1.7): every `Saveable` + device has a `save_load_round_trip` test that mutates state, captures + v1 = `save_state()`, loads v1 into a fresh device, captures v2 = + `save_state()`, asserts v1 == v2. Catches `load_state` field omissions + before they corrupt snapshots silently. Covers 10 devices: + `eeprom_93c56`, `ds1x86`, `ioc`, `pit8254`, `mc`, `mips_tlb`, `ps2`, + `z85c30`, `wd33c93a`, `seeq8003`. +- **CiSerialBackend regression test**: round-trips a 53-char single-line + `dd` command through the loopback to prevent regression of the chunked- + input drop bug (see Fixed below). +- 28+ new unit tests across the new modules; all 198+ lib tests pass. + +### Changed + +- **Snapshot schema version bumped twice this release**: + - **v0 → v1** (Phase 1.2): added `snapshot.toml` manifest with + `schema_version`, `host_arch`, `parent`, etc. + - **v1 → v2** (Phase 2.2): per-device state moved from `*.toml` (hex + strings) to `*.bin` (postcard-encoded `BinValue`). cpu state file + shrunk 24% (3.65 MB → 2.79 MB) and parses 3.4× faster (19.7 ms → 5.8 ms). 
+ - **v2 → v3** (Phase 3.1): RAM banks and framebuffers moved from raw + `bank{N}.bin`/`rex3_*.bin` files to the content-addressable chunk + store at `saves/.cas/`. Each snapshot writes a tiny `chunks.bin` + manifest of per-bank/per-framebuffer chunk hashes. + - **Backward compatibility**: load reads any of v0/v1/v2/v3; the + appropriate code path is dispatched off `manifest.schema_version`. + New saves write the highest version. +- **`load_snapshot` refactored** into `load_snapshot_inner` (private) + + `load_snapshot` (public, auto-starts CPU + peripherals on return) + + `load_snapshot_paused` (used by the determinism validator; leaves all + threads stopped). +- **`Machine::with_paused`** helper: briefly stops all device threads to + perform a host-side mutation (used by scratch-write etc.), then + resumes — but only restarts the CPU if it was running before, so + pre-`start` operations don't auto-launch the CPU. +- **iris.toml**: documented `[scsi.2]` scratch-volume block (commented + out by default). New optional fields `scratch: bool` and `size_mb: + Option` on `ScsiDeviceConfig`. + +### Fixed + +- **`cp0_compare` write recalibration: synthetic clock available behind + `--features ci_clock`.** The previous implementation in + `src/mips_core.rs` measured `Instant::now()` between successive + Compare writes to compute a wallclock-stretched `count_step`. Two + passes from the same starting state would see different host + scheduling → different `dt_ns` → different `count_step` → different + timer-interrupt timing → divergent guest execution. With + `--features ci_clock` we swap in `dt_ns = (cycles since last Compare + write) * 10ns` (R4400 ~100 MIPS), giving the Phase 3.3 validator + `deterministic: true` at any N. Default builds keep the wallclock + path so interactive desktop sessions retain real-time IRIX timing. + Tradeoff under `ci_clock`: guest wall-clock no longer tracks host + wall-clock — exactly what reproducible CI wants. 
+- **CiSerialBackend chunked-input loss** (Phase 3.5). The SCC channel-A + RX worker silently dropped bytes when its 8-byte `rx_queue` was full, + producing the symptom `dd if=/dev/rdsk/dks0d2s0 bs=512` arriving at + the IRIX shell as `dd if=/d=512`. Fixed by holding the byte in a + local `pending: Option` slot and retrying instead of dropping — + proper flow control: bytes only leave `host_to_guest` when there's + downstream space. Regression test `long_input_round_trips_without_loss` + in `src/z85c30.rs`. +- **EEPROM round-trip**: discovered during 1.7 testing that the EEPROM + has 128 words (not 256). Test corrected. +- **IOC round-trip**: `load_state` re-runs `update_interrupts()` which + re-derives the MAP_INT0/MAP_INT1 cascade bits in `l0_stat`/`l1_stat`. + Test now calls `update_interrupts` before the first save so the saved + state already reflects the cascade — matches what a real running + machine always shows. +- **Z85c30 default constructor binds TCP** 8880/8881 on `new()`; tests + use `new_null()` instead so two test instances don't race on the same + ports. Also the right choice for CI mode (which already used it). + +### Deprecated / Descoped + +- **Persistent JIT cache** (was Phase 2.5): descoped. Interp on M2 hits + Indy parity (60–100 MIPS for integer code). The plan-cited 1.5–2× JIT + win wasn't worth the maintenance burden of an unstable JIT (still-open + POST hang on M2, prior Loads-tier and store-correctness issues). JIT + code stays mothballed behind the existing `--features jit` flag — + re-enable if a future workload outgrows interp. 
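The CiSerialBackend fix above is a small instance of a general flow-control pattern: hold the in-flight byte, never drop it. A minimal Python sketch of that pattern — illustrative names, an 8-slot queue matching the SCC channel-A RX queue, not iris internals:

```python
from collections import deque

RX_CAPACITY = 8  # matches the SCC channel-A rx_queue described above

def rx_worker_step(host_to_guest, rx_queue, pending):
    """One cycle of the RX worker (sketch, not iris's actual code).
    A byte leaves host_to_guest only when it can eventually be delivered;
    if the downstream queue is full it waits in `pending` instead of
    being dropped (the old behavior that truncated long pastes)."""
    if pending is None and host_to_guest:
        pending = host_to_guest.popleft()
    if pending is not None and len(rx_queue) < RX_CAPACITY:
        rx_queue.append(pending)
        pending = None
    return pending

# Simulate a guest that drains one byte every third worker cycle while
# the host pushes a long command line — nothing may be lost.
host = deque(b'dd if=/dev/rdsk/dks0d2s0 bs=512')
rx, out, pending, step = deque(), bytearray(), None, 0
while host or pending is not None or rx:
    pending = rx_worker_step(host, rx, pending)
    step += 1
    if step % 3 == 0 and rx:
        out.append(rx.popleft())  # slow guest consumer

assert bytes(out) == b'dd if=/dev/rdsk/dks0d2s0 bs=512'
```

The invariant is the one named in the fix: bytes only leave `host_to_guest` when there is downstream space, so back-pressure propagates instead of silently eating input.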
+ +### Module map + +New modules under `src/`: + +| Module | Purpose | +|---|---| +| `sgi_vh.rs` | Minimal SGI Volume Header writer for the scratch volume | +| `chunk_store.rs` | Content-addressable chunk store (BLAKE3, 64 KB) | +| `validate.rs` | Snapshot determinism check (interp two-pass diff) | +| `registry.rs` | Hand-rolled HTTP/1.1 client for snapshot pull/push | + +Existing modules with significant changes: + +| Module | Changes | +|---|---| +| `snapshot.rs` | Manifest, BinValue (postcard), ChunksManifest, write_state/read_state, write_chunks_manifest | +| `machine.rs` | save/load/restore/rollback orchestration, with_paused, scratch_path, schema-version-aware dispatch | +| `ci.rs` | 15+ new commands | +| `mips_exec.rs` | step_n_inline, state_digest, CpuStateDigest | +| `mips_core.rs` | Deterministic `cp0_compare` recalibration | +| `cow_disk.rs` | Reflink-based overlay capture | +| `z85c30.rs` | RX worker pending-byte hold, save_load_round_trip + long_input_round_trips_without_loss tests | +| `config.rs` | scratch + size_mb on ScsiDeviceConfig | + +### Performance numbers (M2 interp) + +| Metric | Value | +|---|---| +| Cold restore (disk) | 145–213 ms | +| In-memory rollback | 42 ms | +| Save (warm CAS, no guest changes) | 232 ms | +| Save (cold CAS, first save) | 851 ms | +| 1 MB scratch-write while CPU running | 31 ms | +| 1M-instruction determinism check | 265 ms | +| Snapshot pull (cold local CAS) | 2.9 s / 268 MB | +| Snapshot pull (warm local CAS) | 21 ms / 3.5 MB metadata | +| 100 snapshots from same parent (estimated) | ~1.5 GB total vs ~27 GB without dedup | + +### Dependencies added + +- `postcard = "1"` — non-self-describing binary serde format for v2 device state and v3 chunks manifest. +- `blake3 = "1"` — content hashing for the CAS chunk store. + +No HTTP client dependency added — `registry.rs` uses `std::net::TcpStream` +directly. 
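The CAS storage model that `chunk_store.rs` and the `blake3` dependency implement is easy to see in miniature. A hedged Python sketch — stdlib SHA-256 standing in for BLAKE3, 4-byte chunks standing in for 64 KB, a dict standing in for `saves/.cas/`:

```python
import hashlib

CHUNK = 4  # iris uses 64 KiB chunks; tiny here so the example stays readable

def snapshot_ram(ram, cas):
    """Split RAM into chunks, store each chunk in the CAS keyed by its
    hash, and return the manifest (the ordered list of chunk hashes)."""
    manifest = []
    for off in range(0, len(ram), CHUNK):
        chunk = ram[off:off + CHUNK]
        h = hashlib.sha256(chunk).hexdigest()
        cas.setdefault(h, chunk)       # dedup: identical content stored once
        manifest.append(h)
    return manifest

def restore_ram(manifest, cas):
    return b''.join(cas[h] for h in manifest)

cas = {}
ram = bytes(32)                        # 32 bytes of zeroed "RAM"
m1 = snapshot_ram(ram, cas)
assert len(cas) == 1                   # all-zero chunks collapse to one entry

m2 = snapshot_ram(ram, cas)            # second snapshot, machine unchanged
assert m2 == m1 and len(cas) == 1      # identical manifest, zero new chunks

ram = ram[:4] + b'dirt' + ram[8:]      # guest dirties exactly one chunk
m3 = snapshot_ram(ram, cas)
assert len(cas) == 2                   # only the changed chunk is stored
assert restore_ram(m3, cas) == ram     # and restore round-trips
```

This is the whole reason a second snapshot of an unchanged machine adds zero bytes: the manifest is new, but every hash it contains already exists in the store.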
+ +--- + +### `iris-ci` wrapper binary + +Driving the CI socket via raw `printf … | nc -U /tmp/iris.sock` proved tedious +and error-prone in real use (long lines, brittle JSON quoting, hand-managed +timeouts, bs=512 foot-guns). New `iris-ci` companion binary replaces all of +that. + +#### Subcommands + +**Direct passthroughs to socket commands:** +`ping`, `start`, `quit`, `save`, `restore`, `rollback`, `list`, `info`, +`delete`, `tree`, `diff`, `gc`, `validate`, `screenshot`, `pull`, `push`, +`serial-send`, `serial-read`, `serial-wait`, `scratch read`, `scratch write`, +`scratch clear`, `scratch info`. + +**High-level macros** for the multi-step rituals that dominate a real CI loop: + +- `iris-ci boot` — the full PROM-menu-to-login dance (start CPU + wait + `Option?` + send `1` + wait `IRIS console login`) in one command. +- `iris-ci login [USER]` — sends username + handles vt100 prompt + waits for + `#`. Defaults to `root`. +- `iris-ci run ""` — sends a shell command, waits for the prompt, + prints just the captured stdout, returns non-zero on guest failure. Uses + csh `$status` by default; `--shell sh` switches to `$?`. Solves the SCC + echo-of-input ambiguity by waiting for `\nIRIS-CI-RC=` (only matches at + the start of the output line, never inside the typed-input echo line). +- `iris-ci put HOST_FILE [--to GUEST_PATH]` — copies a host file into the + guest. Stages bytes in the scratch volume, drives the guest with + `dd if=/dev/rdsk/dks0d2s0 of=… bs=512 count=N` where N is computed + automatically, then truncates the destination to the original byte length + with `dd if=/dev/null of=… bs=1 seek=N count=0`. **The user never types + bs=512 or sector counts.** +- `iris-ci get GUEST_PATH [--to HOST_FILE]` — pulls a guest file out. + Zeros scratch, drives the guest `dd … bs=512 conv=sync,notrunc` to write + with sector padding, looks up the byte count via `wc -c`, reads back + exactly that many bytes from scratch. 
+- `iris-ci script FILE` — runs a sequence of iris-ci commands from a file
+  (one per line, `#` comments, double-quoted args). Each step prints
+  `[ok Nms] <step>` or `[FAIL Nms] <step>: <error>`. Aborts on first
+  failure with non-zero overall exit.
+
+#### Connection options
+
+- Default socket `/tmp/iris.sock`; override with `--socket PATH` or
+  `IRIS_SOCKET` environment variable.
+- `--json` for raw JSON responses (scriptable). `--quiet` for silent-on-success.
+- Exit codes: 0 success, 1 socket/connection error, 2 iris error response,
+  3 local error (file not found, etc.).
+
+#### Implementation
+
+- New binary `iris-ci` at `src/iris_ci_main.rs` (~700 lines), declared as
+  `[[bin]]` in `Cargo.toml`. No new dependencies — reuses the existing
+  `clap`, `serde_json`, and `std::os::unix::net`.
+- Single-request, single-response per invocation. Connects, sends one
+  newline-delimited JSON request, reads one line of response, shuts down
+  the write side so the server's read loop exits cleanly.
+
+#### What this replaced in the manual test runbook
+
+| Before | After |
+|---|---|
+| 6-step PROM-to-shell ritual via `printf` + `nc` | `iris-ci boot && iris-ci login` |
+| `printf '%s\n' '{"cmd":"serial-send",...}' \| nc …` | `iris-ci serial send "..."` |
+| Hand-built `dd if=… bs=512 count=K` recipes for file injection | `iris-ci put localfile.tar` |
+| Hand-built `dd … conv=sync,notrunc` + `wc -c` for extraction | `iris-ci get /tmp/foo --to ./foo.tar` |
+| Multi-line shell sequences with manual error handling | `iris-ci script tests/scenario.iris` |
+| JSON output piped through `head -c` and visually parsed | Pretty-printed tables + `--json` opt-in |
diff --git a/Cargo.toml b/Cargo.toml
index 0c4b439..1ce4d98 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -9,6 +9,11 @@ default-run = "iris"
 debug_cache = []
 developer = []
 developer_ip7 = [] # CP0 Compare/timer calibration stats and debug prints
+# Synthetic deterministic clock for CP0 Compare calibration: dt_ns derived from
+# (instructions
executed) * 10ns instead of host Instant::now(). Required for +# the snapshot determinism validator (Phase 3.3). Default OFF preserves the +# wallclock-anchored timer that interactive desktop builds expect. +ci_clock = [] # Lightning: pedal-to-the-metal build — disables breakpoint checks and traceback buffer updates. # Incompatible with interactive debugging. For end-user / benchmarking builds only. lightning = [] @@ -38,9 +43,12 @@ crossbeam-utils = "0.8" bitfield = "0.14" cpal = "0.15" serde = { version = "1.0.228", features = ["derive"] } +serde_json = "1.0" toml = "1.0.3" -parking_lot = "0.12" +postcard = { version = "1", features = ["alloc"] } +blake3 = "1" png = "0.17" +parking_lot = "0.12" spin = "0.10.0" gdbstub = { version = "0.7", features = ["std"] } gdbstub_arch = "0.3" @@ -86,5 +94,9 @@ path = "src/main.rs" name = "coffdump" path = "src/coffdump.rs" +[[bin]] +name = "iris-ci" +path = "src/iris_ci_main.rs" + diff --git a/README.md b/README.md index 66621bc..09f722e 100644 --- a/README.md +++ b/README.md @@ -71,6 +71,7 @@ cargo run --release --features lightning # disable emulator breakpoi cargo run --release --features jit # enable Cranelift MIPS JIT compiler cargo run --release --features rex-jit # enable REX3 graphics JIT compiler cargo run --release --features tlbvmap # enable 8k slot to tlb entry map (increases cache use but may help depending on host cpu arch) +cargo run --release --features ci_clock # synthetic deterministic CP0 Compare clock (CI/snapshot validator only; loses realtime desktop timing) cargo run --release --features lightning,rex-jit,tlbvmap # recommended for best speed right now ``` @@ -131,6 +132,122 @@ Writes go to `scsi1.raw.overlay`. Monitor commands: - `cow reset` - discard all overlay writes +## Snapshots and rollback + +Capture the full machine state — RAM, every device, plus the COW overlay — into +`saves//`, and restore it later. 
CPU, MC, IOC, HPC3, REX3, RTC, EEPROM, +SCSI controller, and the Seeq Ethernet chip all round-trip. Current schema +version is 3: postcard-encoded binary device state plus content-addressable +chunked RAM under `saves/.cas/`. A second snapshot taken from the same parent +adds **zero bytes** to disk for any RAM region that didn't change — same +storage model as Docker layers. + +From the interactive monitor (`telnet 127.0.0.1 8888`): +``` +save base/desktop # writes saves/base/desktop/ +load base/desktop # restore everything (RAM, devices, disk overlay) +``` + +From `iris-ci` (the wrapper — see CI socket section below): +```bash +iris-ci save base/desktop +iris-ci restore base/desktop # full disk-backed reload (~150 ms cold) +iris-ci rollback # in-memory rewind to last restore (~40 ms) +iris-ci diff base/desktop tests/grep # what changed: devices, RAM chunks, COW sectors +iris-ci validate base/desktop -n 1000000 # bit-deterministic re-execution check (build with --features ci_clock) +iris-ci tree # snapshot parent-chain hierarchy +iris-ci gc # sweep CAS chunks no kept snapshot references +iris-ci pull http://reg/snapshots/base # fetch a snapshot from another machine +``` + +Two restore tiers: +- **`restore `** — full disk-backed reload. ~150 ms. Use after a hard + reset or to switch to a different snapshot. +- **`rollback`** — in-memory rewind to the last `restore` checkpoint. ~40 ms, + no disk I/O. Use this in tight inner test loops where you keep returning to + the same starting state. + +Reflinks are used on APFS / btrfs / xfs so capturing a snapshot of a 4 GB disk +image takes <10 ms and uses ~18 MB of actual disk. + +See [CHANGELOG.md](CHANGELOG.md) for the full feature set, and +[manual_test_runbook.md](manual_test_runbook.md) for a copy-paste tour. + + +## CI control socket and `iris-ci` + +`--ci` enables a Unix-socket control plane for headless automation, plus a +small in-process serial backend so the harness can drive the IRIX console +directly. 
The default socket path is `/tmp/iris.sock`.
+
+```
+cargo run --release --features lightning -- --ci
+```
+
+`cargo build` produces a companion binary, `iris-ci`, that's the **canonical
+way** to drive the socket. Don't bother with raw `nc` + JSON unless you're
+debugging the wrapper itself.
+
+```bash
+# In one terminal: launch iris (Newport window opens, --ci is just an extra channel)
+./target/release/iris --ci
+
+# In another terminal: drive it
+./target/release/iris-ci boot                 # PROM menu → IRIS console login (one cmd)
+./target/release/iris-ci login                # send root + dismiss vt100 prompt + wait #
+./target/release/iris-ci run 'ls /'           # send shell command, get stdout + exit code
+./target/release/iris-ci save base/multiuser
+./target/release/iris-ci put localfile.tar    # copy file into guest, no bs=512 math
+./target/release/iris-ci get /tmp/out --to ./out.tar
+./target/release/iris-ci diff base mutated    # per-device + chunk + cow-sector deltas
+./target/release/iris-ci tree
+./target/release/iris-ci script tests/scenario.iris  # batch-run a sequence of cmds
+```
+
+Run `iris-ci --help` for the full list, or `iris-ci <subcommand> --help` for
+any subcommand. Every operation has a typed clap arg — no JSON quoting, no
+hand-managed timeouts.
+
+For automation that doesn't want to depend on `iris-ci`, the underlying socket
+protocol is newline-delimited JSON; `cmd` and `args` per request, `{ok, data,
+error}` per response. See `src/ci.rs` for the dispatch table.
+
+
+## Scratch volume — file injection without networking
+
+A SCSI device with `scratch = true` is a host-controlled raw block device for
+pushing files into the guest (and pulling artifacts back out) without bringing
+up NFS or anything else. iris pre-formats the underlying file with a minimal
+SGI Volume Header on first run, and exposes it inside IRIX as
+`/dev/rdsk/dks0d2s0`.
+ +Enable in `iris.toml`: +```toml +[scsi.2] +path = "scratch.raw" +cdrom = false +overlay = false +scratch = true +size_mb = 64 +``` + +The easy way (via `iris-ci`): +```bash +iris-ci put localfile.tar # copies host file into the guest +iris-ci get /tmp/output.log --to ./out.log # pulls a guest file out +``` + +`iris-ci put`/`get` handle the IRIX `dd bs=512` sector-alignment quirk +transparently — they compute the right block count from the host file size, +issue the right `dd` recipe to the guest, and truncate to the original byte +length on the receiving end. + +Manual/raw paths (if you want to drive `dd` yourself): +- Reads MUST use `bs=512` (or any 512-multiple); `bs=64` returns "I/O error". +- Writes must be padded to `bs`; add `conv=sync` for short inputs. +- Inside IRIX: `dd if=/dev/rdsk/dks0d2s0 bs=512 | tar xf -` + + ## Input Click the window to grab mouse and keyboard. Right Ctrl releases the grab. @@ -148,8 +265,9 @@ getting IRIX running. These are meant for both humans and AI assistants working on the codebase. - `rules/jit/` - dispatch architecture, store compilation, sync, verify mode, probe tuning -- `rules/irix/` - networking config, keyboard quirks +- `rules/irix/` - networking config, keyboard quirks, csh + scratch raw-device gotchas - `rules/testing/` - disk image handling, avoiding filesystem corruption +- `rules/snapshot/` - snapshot binary format, scratch-volume conventions, round-trip tests, CI overlay paths, **iris-ci as the canonical CI interface** If you're about to touch the JIT dispatch loop, read `rules/jit/dispatch-architecture.md` first. It'll save you a few days. diff --git a/iris.toml b/iris.toml index 1305f1f..1e40a57 100644 --- a/iris.toml +++ b/iris.toml @@ -13,9 +13,11 @@ no_audio = false # PROM ROM image (required). prom = "prom.bin" -# Window scale factor: 1 = native resolution, 2 = 2× for HiDPI/4K monitors. +# Window scale factor. 
Uses logical points (macOS) / DPI-scaled pixels, so +# scale=2 is visibly ~2× bigger than scale=1 on every display. +# Valid: 1, 2, 3, or 4. # Can also be set with the --2x command-line flag (CLI takes precedence). -scale = 1 +scale = 2 # RAM bank sizes in MB. # Each bank must be 0 (absent), 8, 16, 32, 64, or 128. @@ -32,19 +34,45 @@ banks = [128, 128, 0, 0] # Internal hard disk [scsi.1] -path = "scsi1.raw" +path = "irix65_4g.raw" cdrom = false +overlay = true # Internal hard disk +#[scsi.2] +#path = "scsi2.raw" +#cdrom = false + +# Scratch volume for host<->guest file injection without networking. +# iris auto-creates the file at `path` if missing — first 4 KB hold a minimal +# SGI Volume Header (so IRIX recognises the device); the rest (size_mb MB +# minus 4 KB) is the host-controlled payload area. The CI socket exposes +# scratch-write/scratch-read/scratch-clear/scratch-info; offsets passed to +# those commands are relative to the payload start, so the VH is never +# touched. The guest reads the same bytes at offset 0 of /dev/rdsk/dks0dNvol. +# No higher-level format is imposed — typical use is a tar stream: +# host: iris CI: scratch-write {host_path: "bundle.tar"} +# guest: dd if=/dev/rdsk/dks0d2s0 bs=512 | tar xf - +# tar cf - /var/log/foo | dd of=/dev/rdsk/dks0d2s0 bs=512 conv=notrunc +# host: iris CI: scratch-read {to_path: "log.tar"} +# IRIX raw block-device reads must be sector-aligned (bs must be a multiple +# of 512); bs=64 etc returns "Read error: I/O error". +# Implies cdrom=false, overlay=false (the volume must be host-writable, and +# scratch contents intentionally survive snapshot rollback so a freshly +# injected bundle isn't reverted). [scsi.2] -path = "scsi2.raw" -cdrom = false +path = "scratch.raw" +cdrom = false +overlay = false +scratch = true +size_mb = 64 # NFS share — requires unfsd on the host. # The shared directory is exported to the VM at 192.168.0.1:/path (standard NFS port 2049). 
# From IRIX: mount 192.168.0.1:/absolute/path /mnt [nfs] shared_dir = "./shared" +unfsd = "/usr/local/sbin/unfsd" # Port forwarding rules — forward host ports into the guest (IRIX). # proto: "tcp" or "udp" @@ -77,7 +105,7 @@ bind = "localhost" # For a single disc, set path only. # For a changer (cycled with "scsi eject 4" in the monitor), list all # ISO images in `discs`; the first entry is mounted at startup. -[scsi.4] -path = "cdrom4.iso" -cdrom = true -#discs = ["second.iso", "cdrom4.iso", "patches.iso"] +#[scsi.4] +#path = "cdrom4.iso" +#cdrom = true +##discs = ["second.iso", "cdrom4.iso", "patches.iso"] diff --git a/manual_test_runbook.md b/manual_test_runbook.md new file mode 100644 index 0000000..0798b95 --- /dev/null +++ b/manual_test_runbook.md @@ -0,0 +1,254 @@ +# Manual test runbook + +Copy-paste each block in order. The whole sequence runs ~5–10 minutes including +IRIX boot. Uses the `iris-ci` wrapper, not raw `nc` — every command is one short +line, no JSON escaping. + +## Setup + +```bash +cd ~/projects/github/unxmaal/iris + +# Build (produces both `iris` and `iris-ci`) +cargo build --release --features lightning + +# In iris.toml, uncomment the [scsi.2] scratch block: +# path = "scratch.raw" cdrom = false overlay = false +# scratch = true size_mb = 64 + +# Clean state from any prior run +rm -f /tmp/iris.sock /tmp/iris-ci-*-scsi*.overlay scratch.raw 2>/dev/null +rm -rf saves/.cas saves/test-* 2>/dev/null + +# Put iris-ci on PATH so the rest is shorter +alias ci=./target/release/iris-ci +``` + +## Boot iris and IRIX + +```bash +# Launch iris in the background (one terminal, --ci enables the control socket) +./target/release/iris --ci > /tmp/iris.log 2>&1 & +until [ -S /tmp/iris.sock ]; do sleep 1; done + +# Boot to root shell — one command replaces the 6-step PROM-menu dance +ci boot # ~40s on M2 interp +ci login # ~2s; defaults to root with no password +``` + +**Expected:** `boot: ready at login` followed by `login: shell ready`. Total ~42 s. 
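`boot` and `login` are just scripted waits over the serial stream: accumulate output until a marker appears, keep the unconsumed tail for the next wait. The core of any such macro can be sketched in a few lines of Python (illustrative only — the real logic lives inside `iris-ci`):

```python
def wait_for(read_chunk, pattern, buf):
    """Accumulate serial output into `buf` until `pattern` appears.
    Returns everything up to and including the match; the tail stays
    in `buf` for the next wait. A real harness adds a deadline."""
    while pattern not in buf:
        buf += read_chunk()
    cut = buf.index(pattern) + len(pattern)
    matched = bytes(buf[:cut])
    del buf[:cut]
    return matched

# Feed console output with awkward chunk boundaries, the way a real
# serial stream arrives (prompt text per the boot macro description).
chunks = iter([b'Starting up', b'...\nOpti', b'on? ', b'\nIRIS console login: '])
buf = bytearray()
first = wait_for(lambda: next(chunks), b'Option?', buf)
assert first.endswith(b'Option?')
# ... the macro would send '1' here, then wait for the next marker ...
second = wait_for(lambda: next(chunks), b'IRIS console login', buf)
assert second.endswith(b'IRIS console login')
```

Keeping the tail in the shared buffer is what lets a marker that straddles two reads (`Opti` + `on? `) still match.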
+ +--- + +## Test 1 — Bundle install + diff + +Snapshot a clean baseline, inject a "bundle" via the scratch volume, install it +in IRIX, snapshot the result, see exactly what changed. The `put` command +handles the IRIX `dd bs=512` quirk transparently — you never type a sector count. + +```bash +ci save test-1/before + +# Build a small "bundle" on the host +echo "fake bundle, marker=$(date +%s)" > /tmp/bundle.txt +tar -cf /tmp/bundle.tar -C /tmp bundle.txt + +# Inject it into the guest. iris-ci handles bs=512 and truncation. +ci put /tmp/bundle.tar --to /tmp/bundle.tar + +# Extract in the guest +ci run 'cd /tmp && tar xf bundle.tar' +ci run 'cat /tmp/bundle.txt' + +# Snapshot post-install +ci save test-1/after + +# Diff +ci diff test-1/before test-1/after +du -sh saves/.cas +``` + +**Expected:** +- `cat /tmp/bundle.txt` echoes the `marker=` line back from the guest. +- `diff` shows small `bank0/bank1` chunk deltas (a few %), banks 2/3 unchanged, + cow_diff lists new dirty sectors on scsi 1 from the tar extract, devices + changed includes `mc`, `cpu`, `scsi`. +- `du -sh saves/.cas` ≈ 250–260 MB (one snapshot's worth; second snapshot + added almost nothing thanks to CAS dedup). + +--- + +## Test 2 — Rollback inner loop + +The mogrix CI test loop: install bundle → run test → rollback → next bundle. + +```bash +ci save test-2/clean +ci restore test-2/clean # arms the in-memory checkpoint + +for run in 1 2 3 4 5; do + echo "=== run $run ===" + ci run "echo run-$run > /tmp/run.txt && ls /tmp/run.txt" + T=$(date +%s%N) + ci rollback >/dev/null + T2=$(date +%s%N); echo "rollback: $(( (T2-T)/1000000 )) ms" + ci run 'ls /tmp/run.txt 2>&1 || echo missing' +done +``` + +**Expected:** +- Each `rollback` prints in the **40–80 ms range** — in-memory, not disk. +- After every rollback, `ls /tmp/run.txt` says missing (or "No such file") — + RAM and the SCSI overlay both reverted. 
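What this loop relies on — RAM and dirty-sector state snapping back with no disk I/O — is conceptually a cached copy armed at restore time. A toy Python sketch of the idea (not iris's actual `RollbackCheckpoint`, which also covers every device):

```python
import copy

class Machine:
    """Toy stand-in for the emulated machine; illustrative only."""
    def __init__(self):
        self.ram = bytearray(16)
        self.dirty_sectors = {}          # COW overlay writes since restore
        self._checkpoint = None

    def restore(self, state):
        self.ram[:] = state
        self.dirty_sectors.clear()
        # Arm the in-memory checkpoint: rollback replays this cached copy
        # instead of re-reading the snapshot from disk.
        self._checkpoint = (bytes(self.ram), copy.deepcopy(self.dirty_sectors))

    def rollback(self):
        ram, dirty = self._checkpoint
        self.ram[:] = ram
        self.dirty_sectors = copy.deepcopy(dirty)

m = Machine()
m.restore(bytes(range(16)))
m.ram[0] = 0xFF                          # guest mutates RAM...
m.dirty_sectors[7] = b'\x00' * 512       # ...and writes a disk sector
m.rollback()
assert m.ram == bytearray(range(16)) and m.dirty_sectors == {}
```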
+ +--- + +## Test 3 — CAS dedup at scale + +Take 10 snapshots over a brief idle period and confirm disk usage barely grows. + +```bash +for i in 01 02 03 04 05 06 07 08 09 10; do + ci run "date >> /tmp/log" >/dev/null + ci save test-3/snap$i >/dev/null + printf 'snap%s cas=%s\n' "$i" "$(du -sh saves/.cas | cut -f1)" +done + +# Delete every other one and gc +for i in 03 05 07 09; do + ci delete test-3/snap$i >/dev/null +done +ci gc +du -sh saves/.cas +``` + +**Expected:** +- `snap01` ≈ 250 MB. Each subsequent snap adds **<5 MB** (idle guest). +- `gc` reports `removed_chunks > 0` and `bytes_freed > 0`. + +--- + +## Test 4 — Determinism check + +After save, two cold runs of the same instructions should reach identical state. + +```bash +ci save test-4/repeatable +ci validate test-4/repeatable -n 0 # just load → digest +ci validate test-4/repeatable -n 1000000 # run 1M instructions twice + diff +``` + +**Expected:** Both runs print `deterministic for N instructions (PC=0x...)`. The +1M run completes in ~250–300 ms. + +--- + +## Test 5 — Snapshot tree + +`tree` shows parent-chain hierarchy. + +```bash +ci save test-5/base +ci restore test-5/base # restoring stamps `parent` on future saves + +ci run 'echo bundle-A >> /tmp/log' +ci save test-5/grep-A + +ci restore test-5/base +ci run 'echo bundle-B >> /tmp/log' +ci save test-5/grep-B + +ci tree +``` + +**Expected:** the tree shows `test-5/base` at top with `grep-A` and `grep-B` +indented under it. + +--- + +## Test 6 — Script mode + +Replace the test sequence above with a one-line invocation against a `.iris` +file. 
+ +```bash +cat > /tmp/scenario.iris <<'EOF' +# scratch volume + bundle install scenario +ping +save test-6/before +put /tmp/bundle.tar --to /tmp/bundle.tar +run "cd /tmp && tar xf bundle.tar" +run "cat /tmp/bundle.txt" +save test-6/after +diff test-6/before test-6/after +EOF + +ci script /tmp/scenario.iris +``` + +**Expected:** each step prefixed with `[ok Nms]`, plus the natural output +of each command (diff table, etc.). Aborts on first failure. + +--- + +## Test 7 — HTTP registry pull + +Ship a snapshot between two "machines" (same machine, different `saves/`). + +```bash +# Move our latest snapshot into a registry directory +mkdir -p /tmp/iris-reg/snapshots /tmp/iris-reg/cas +cp -r saves/test-1/after /tmp/iris-reg/snapshots/test-1-after +cp -r saves/.cas/* /tmp/iris-reg/cas/ + +# Serve it +( cd /tmp/iris-reg && python3 -m http.server 8765 ) & +SVR=$! +sleep 1 + +# Delete local + pull +rm -rf saves/test-pulled saves/.cas +ci pull http://127.0.0.1:8765 test-pulled +ci pull http://127.0.0.1:8765 test-pulled # second pull, expect 0 chunks + +ci restore test-pulled +ci run 'cat /tmp/bundle.txt' + +# Cleanup +kill $SVR +rm -rf /tmp/iris-reg +``` + +**Expected:** +- First pull fetches all chunks (~270 MB). +- Second pull skips all chunks, transfers only ~3.5 MB of metadata, completes + in ~20 ms. +- Restore + cat shows the bundle marker — full round-trip working. 
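Test 7's crash-safety property — an interrupted push never publishes an incomplete snapshot — falls out of upload order alone: chunks first, manifest last. A Python sketch against an in-memory stand-in for the registry (function and field names are hypothetical, not the wire protocol):

```python
def push(snapshot, registry, fail_after=None):
    """Upload chunks first, manifest last. If the link dies at any point,
    the registry never exposes a manifest whose chunks are missing."""
    sent = 0
    for h, data in snapshot['chunks'].items():
        if fail_after is not None and sent >= fail_after:
            raise ConnectionError('link dropped mid-push')
        registry.setdefault('cas', {})[h] = data
        sent += 1
    # Only once every chunk is uploaded does the snapshot become visible.
    registry.setdefault('manifests', {})[snapshot['name']] = list(snapshot['chunks'])

snap = {'name': 'base', 'chunks': {'h1': b'a', 'h2': b'b', 'h3': b'c'}}
reg = {}
try:
    push(snap, reg, fail_after=2)        # simulate an interrupted push
except ConnectionError:
    pass
assert 'base' not in reg.get('manifests', {})   # nothing published

push(snap, reg)                          # retry; already-uploaded chunks are cheap
assert all(h in reg['cas'] for h in reg['manifests']['base'])
```

The same ordering is why a retry is nearly free: the chunks from the failed attempt are already in the store, so only the remainder plus the manifest move.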
+ +--- + +## Cleanup + +```bash +ci quit +sleep 1 +rm -f /tmp/iris.sock /tmp/iris-ci-*-scsi*.overlay /tmp/iris.log /tmp/bundle.* scratch.raw +rm -rf saves/.cas saves/test-* saves/test-pulled + +# Optionally re-comment [scsi.2] in iris.toml +``` + +--- + +## What each test really proves + +| Test | Validates | +|---|---| +| Setup / boot | iris-ci wrapper + boot/login macros | +| 1 | Scratch volume + put + diff (Phases 2.4, 3.2) | +| 2 | In-memory rollback + COW overlay revert (Phase 2.1) | +| 3 | CAS dedup (Phase 3.1) + gc (Phase 3.2) | +| 4 | Snapshot determinism (Phase 3.3) — guards every future device change | +| 5 | Parent-chain tracking (Phase 1.2) + tree (Phase 3.2) | +| 6 | Script mode — replaces hand-managed multi-step sequences | +| 7 | HTTP registry pull (Phase 3.4) — Docker-layer-style snapshot sharing | diff --git a/rules/irix/irix-csh-scratch-raw-device-gotchas-when-you-cant-use-iris-ci.md b/rules/irix/irix-csh-scratch-raw-device-gotchas-when-you-cant-use-iris-ci.md new file mode 100644 index 0000000..dbf7c9d --- /dev/null +++ b/rules/irix/irix-csh-scratch-raw-device-gotchas-when-you-cant-use-iris-ci.md @@ -0,0 +1,44 @@ +# IRIX csh + scratch raw-device gotchas (when you can't use iris-ci) + +**Keywords:** irix,csh,bs512,scratch,dd,redirect,marker,wait,serial +**Category:** irix + +# IRIX csh + scratch raw-device gotchas + +If you're driving the CI socket without `iris-ci` (raw `nc`, foreign language harness, etc.), these are the pitfalls the wrapper handles for you. Use the wrapper if you can. + +## csh redirect syntax + +IRIX root logs into csh. `2>&1` is sh-only and csh fails to parse it silently. Use: + +- `>& /dev/null` — combined stdout+stderr to /dev/null +- `>& file` — combined stdout+stderr to file +- `>> file` — append stdout (csh has no portable stderr-only redirect) + +If you need sh semantics, wrap in `sh -c "..."`. 
+
+## csh echoes typed input
+
+Any string in the typed command appears in the serial buffer twice — once as the literal input echo, once expanded in the output. A wait pattern of `IRIS-CI-RC=` matches the typed line (which contains the literal `IRIS-CI-RC=$status`) before the command runs.
+
+Use `\nIRIS-CI-RC=` as the wait pattern. The typed line has the marker inline; only the output line starts a fresh line with the marker, so the newline-prefixed pattern only matches the actual output.
+
+## Raw block-device alignment
+
+`/dev/rdsk/dks0dNs0` (the scratch payload partition) requires:
+
+- **Reads** in 512-byte multiples. `dd bs=64` returns `Read error: I/O error`.
+- **Writes** padded to `bs`. From a 28-byte input, `dd bs=512 conv=sync,notrunc` zero-pads to 512.
+
+After a `dd … of=FILE bs=512 count=N` from the scratch device, the guest file is N×512 bytes — too long. Truncate to the real size with `dd if=/dev/null of=FILE bs=1 seek=ORIG count=0`.
+
+## Looking up byte counts in the guest
+
+`ls -l` column layout varies by IRIX version. Use `wc -c < FILE`, which prints just the byte count on one line and is cleanly parseable.
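The alignment, padding, and truncation rules above compose into one round-trip. Here is a host-side sketch with ordinary files standing in for the raw device (the filenames and the 19-byte payload are illustrative, not from iris):

```shell
# Recreate the bs=512 round-trip with plain files (no raw device needed).
cd "$(mktemp -d)"
printf 'hello from the host' > payload                              # 19 bytes
dd if=payload of=device.img bs=512 conv=sync,notrunc 2>/dev/null    # zero-pad to 512
dd if=device.img of=received bs=512 count=1 2>/dev/null             # "guest" read: 512 bytes
size=$(wc -c < payload)                                             # 19
dd if=/dev/null of=received bs=1 seek="$size" count=0 2>/dev/null   # truncate back to 19
cmp payload received && echo round-trip-ok
```

The same `conv=sync` pad, 512-multiple read, and `seek=` truncate are what the wrapper computes for you.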
+
+## See also
+
+- `rules/snapshot/iris-ci-is-the-canonical-ci-socket-interface.md` — the wrapper that hides all of this
+- `rules/snapshot/scratch-scsi-volume-sgi-vh-layout-and-irix-raw-device-gotchas.md` — partition layout and SGI VH details
+- `rules/snapshot/ci-mode-overlay-path-is-tmpiris-ci-pid-scsiidoverlay.md` — where `--ci`'s COW overlay actually lives
+
diff --git a/rules/snapshot/ci-mode-overlay-path-is-tmpiris-ci-pid-scsiidoverlay.md b/rules/snapshot/ci-mode-overlay-path-is-tmpiris-ci-pid-scsiidoverlay.md
new file mode 100644
index 0000000..e9524aa
--- /dev/null
+++ b/rules/snapshot/ci-mode-overlay-path-is-tmpiris-ci-pid-scsiidoverlay.md
@@ -0,0 +1,34 @@
+# CI mode overlay path is /tmp/iris-ci-PID-scsiID.overlay
+
+**Keywords:** ci,overlay,scratch,/tmp,iris-ci,wd33c93a,cow,snapshot,debugging
+**Category:** snapshot
+
+# CI Mode Overlay Path is /tmp-Based, Not Image-Sibling
+
+When iris is invoked with `--ci`, the COW overlay file does NOT live next to the base image (`<image>.overlay`). It goes to `/tmp/iris-ci-PID-scsiID.overlay`. This isolates concurrent CI runs from each other and from any interactive session sharing the same base image.
+
+## Where it's set
+`src/machine.rs:197`:
+```rust
+let ci_overlay = format!("/tmp/iris-ci-{}-scsi{}.overlay", ci_pid, id);
+hpc3.add_scsi_device_with_overlay(id as usize, &path, dev.cdrom, discs, dev.overlay, &ci_overlay)
+```
+
+`src/wd33c93a.rs:255-258` honors the override:
+```rust
+let overlay_path = overlay_path_override
+    .map(|s| s.to_string())
+    .unwrap_or_else(|| format!("{}.overlay", path));
+```
+
+## Implications
+- `rm -f irix65_4g.raw.overlay` before launching `--ci` is a no-op.
+- To inspect the live overlay during a `--ci` run, find it via `lsof -p PID | grep overlay`.
+- After the iris process exits, the CI overlay file remains under `/tmp` until the next reboot or manual cleanup.
+- `save_snapshot` correctly captures the CI overlay regardless of path (it routes through `cow_disk::export_overlay`, which uses `self.overlay_path`).
+
+## Verification
+```
+lsof -p $(pgrep -f 'target/release/iris.*--ci') | grep overlay
+```
+Should show: `/private/tmp/iris-ci-PID-scsi1.overlay`
diff --git a/rules/snapshot/iris-ci-is-the-canonical-ci-socket-interface.md b/rules/snapshot/iris-ci-is-the-canonical-ci-socket-interface.md
new file mode 100644
index 0000000..1175b2c
--- /dev/null
+++ b/rules/snapshot/iris-ci-is-the-canonical-ci-socket-interface.md
@@ -0,0 +1,61 @@
+# iris-ci is the canonical CI socket interface
+
+**Keywords:** iris-ci,wrapper,ci,socket,bs512,csh,run,put,get,boot,login
+**Category:** snapshot
+
+# iris-ci — the right way to drive the CI socket
+
+Built alongside `iris` from `src/iris_ci_main.rs`. Talks to `/tmp/iris.sock` with typed clap subcommands. Use this, not raw `nc` + JSON, for any new automation, runbook, or test scenario.
+
+## Common workflows
+
+```bash
+iris-ci boot                 # PROM menu → IRIS console login (~40s on M2)
+iris-ci login                # send root + handle vt100 prompt + wait for # prompt
+iris-ci run 'echo hello'     # send shell command, get stdout, exit on guest failure
+iris-ci put localfile.tar    # copy host file into guest, no bs=512 math
+iris-ci get /tmp/log --to .  # pull guest file out, no conv=sync math
+iris-ci save base/desktop
+iris-ci diff a b             # per-device + chunk + cow-sector deltas
+iris-ci script tests/x.iris  # batch-run a sequence (one cmd per line, # comments)
+iris-ci pull http://reg/foo bar
+```
+
+`iris-ci --help` for the full subcommand list, `iris-ci <subcommand> --help` for any subcommand.
+
+## Why not raw nc + JSON
+
+Three real bugs that bit during dogfooding and that the wrapper handles for you:
+
+### 1. csh redirect syntax
+
+IRIX root login uses csh by default. `2>&1` is sh-only. Use `>& /dev/null` for combined stdout+stderr in csh.
+
+### 2. 
csh echoes typed input verbatim + +Any wait pattern that appears in your typed command will match the input echo BEFORE the command runs. So a marker like `IRIS-CI-RC=` matches both: +- the typed-input echo line (which contains literal `IRIS-CI-RC=$status`) +- the actual output line (which contains `IRIS-CI-RC=0`) + +Wait for `\nIRIS-CI-RC=` (newline-prefixed) — only matches at the start of the OUTPUT line, never inside the typed-input echo line because the echo is on its own line with no leading `\n` immediately before the marker. + +### 3. IRIX raw block-device gotchas + +- Reads MUST use `bs=512` or any 512-multiple. `bs=64` returns `Read error: I/O error` with no SCSI-level diagnostic. +- Writes must be padded to `bs`. From a 28-byte input file, `dd bs=512 conv=sync,notrunc` zero-pads to 512. Without `conv=sync`, the partial-block write fails. +- After receiving via `dd … of=FILE bs=512 count=N`, the guest file is N×512 bytes — too long. Truncate with `dd if=/dev/null of=FILE bs=1 seek=ORIG count=0`. + +`iris-ci put` and `iris-ci get` handle all three transparently. The user passes a host filename and a guest path; the wrapper computes counts, chooses csh-correct redirects, runs `wc -c` for size lookup, and truncates as needed. + +## When to read the JSON directly + +For automation that doesn't want to depend on `iris-ci` (e.g. a test harness in another language), the underlying socket protocol is newline-delimited JSON. Each request is one JSON object with `cmd` and `args`; each response is one JSON object with `ok` and `data` or `error`. See `src/ci.rs` for the dispatch table. Don't expect to do this comfortably from a shell script — that's why iris-ci exists. + +## Implementation notes + +- Single-request, single-response per invocation. Connect, write one line, read one line, shutdown the write side so the server's read loop exits cleanly. +- `cmd_run` waits for `\nIRIS-CI-RC=` then drains the trailing `\nIRIS N# ` to keep the next command's drain clean. 
Sleeps 150ms between the wait and the trailing read to let those bytes arrive. +- `extract_run_stdout` skips the first `\n` (end of typed echo line), strips the trailing `\nIRIS-CI-RC=` marker, normalises CRLF. +- `cmd_put` uses `dd if=/dev/null of=FILE bs=1 seek=N count=0` for truncation rather than perl; perl isn't reliably installed in IRIX 6.5. +- `cmd_get` uses `wc -c < FILE` for size lookup. Avoids parsing `ls -l` columns which vary across IRIX versions. + diff --git a/rules/snapshot/per-device-saveloadsave-round-trip-is-the-regression-net.md b/rules/snapshot/per-device-saveloadsave-round-trip-is-the-regression-net.md new file mode 100644 index 0000000..106c5b3 --- /dev/null +++ b/rules/snapshot/per-device-saveloadsave-round-trip-is-the-regression-net.md @@ -0,0 +1,49 @@ +# Per-device save→load→save round-trip is the regression net + +**Keywords:** snapshot,round-trip,save_state,load_state,regression,test,convention +**Category:** snapshot + +# Round-Trip Test Convention + +Every device with a `Saveable` impl gets a `save_load_round_trip` test in its `#[cfg(test)] mod tests`. Catches save/load asymmetries that would otherwise corrupt snapshots silently. + +## Pattern + +```rust +#[test] +fn save_load_round_trip() { + let src = Device::new(...); + // 1. Mutate to non-default state. + { + let mut s = src.state.lock(); + // ... touch fields that save_state serializes + } + let v1 = src.save_state(); + + let dst = Device::new(...); + dst.load_state(&v1).expect("load_state"); + let v2 = dst.save_state(); + + assert_eq!(v1, v2, "Device save_state mismatch after load_state round-trip"); +} +``` + +## Conventions + +- **Mutate first.** Saving an all-default state proves nothing — a load that no-ops on every field will pass. +- **Use null/CI constructors when devices bind ports.** Z85c30::new_null avoids TCP 8880/8881; Ioc::new_ci uses null backends. 
+- **If load_state has a side-effect that derives state from other fields, call it on src before saving.** Example: IOC update_interrupts re-derives MAP_INT0/MAP_INT1 cascade bits in l0_stat/l1_stat from (map_stat & map_mask{0,1}). Save the post-derive state so v1 already includes the cascade — otherwise v2 differs by the cascade bits. +- **Disable wall-clock-driven side effects.** RTC: clear TE_BIT before saving so save_state doesn't tick the host clock between v1 and v2. + +## What's covered + +eeprom_93c56, ds1x86, ioc, pit8254, mc, mips_tlb, ps2, z85c30, wd33c93a, seeq8003. + +## What's not (yet) covered + +- hpc3 — composite of nested devices. Round-trip indirectly tested via end-to-end snapshot/restore. +- rex3 — 16 MB framebuffers + massive VC2/CMAP/XMAP state. +- mips_exec — needs Tlb+Cache type params + Bus integration. + +These are exercised by the end-to-end snapshot/restore validation in the CI socket workflow. + diff --git a/rules/snapshot/scratch-scsi-volume-sgi-vh-layout-and-irix-raw-device-gotchas.md b/rules/snapshot/scratch-scsi-volume-sgi-vh-layout-and-irix-raw-device-gotchas.md new file mode 100644 index 0000000..4e8a377 --- /dev/null +++ b/rules/snapshot/scratch-scsi-volume-sgi-vh-layout-and-irix-raw-device-gotchas.md @@ -0,0 +1,40 @@ +# Scratch SCSI volume - SGI VH layout and IRIX raw-device gotchas + +**Keywords:** scratch,sgi,vh,volume,header,partition,scsi,raw,dd,iris,irix,phase2.4 +**Category:** snapshot + +# Scratch SCSI Volume (Phase 2.4) + +A SCSI device with `scratch = true` in `iris.toml` is a host-controlled raw block device for file injection/extraction without networking. iris pre-formats it with a minimal SGI Volume Header. 
+ +## Partition layout + +| Slot | Device node | Purpose | Type | first_block | nblks | +|------|------------------------|--------------------------|-------------|-------------|-------------| +| 0 | /dev/rdsk/dks0dNs0 | Payload (host writes) | PT_RAW=3 | 8 | total - 8 | +| 8 | /dev/rdsk/dks0dNvh | Volume header itself | PT_VOLHDR=0 | 0 | 8 | +| 10 | /dev/rdsk/dks0dNvol | Whole-disk view | PT_VOLUME=6 | 0 | total | + +Slot 10 (vol) is special - by SGI convention it always starts at sector 0 regardless of first_block. Use slot 0 (s0) for payload reads. + +## Host wire format + +scratch-write and scratch-read operate on the payload area. offset = 0 means raw-byte 4096 in the underlying file (the first byte after the VH). The CI commands never touch the VH. + +``` +host: iris CI: scratch-write {host_path: "bundle.tar"} +guest: dd if=/dev/rdsk/dks0d2s0 bs=512 | tar xf - +``` + +## IRIX gotchas + +1. Reads must be sector-aligned. dd bs=64 returns "Read error: I/O error" with no SCSI-level error. Use bs=512 (or any 512-multiple). +2. Writes must be padded to bs. dd bs=512 from a 28-byte file produces "0+1 records in / 0+0 records out" with "Write error: I/O error". Add conv=sync to pad with zeros, plus conv=notrunc if you don't want to truncate the device file: + `dd if=/tmp/data of=/dev/rdsk/dks0d2s0 bs=512 conv=sync,notrunc` +3. Without a valid VH at sector 0, IRIX creates the device nodes but every read returns I/O error. +4. Checksum is required: vh_csum at offset 0x1F8 must make the sum of all 128 big-endian u32 words equal 0. iris computes this in sgi_vh::fix_csum. + +## When to use scratch over unfsd + +unfsd needs a manual build on macOS, is flaky in our experience, and requires IRIX networking before any file movement. The scratch volume works at PROM time, single-user, or any other phase. 
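Gotcha 4's checksum rule is small enough to sketch. The function below is a hypothetical illustration of that rule, not iris's actual `sgi_vh::fix_csum`: zero the csum slot, sum the 512-byte header as 128 big-endian u32 words, and store the wrapping negation at offset 0x1F8 so the total wraps to zero.

```rust
/// Hypothetical sketch of the vh_csum rule (not iris's `sgi_vh::fix_csum`):
/// pick the u32 at byte offset 0x1F8 so that the wrapping sum of all 128
/// big-endian words in the 512-byte volume header is exactly 0.
fn fix_csum(vh: &mut [u8; 512]) {
    vh[0x1F8..0x1FC].copy_from_slice(&[0u8; 4]); // zero the csum slot first
    let mut sum: u32 = 0;
    for w in vh.chunks_exact(4) {
        sum = sum.wrapping_add(u32::from_be_bytes([w[0], w[1], w[2], w[3]]));
    }
    // The negation of the other 127 words' sum makes the total wrap to 0.
    vh[0x1F8..0x1FC].copy_from_slice(&sum.wrapping_neg().to_be_bytes());
}
```

Any non-zero header content works the same way; only the word at 0x1F8 changes.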
+
diff --git a/rules/snapshot/snapshot-manifest-format-snapshottoml-schema-version1.md b/rules/snapshot/snapshot-manifest-format-snapshottoml-schema-version1.md
new file mode 100644
index 0000000..6eb0252
--- /dev/null
+++ b/rules/snapshot/snapshot-manifest-format-snapshottoml-schema-version1.md
@@ -0,0 +1,37 @@
+# Snapshot manifest format (snapshot.toml schema_version=1)
+
+**Keywords:** snapshot,manifest,schema,version,parent,host_arch,iris_git_rev,saves
+**Category:** snapshot
+
+# Snapshot Manifest (snapshot.toml)
+
+Every snapshot saved by Phase 1+ writes a `snapshot.toml` at the top of `saves/<name>/`. It is read FIRST on load so format mismatches fail fast with a clear error, before any device state is touched.
+
+## Schema
+```toml
+schema_version = 1            # u32, current = 1
+host_arch = "aarch64"         # std::env::consts::ARCH at save time
+created_at_unix = 1777764190  # u64 unix seconds
+installed_bundles = []        # Vec<String>; populated by mogrix tooling
+# optional:
+iris_git_rev = "abc123"       # from option_env!("IRIS_GIT_REV") at build time
+parent = "base/desktop"       # name of the snapshot we restored from before this save
+description = "post-mogrix"   # free-form note
+```
+
+## Load behavior (`src/machine.rs:633` `load_snapshot`)
+- **No manifest** → treated as legacy v0 with a warning. Best-effort load. (Old `saves/working*` snapshots are v0.)
+- **`schema_version > 1`** → refuse: "snapshot schema_version N is newer than this iris build supports (1)".
+- **`host_arch` mismatch** → refuse. FPU bit-layout differs cross-arch and there's no migration plumbing yet.
+- **`iris_git_rev` mismatch with current build** → warn but proceed. Snapshots are not pinned to commits.
+
+## Where it's defined
+- `src/snapshot.rs` — `Manifest` struct, `to_toml`/`from_toml`, `Snapshot::write_manifest`/`read_manifest`, `SCHEMA_VERSION` const.
+- `src/machine.rs:594` writes the manifest first thing in `save_snapshot`. `parent` is auto-set to `self.last_restore`.
+- `src/machine.rs:643-672` validates on load.
+
+## CI inspection
+- `info <name>` socket command returns the manifest plus `bytes_on_disk` for any snapshot. Legacy snapshots return `{"schema_version":0,"legacy":true}`.
+
+## Future bumps
+When a device's `save_state` format changes incompatibly, increment `SCHEMA_VERSION` and add migration logic keyed off the old version number. Don't silently break v1 readers.
diff --git a/rules/snapshot/snapshot-v2-device-state-is-postcard-encoded-binvalue-bin.md b/rules/snapshot/snapshot-v2-device-state-is-postcard-encoded-binvalue-bin.md
new file mode 100644
index 0000000..8664839
--- /dev/null
+++ b/rules/snapshot/snapshot-v2-device-state-is-postcard-encoded-binvalue-bin.md
@@ -0,0 +1,46 @@
+# Snapshot v2 device state is postcard-encoded BinValue *.bin
+
+**Keywords:** snapshot,binvalue,postcard,schema_version,binary,device,state
+**Category:** snapshot
+
+# Snapshot v2: Postcard BinValue Device State
+
+For schema_version=2 snapshots, every per-device save_state lives in `<device>.bin` (postcard-encoded) instead of `<device>.toml`.
+
+## Why a tagged enum
+
+Postcard is non-self-describing — it cannot deserialize directly into `toml::Value`, whose Deserialize impl uses `deserialize_any`. To round-trip toml::Value we mirror it as `BinValue` (tagged) in `src/snapshot.rs`:
+
+```rust
+pub enum BinValue {
+    String(String), Integer(i64), Float(f64), Boolean(bool),
+    Array(Vec<BinValue>), Table(Vec<(String, BinValue)>),
+    Datetime(String), // ISO-8601, falls back to String on parse error
+}
+```
+
+The conversion `toml::Value` <-> `BinValue` is a single tree walk; sub-millisecond for typical device tables.
+
+## What's TOML, what's binary
+
+- TOML (text): `snapshot.toml` (manifest), `cow.toml` (overlay dirty sectors).
+- Binary (postcard): `cpu.bin`, `mc.bin`, `ioc.bin`, `scc.bin`, `pit.bin`, `ps2.bin`, `rtc.bin`, `eeprom.bin`, `scsi.bin`, `seeq.bin`, `hpc3.bin`, `rex3.bin` (when REX3 present).
+
+- Raw (untouched): `bank0..3.bin` (RAM), `rex3_rgb/aux.bin` (framebuffers), `scsi*.overlay` (COW).
+
+## Save/load helpers
+
+`Snapshot::write_state(base, &Value, schema_version)` picks `.bin` for v2+ and `.toml` for legacy. Mirror: `read_state(base, schema_version)`. v2 read also falls back to .toml when .bin is missing — half-migrated snapshots from external tooling still load.
+
+## Performance
+
+cpu state on M2 (3.6 MB cpu.toml legacy):
+- TOML parse: 19.7 ms avg
+- Postcard decode + BinValue->Value: 5.8 ms avg
+- 3.4x speedup, 24% size reduction (2.79 MB bin vs 3.65 MB toml)
+
+End-to-end cold restore: 189 ms (v1) -> 145 ms (v2), ~23%.
+
+## Backward compatibility
+
+Manifest read first. Missing manifest -> schema_version=0, loads `*.toml`. Manifest version > SCHEMA_VERSION -> hard refuse. host_arch mismatch -> hard refuse (FPU bit-layout differs cross-arch).
+
diff --git a/rust-toolchain.toml b/rust-toolchain.toml
new file mode 100644
index 0000000..2bf54c6
--- /dev/null
+++ b/rust-toolchain.toml
@@ -0,0 +1,3 @@
+[toolchain]
+channel = "nightly"
+components = ["rustc", "cargo", "rust-std", "clippy", "rustfmt"]
diff --git a/src/chunk_store.rs b/src/chunk_store.rs
new file mode 100644
index 0000000..6ee7e68
--- /dev/null
+++ b/src/chunk_store.rs
@@ -0,0 +1,321 @@
+//! Phase 3.1: content-addressable chunk store for snapshot RAM.
+//!
+//! Each snapshot's RAM banks are split into 64 KB chunks, BLAKE3-hashed,
+//! and stored as `saves/.cas/<aa>/<tail>.chunk` (sharded by the first byte to
+//! keep any one directory under a few thousand files). Snapshots reference
+//! chunks by hash; identical chunks across snapshots share storage. A
+//! `mogrix-bundle-test` workflow that snapshots between every install
+//! shares 95–99% of RAM with its parent, so adding a new snapshot costs
+//! only the bytes that actually changed.
+//!
+//! Layout:
+//! ```text
+//! saves/.cas/
+//!   ab/
+//!     cd1234...beef.chunk   ← BLAKE3 hash, hex64, raw 64KB content
+//!   cd/
+//!     ef9876...cafe.chunk
+//! ```
+//!
+//! On-disk chunks are immutable (CAS). `gc(live_set)` deletes any chunk
+//! whose hash isn't referenced by a kept snapshot's manifest — cheap to run
+//! and the only way to actually free space (since `delete <name>` only
+//! removes the manifest, not the underlying chunks).
+
+use std::collections::HashSet;
+use std::fs;
+use std::io::{self, Read, Write};
+use std::path::{Path, PathBuf};
+
+/// Chunk size in bytes. 64 KB is the plan-cited sweet spot — small enough
+/// that a few-page write to RAM only dirties one chunk, large enough that
+/// per-chunk hashing + filesystem overhead doesn't dominate.
+pub const CHUNK_SIZE: usize = 64 * 1024;
+
+const CAS_DIR: &str = ".cas";
+const CHUNK_EXT: &str = "chunk";
+
+/// 32-byte BLAKE3 digest.
+pub type ChunkHash = [u8; 32];
+
+pub struct ChunkStore {
+    root: PathBuf,
+}
+
+impl ChunkStore {
+    /// `saves_dir` is e.g. `Path::new("saves")`. The chunk store lives at
+    /// `saves_dir/.cas/`.
+    pub fn new(saves_dir: impl AsRef<Path>) -> Self {
+        Self { root: saves_dir.as_ref().join(CAS_DIR) }
+    }
+
+    pub fn root(&self) -> &Path { &self.root }
+
+    /// Hash `data`, write it as `saves/.cas/<aa>/<tail>.chunk` if absent,
+    /// return the hash. Idempotent — concurrent saves of the same chunk are
+    /// safe; the second call is a no-op.
+    ///
+    /// Crash-safety: chunks are written to a `.tmp` sibling then renamed
+    /// (atomic on POSIX), so a partial write never appears under the final
+    /// content-addressed name. We deliberately skip per-chunk `fsync` —
+    /// 4096 fsyncs per snapshot was costing ~20 s on APFS for the first
+    /// save of a 256 MB image. If the process dies mid-save the manifest
+    /// (`chunks.bin`) hasn't been written yet, so any complete chunks are
+    /// just orphaned bytes that `gc` will sweep later.
+    pub fn put(&self, data: &[u8]) -> io::Result<ChunkHash> {
+        let hash: ChunkHash = blake3::hash(data).into();
+        let path = self.path_for(&hash);
+        if path.exists() {
+            return Ok(hash);
+        }
+        if let Some(dir) = path.parent() {
+            fs::create_dir_all(dir)?;
+        }
+        let tmp = path.with_extension("chunk.tmp");
+        {
+            let mut f = fs::File::create(&tmp)?;
+            f.write_all(data)?;
+        }
+        // Rename is atomic on POSIX. If two threads raced, the loser's
+        // rename overwrites the winner's identical content — fine.
+        fs::rename(&tmp, &path)?;
+        Ok(hash)
+    }
+
+    pub fn get(&self, hash: &ChunkHash) -> io::Result<Vec<u8>> {
+        let path = self.path_for(hash);
+        let mut f = fs::File::open(&path)?;
+        let mut data = Vec::with_capacity(CHUNK_SIZE);
+        f.read_to_end(&mut data)?;
+        Ok(data)
+    }
+
+    pub fn has(&self, hash: &ChunkHash) -> bool {
+        self.path_for(hash).exists()
+    }
+
+    /// Remove any chunk whose hash isn't in `live`. Returns (removed_count,
+    /// removed_bytes). Safe to interrupt — chunks not yet visited stay.
+    pub fn gc(&self, live: &HashSet<ChunkHash>) -> io::Result<(usize, u64)> {
+        if !self.root.is_dir() {
+            return Ok((0, 0));
+        }
+        let mut removed = 0usize;
+        let mut bytes_removed = 0u64;
+        for shard in fs::read_dir(&self.root)? {
+            let shard = shard?;
+            if !shard.file_type()?.is_dir() { continue; }
+            for chunk in fs::read_dir(shard.path())? {
+                let chunk = chunk?;
+                let path = chunk.path();
+                let Some(stem) = path.file_stem().and_then(|s| s.to_str()) else { continue };
+                let Some(hash) = parse_hex62(stem, &shard.file_name().to_string_lossy()) else { continue };
+                if !live.contains(&hash) {
+                    let size = chunk.metadata().map(|m| m.len()).unwrap_or(0);
+                    if fs::remove_file(&path).is_ok() {
+                        removed += 1;
+                        bytes_removed += size;
+                    }
+                }
+            }
+        }
+        Ok((removed, bytes_removed))
+    }
+
+    /// Total bytes occupied by the chunk store. Useful for `info` reporting.
+    pub fn total_size(&self) -> io::Result<u64> {
+        if !self.root.is_dir() { return Ok(0); }
+        let mut total = 0u64;
+        for shard in fs::read_dir(&self.root)? {
+            let shard = shard?;
+            if !shard.file_type()?.is_dir() { continue; }
+            for chunk in fs::read_dir(shard.path())? {
+                let chunk = chunk?;
+                total += chunk.metadata().map(|m| m.len()).unwrap_or(0);
+            }
+        }
+        Ok(total)
+    }
+
+    pub fn path_for(&self, hash: &ChunkHash) -> PathBuf {
+        let hex = hex_encode(hash);
+        // Shard by first byte: saves/.cas/ab/cd1234...beef.chunk
+        let (head, tail) = hex.split_at(2);
+        self.root.join(head).join(format!("{}.{}", tail, CHUNK_EXT))
+    }
+}
+
+fn hex_encode(bytes: &[u8; 32]) -> String {
+    const HEX: &[u8; 16] = b"0123456789abcdef";
+    let mut s = String::with_capacity(64);
+    for &b in bytes.iter() {
+        s.push(HEX[(b >> 4) as usize] as char);
+        s.push(HEX[(b & 0x0f) as usize] as char);
+    }
+    s
+}
+
+fn parse_hex62(tail: &str, head: &str) -> Option<ChunkHash> {
+    if tail.len() != 62 || head.len() != 2 { return None; }
+    let mut out = [0u8; 32];
+    let mut full = String::with_capacity(64);
+    full.push_str(head);
+    full.push_str(tail);
+    let bytes = full.as_bytes();
+    for i in 0..32 {
+        out[i] = (hex_nibble(bytes[i * 2])? << 4) | hex_nibble(bytes[i * 2 + 1])?;
+    }
+    Some(out)
+}
+
+fn hex_nibble(c: u8) -> Option<u8> {
+    match c {
+        b'0'..=b'9' => Some(c - b'0'),
+        b'a'..=b'f' => Some(10 + c - b'a'),
+        b'A'..=b'F' => Some(10 + c - b'A'),
+        _ => None,
+    }
+}
+
+/// Walk `words` (host-endian u32) as big-endian byte chunks of `CHUNK_SIZE`,
+/// store each chunk via `store.put`, and collect the hashes in order. The
+/// final chunk may be smaller if `words.len() * 4` isn't a multiple of
+/// `CHUNK_SIZE`. Returns the per-chunk hash list — concat'ing those chunks
+/// in order reproduces the bank's BE byte stream exactly.
+pub fn put_words_as_chunks(
+    store: &ChunkStore,
+    words: &[u32],
+) -> io::Result<Vec<ChunkHash>> {
+    let bytes_total = words.len() * 4;
+    let chunk_words = CHUNK_SIZE / 4;
+    let mut hashes = Vec::with_capacity(bytes_total.div_ceil(CHUNK_SIZE));
+    let mut buf = vec![0u8; CHUNK_SIZE];
+    let mut i = 0usize;
+    while i < words.len() {
+        let take = (words.len() - i).min(chunk_words);
+        let bytes_this_chunk = take * 4;
+        for (k, &w) in words[i..i + take].iter().enumerate() {
+            buf[k * 4..k * 4 + 4].copy_from_slice(&w.to_be_bytes());
+        }
+        let chunk_slice = &buf[..bytes_this_chunk];
+        hashes.push(store.put(chunk_slice)?);
+        i += take;
+    }
+    Ok(hashes)
+}
+
+/// Inverse of `put_words_as_chunks`. Given a hash list, fetch each chunk
+/// and decode BE bytes back into a `Vec<u32>`. Caller is responsible for
+/// cross-checking the resulting length against the bank's expected size.
+pub fn get_chunks_as_words(
+    store: &ChunkStore,
+    hashes: &[ChunkHash],
+) -> io::Result<Vec<u32>> {
+    let mut words = Vec::with_capacity(hashes.len() * (CHUNK_SIZE / 4));
+    for h in hashes {
+        let bytes = store.get(h)?;
+        if bytes.len() % 4 != 0 {
+            return Err(io::Error::new(
+                io::ErrorKind::InvalidData,
+                format!("chunk size {} not a multiple of 4", bytes.len()),
+            ));
+        }
+        for chunk in bytes.chunks_exact(4) {
+            words.push(u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
+        }
+    }
+    Ok(words)
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn unique_tmp_dir(tag: &str) -> PathBuf {
+        let nanos = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .map(|d| d.as_nanos())
+            .unwrap_or(0);
+        let p = std::env::temp_dir().join(format!("iris-cas-{}-{}", tag, nanos));
+        fs::create_dir_all(&p).unwrap();
+        p
+    }
+
+    #[test]
+    fn put_get_round_trip() {
+        let dir = unique_tmp_dir("rt");
+        let store = ChunkStore::new(&dir);
+        let data = b"hello world chunk content here";
+        let h = store.put(data).unwrap();
+        assert!(store.has(&h));
+        assert_eq!(store.get(&h).unwrap(), data);
+        let _ = fs::remove_dir_all(&dir);
+    }
+
+    #[test]
+    fn put_dedupes_identical_content() {
+        let dir = unique_tmp_dir("dedupe");
+        let store = ChunkStore::new(&dir);
+        let data = vec![0xAB; 1024];
+        let h1 = store.put(&data).unwrap();
+        let h2 = store.put(&data).unwrap();
+        assert_eq!(h1, h2);
+        // Only one file on disk.
+        let mut count = 0;
+        for shard in fs::read_dir(&dir.join(".cas")).unwrap() {
+            for _ in fs::read_dir(shard.unwrap().path()).unwrap() {
+                count += 1;
+            }
+        }
+        assert_eq!(count, 1, "duplicate put should not write twice");
+        let _ = fs::remove_dir_all(&dir);
+    }
+
+    #[test]
+    fn put_get_words_round_trip() {
+        let dir = unique_tmp_dir("words");
+        let store = ChunkStore::new(&dir);
+        // 33 KB worth of words — exercises the partial-final-chunk path.
+        let words: Vec<u32> = (0..33 * 256).map(|i| 0x80000000_u32 ^ (i as u32)).collect();
+        let hashes = put_words_as_chunks(&store, &words).unwrap();
+        let got = get_chunks_as_words(&store, &hashes).unwrap();
+        assert_eq!(got, words);
+        let _ = fs::remove_dir_all(&dir);
+    }
+
+    #[test]
+    fn put_words_two_banks_share_zero_chunks() {
+        // Two all-zero banks should produce the same hashes — same chunk
+        // stored once, both bank manifests reference it.
+        let dir = unique_tmp_dir("zero");
+        let store = ChunkStore::new(&dir);
+        let words_a = vec![0u32; CHUNK_SIZE / 4];
+        let words_b = vec![0u32; CHUNK_SIZE / 4];
+        let h_a = put_words_as_chunks(&store, &words_a).unwrap();
+        let h_b = put_words_as_chunks(&store, &words_b).unwrap();
+        assert_eq!(h_a, h_b);
+        // One physical chunk file.
+ let mut count = 0; + for shard in fs::read_dir(&dir.join(".cas")).unwrap() { + for _ in fs::read_dir(shard.unwrap().path()).unwrap() { + count += 1; + } + } + assert_eq!(count, 1, "two zero banks must dedupe to a single chunk"); + let _ = fs::remove_dir_all(&dir); + } + + #[test] + fn gc_removes_unreferenced() { + let dir = unique_tmp_dir("gc"); + let store = ChunkStore::new(&dir); + let h_keep = store.put(b"keep me").unwrap(); + let _h_drop = store.put(b"drop me").unwrap(); + let mut live = HashSet::new(); + live.insert(h_keep); + let (removed, _bytes) = store.gc(&live).unwrap(); + assert_eq!(removed, 1); + assert!(store.has(&h_keep)); + let _ = fs::remove_dir_all(&dir); + } +} diff --git a/src/ci.rs b/src/ci.rs new file mode 100644 index 0000000..34b0ddc --- /dev/null +++ b/src/ci.rs @@ -0,0 +1,1014 @@ +//! CI control socket. +//! +//! Unix domain socket that drives the emulator for automated testing. The +//! protocol is newline-delimited JSON, strict request/response, single client. +//! See `ci_mode_plan.md` in the repo root. + +#![cfg(unix)] + +use std::io::{BufRead, BufReader, Write}; +use std::os::unix::net::{UnixListener, UnixStream}; +use std::sync::Arc; +use std::thread; +use std::time::Duration; + +use parking_lot::Mutex; +use serde::{Deserialize, Serialize}; +use serde_json::Value; + +use crate::machine::Machine; +use crate::rex3::Rex3; +use crate::z85c30::CiSerialBackend; + +/// Set at `start_server`; consulted by `quit` so the socket file is cleaned up +/// before `std::process::exit` (which skips Drop). 
+static SOCKET_PATH: Mutex> = Mutex::new(None); + +#[derive(Deserialize)] +struct Request { + cmd: String, + #[serde(default)] + args: Value, +} + +#[derive(Serialize)] +struct Response { + ok: bool, + #[serde(skip_serializing_if = "Option::is_none")] + data: Option, + #[serde(skip_serializing_if = "Option::is_none")] + error: Option, +} + +impl Response { + fn ok() -> Self { Self { ok: true, data: None, error: None } } + fn data(v: Value) -> Self { Self { ok: true, data: Some(v), error: None } } + fn err(msg: impl Into) -> Self { + Self { ok: false, data: None, error: Some(msg.into()) } + } +} + +// ---------------------------------------------------------------------------- +// Server +// ---------------------------------------------------------------------------- + +/// Holder for the raw `*mut Machine` passed in from `main`. The pointer is +/// valid for the process lifetime because `Machine` lives on main's stack. +/// Mirrors the `SystemController` pattern in `machine.rs`. +struct MachinePtr(*mut Machine); +unsafe impl Send for MachinePtr {} +unsafe impl Sync for MachinePtr {} + +pub struct CiServer { + socket_path: String, + machine: Arc>, + ci_serial: Arc, + /// Optional in case --headless is also passed (no REX3). Screenshot + /// commands return an error in that case. + rex3: Option>, +} + +impl Drop for CiServer { + fn drop(&mut self) { + let _ = std::fs::remove_file(&self.socket_path); + } +} + +impl CiServer { + fn with_machine(&self, f: impl FnOnce(&mut Machine) -> R) -> R { + let mut guard = self.machine.lock(); + // SAFETY: pointer is valid for process lifetime; this mutex serializes + // all Machine accesses from CI command handlers. CPU/peripheral threads + // observe state changes only when the methods we call stop them first + // (ci_restore/ci_rollback do). + let machine = unsafe { &mut *(guard.0) }; + f(machine) + } +} + +/// Bind the control socket, spawn the accept thread, return a handle. 
+/// +/// # Safety +/// `machine_ptr` must remain valid for the process lifetime. Pass the address +/// of a `Machine` owned by `main`'s stack (or a heap-pinned Box that `main` +/// keeps alive). +pub fn start_server( + machine_ptr: *mut Machine, + socket_path: &str, +) -> Result, String> { + // SAFETY: caller guarantees the pointer is valid. + let ci_serial = unsafe { (*machine_ptr).get_ci_serial() } + .ok_or_else(|| "CI mode: CiSerialBackend not installed on Machine".to_string())?; + let rex3 = unsafe { (*machine_ptr).get_rex3() }; + + let path = socket_path.to_string(); + // Clear stale socket from a previous run. + let _ = std::fs::remove_file(&path); + let listener = UnixListener::bind(&path) + .map_err(|e| format!("failed to bind {}: {}", path, e))?; + + eprintln!("iris: --ci control socket listening at {}", path); + + *SOCKET_PATH.lock() = Some(path.clone()); + + let server = Arc::new(CiServer { + socket_path: path, + machine: Arc::new(Mutex::new(MachinePtr(machine_ptr))), + ci_serial, + rex3, + }); + + let server_clone = server.clone(); + thread::Builder::new() + .name("iris-ci-accept".into()) + .spawn(move || { + for conn in listener.incoming() { + match conn { + Ok(stream) => { + let s = server_clone.clone(); + thread::Builder::new() + .name("iris-ci-handler".into()) + .spawn(move || handle_client(s, stream)) + .ok(); + } + Err(e) => eprintln!("iris-ci-accept: {}", e), + } + } + }) + .map_err(|e| format!("failed to spawn CI accept thread: {}", e))?; + + Ok(server) +} + +// ---------------------------------------------------------------------------- +// Connection handling +// ---------------------------------------------------------------------------- + +fn handle_client(server: Arc, stream: UnixStream) { + let reader = match stream.try_clone() { + Ok(s) => BufReader::new(s), + Err(e) => { + eprintln!("iris-ci-handler: clone failed: {}", e); + return; + } + }; + let mut writer = stream; + for line in reader.lines() { + let Ok(line) = line else { break }; 
+        let trimmed = line.trim();
+        if trimmed.is_empty() { continue; }
+
+        let response = match serde_json::from_str::<Request>(trimmed) {
+            Ok(req) => dispatch(&server, &req),
+            Err(e) => Response::err(format!("invalid json: {}", e)),
+        };
+
+        let mut out = match serde_json::to_vec(&response) {
+            Ok(v) => v,
+            Err(e) => serde_json::to_vec(&Response::err(format!("encode: {}", e))).unwrap_or_default(),
+        };
+        out.push(b'\n');
+        if writer.write_all(&out).is_err() { break; }
+    }
+}
+
+// ----------------------------------------------------------------------------
+// Dispatch
+// ----------------------------------------------------------------------------
+
+fn dispatch(server: &CiServer, req: &Request) -> Response {
+    match req.cmd.as_str() {
+        "ping"          => Response::ok(),
+        "quit"          => cmd_quit(),
+        "start"         => cmd_start(server),
+        "save"          => cmd_save(server, &req.args),
+        "restore"       => cmd_restore(server, &req.args),
+        "rollback"      => cmd_rollback(server),
+        "list"          => cmd_list(&req.args),
+        "info"          => cmd_info(&req.args),
+        "delete"        => cmd_delete(&req.args),
+        "serial-send"   => cmd_serial_send(server, &req.args),
+        "serial-read"   => cmd_serial_read(server),
+        "wait-serial"   => cmd_wait_serial(server, &req.args),
+        "screenshot"    => cmd_screenshot(server, &req.args),
+        "scratch-write" => cmd_scratch_write(server, &req.args),
+        "scratch-read"  => cmd_scratch_read(server, &req.args),
+        "scratch-clear" => cmd_scratch_clear(server),
+        "scratch-info"  => cmd_scratch_info(server),
+        "validate"      => cmd_validate(server, &req.args),
+        "gc"            => cmd_gc(),
+        "diff"          => cmd_diff(&req.args),
+        "tree"          => cmd_tree(),
+        "pull"          => cmd_pull(&req.args),
+        "push"          => cmd_push(&req.args),
+        other => Response::err(format!("unknown command: {}", other)),
+    }
+}
+
+fn cmd_quit() -> Response {
+    // Schedule process exit after a brief delay so the response flushes.
+ thread::spawn(|| { + thread::sleep(Duration::from_millis(50)); + if let Some(p) = SOCKET_PATH.lock().take() { + let _ = std::fs::remove_file(&p); + } + std::process::exit(0); + }); + Response::ok() +} + +fn cmd_start(server: &CiServer) -> Response { + server.with_machine(|m| m.cpu_start()); + Response::ok() +} + +fn cmd_save(server: &CiServer, args: &Value) -> Response { + let name = match args.get("name").and_then(|v| v.as_str()) { + Some(n) => n.to_string(), + None => return Response::err("save: missing 'name' arg"), + }; + match server.with_machine(|m| m.save_snapshot(&name)) { + Ok(()) => Response::ok(), + Err(e) => Response::err(format!("save failed: {}", e)), + } +} + +fn cmd_restore(server: &CiServer, args: &Value) -> Response { + let name = match args.get("name").and_then(|v| v.as_str()) { + Some(n) => n.to_string(), + None => return Response::err("restore: missing 'name' arg"), + }; + match server.with_machine(|m| m.ci_restore(&name)) { + Ok(()) => Response::ok(), + Err(e) => Response::err(format!("restore failed: {}", e)), + } +} + +fn cmd_rollback(server: &CiServer) -> Response { + match server.with_machine(|m| m.ci_rollback()) { + Ok(()) => Response::ok(), + Err(e) => Response::err(format!("rollback failed: {}", e)), + } +} + +fn cmd_list(_args: &Value) -> Response { + // Walk saves/ recursively, return every directory that contains a + // snapshot.toml (current format) OR a cpu.toml (legacy v0). Names are + // returned slash-joined relative to saves/. 
+    let root = std::path::Path::new("saves");
+    if !root.is_dir() {
+        return Response::data(serde_json::json!({ "snapshots": [] }));
+    }
+    let mut out: Vec<String> = Vec::new();
+    let mut stack: Vec<std::path::PathBuf> = vec![root.to_path_buf()];
+    while let Some(dir) = stack.pop() {
+        let entries = match std::fs::read_dir(&dir) {
+            Ok(e) => e,
+            Err(_) => continue,
+        };
+        let mut subdirs: Vec<std::path::PathBuf> = Vec::new();
+        let mut is_snapshot = false;
+        for e in entries.flatten() {
+            let p = e.path();
+            if p.is_dir() {
+                subdirs.push(p);
+            } else if let Some(name) = p.file_name().and_then(|n| n.to_str()) {
+                if name == "snapshot.toml" || name == "cpu.toml" {
+                    is_snapshot = true;
+                }
+            }
+        }
+        if is_snapshot {
+            if let Ok(rel) = dir.strip_prefix(root) {
+                let s = rel.to_string_lossy().replace('\\', "/");
+                if !s.is_empty() {
+                    out.push(s);
+                }
+            }
+        }
+        for s in subdirs {
+            stack.push(s);
+        }
+    }
+    out.sort();
+    Response::data(serde_json::json!({ "snapshots": out }))
+}
+
+fn cmd_info(args: &Value) -> Response {
+    let name = match args.get("name").and_then(|v| v.as_str()) {
+        Some(n) => n,
+        None => return Response::err("info: missing 'name' arg"),
+    };
+    let dir = std::path::Path::new("saves").join(name);
+    if !dir.is_dir() {
+        return Response::err(format!("info: snapshot '{}' not found", name));
+    }
+    let snap = crate::snapshot::Snapshot::new(&dir);
+    let manifest = match snap.read_manifest() {
+        Ok(Some(m)) => Some(m),
+        Ok(None) => None,
+        Err(e) => return Response::err(format!("info: manifest read failed: {}", e)),
+    };
+
+    // Disk usage rollup: sum file sizes inside the snapshot dir.
+ let mut bytes_on_disk: u64 = 0; + if let Ok(walker) = std::fs::read_dir(&dir) { + for e in walker.flatten() { + if let Ok(meta) = e.metadata() { + if meta.is_file() { + bytes_on_disk += meta.len(); + } + } + } + } + + let mut out = serde_json::Map::new(); + out.insert("name".into(), Value::String(name.to_string())); + out.insert("bytes_on_disk".into(), Value::Number(bytes_on_disk.into())); + if let Some(m) = manifest { + out.insert("schema_version".into(), Value::Number(m.schema_version.into())); + out.insert("host_arch".into(), Value::String(m.host_arch)); + out.insert("created_at_unix".into(), Value::Number(m.created_at_unix.into())); + if let Some(rev) = m.iris_git_rev { out.insert("iris_git_rev".into(), Value::String(rev)); } + if let Some(p) = m.parent { out.insert("parent".into(), Value::String(p)); } + if let Some(d) = m.description { out.insert("description".into(), Value::String(d)); } + out.insert("installed_bundles".into(), + Value::Array(m.installed_bundles.into_iter().map(Value::String).collect())); + } else { + out.insert("schema_version".into(), Value::Number(0.into())); + out.insert("legacy".into(), Value::Bool(true)); + } + Response::data(Value::Object(out)) +} + +fn cmd_delete(args: &Value) -> Response { + let name = match args.get("name").and_then(|v| v.as_str()) { + Some(n) => n, + None => return Response::err("delete: missing 'name' arg"), + }; + if name.is_empty() || name.contains("..") { + return Response::err("delete: invalid name"); + } + let dir = std::path::Path::new("saves").join(name); + if !dir.is_dir() { + return Response::err(format!("delete: snapshot '{}' not found", name)); + } + if let Err(e) = std::fs::remove_dir_all(&dir) { + return Response::err(format!("delete: {}: {}", dir.display(), e)); + } + Response::ok() +} + +fn cmd_serial_send(server: &CiServer, args: &Value) -> Response { + let data = match args.get("data").and_then(|v| v.as_str()) { + Some(s) => s, + None => return Response::err("serial-send: missing 'data' arg"), 
+ }; + server.ci_serial.push_host(data.as_bytes()); + Response::ok() +} + +fn cmd_serial_read(server: &CiServer) -> Response { + let bytes = server.ci_serial.drain_guest(); + let s = String::from_utf8_lossy(&bytes).into_owned(); + Response::data(Value::String(s)) +} + +fn cmd_screenshot(server: &CiServer, args: &Value) -> Response { + let Some(rex3) = &server.rex3 else { + return Response::err("screenshot: REX3 not present (running with --headless?)"); + }; + let Some(path) = args.get("path").and_then(|v| v.as_str()) else { + return Response::err("screenshot: missing 'path' arg"); + }; + + // Snapshot the framebuffer under the screen lock; unlock before the PNG + // encode so the refresh thread isn't blocked during disk I/O. + let (width, height, rgba_copy) = { + let screen = rex3.screen.lock(); + let w = screen.width; + let h = screen.height; + let mut out = Vec::with_capacity(w * h); + // `rgba` has row stride 2048; copy the visible window. + for y in 0..h { + let base = y * 2048; + out.extend_from_slice(&screen.rgba[base..base + w]); + } + (w, h, out) + }; + + // Encode each u32 0xFFRRGGBB as 3 RGB bytes in the order the PNG encoder + // expects. 
+    let mut rgb = Vec::with_capacity(width * height * 3);
+    for px in &rgba_copy {
+        rgb.push(((px >> 16) & 0xff) as u8);
+        rgb.push(((px >> 8) & 0xff) as u8);
+        rgb.push((px & 0xff) as u8);
+    }
+
+    let file = match std::fs::File::create(path) {
+        Ok(f) => f,
+        Err(e) => return Response::err(format!("screenshot: create {}: {}", path, e)),
+    };
+    let bw = std::io::BufWriter::new(file);
+    let mut enc = png::Encoder::new(bw, width as u32, height as u32);
+    enc.set_color(png::ColorType::Rgb);
+    enc.set_depth(png::BitDepth::Eight);
+    let mut writer = match enc.write_header() {
+        Ok(w) => w,
+        Err(e) => return Response::err(format!("screenshot: png header: {}", e)),
+    };
+    if let Err(e) = writer.write_image_data(&rgb) {
+        return Response::err(format!("screenshot: png write: {}", e));
+    }
+
+    Response::data(serde_json::json!({
+        "path": path,
+        "width": width,
+        "height": height,
+        "bytes": rgb.len(), // raw RGB payload size, not the encoded PNG file size
+    }))
+}
+
+fn cmd_wait_serial(server: &CiServer, args: &Value) -> Response {
+    let pattern = match args.get("pattern").and_then(|v| v.as_str()) {
+        Some(p) => p.to_string(),
+        None => return Response::err("wait-serial: missing 'pattern' arg"),
+    };
+    let timeout_ms = args.get("timeout_ms").and_then(|v| v.as_u64()).unwrap_or(10_000);
+
+    match server.ci_serial.wait_for(pattern.as_bytes(), Duration::from_millis(timeout_ms)) {
+        Some(consumed) => {
+            let s = String::from_utf8_lossy(&consumed).into_owned();
+            Response::data(Value::String(s))
+        }
+        None => Response::err(format!("wait-serial: timeout after {}ms waiting for {:?}", timeout_ms, pattern)),
+    }
+}
+
+// ----------------------------------------------------------------------------
+// Scratch volume (Phase 2.4): file injection / extraction without networking.
+//
+// The scratch device is a raw SCSI LUN (`scratch = true` in iris.toml).
+// iris pre-formats the underlying file with a minimal SGI Volume Header at
+// sector 0 so IRIX recognises it (without the VH, /dev/rdsk/dks0dNvol
+// returns I/O error on every read). The VH defines partition slot 7
+// ("vol") spanning sectors 8..end and slot 8 ("vh") spanning sectors 0..7.
+//
+// Wire convention:
+//   - `scratch-write` and `scratch-read` operate on the *payload* area —
+//     `offset = 0` means the first byte after the VH (raw byte 4096 in the
+//     underlying file). The VH is never touched by these commands.
+//   - The guest reads the same payload at offset 0 of /dev/rdsk/dks0dNvol
+//     because partition 7's first_block = 8.
+//   - Typical guest read: `dd if=/dev/rdsk/dks0d2vol bs=64k | tar xf -`.
+//
+// Each scratch op briefly stops the machine to quiesce in-flight SCSI I/O
+// (Machine::with_paused). The CPU is restarted only if it was running before
+// — a scratch-write issued before the harness `start`s the CPU does not
+// auto-start it.
+// ----------------------------------------------------------------------------
+
+use crate::sgi_vh::SCRATCH_PAYLOAD_OFFSET;
+
+/// Reject names that would escape the host or smuggle in shell metachars. The
+/// host-side path is read by serde_json so quoting is already handled, but a
+/// caller-supplied "../" can still escape an intended sandbox.
+fn validate_host_path(p: &str) -> Result<std::path::PathBuf, String> {
+    if p.is_empty() {
+        return Err("path: empty".into());
+    }
+    let pb = std::path::PathBuf::from(p);
+    if pb.components().any(|c| matches!(c, std::path::Component::ParentDir)) {
+        return Err(format!("path: '..' 
components not allowed in {:?}", p));
+    }
+    Ok(pb)
+}
+
+fn cmd_scratch_write(server: &CiServer, args: &Value) -> Response {
+    let host_path = match args.get("host_path").and_then(|v| v.as_str()) {
+        Some(p) => match validate_host_path(p) {
+            Ok(pb) => pb,
+            Err(e) => return Response::err(format!("scratch-write: {}", e)),
+        },
+        None => return Response::err("scratch-write: missing 'host_path' arg"),
+    };
+    let offset = args.get("offset").and_then(|v| v.as_u64()).unwrap_or(0);
+
+    let bytes = match std::fs::read(&host_path) {
+        Ok(b) => b,
+        Err(e) => return Response::err(format!("scratch-write: read {}: {}", host_path.display(), e)),
+    };
+
+    let result = server.with_machine(|m| {
+        let scratch = match m.scratch_path() {
+            Some(p) => p.to_path_buf(),
+            None => return Err("scratch volume not configured (set `scratch = true` on a SCSI device in iris.toml)".to_string()),
+        };
+        m.with_paused(|| -> Result<u64, String> {
+            use std::io::{Seek, SeekFrom, Write};
+            let mut f = std::fs::OpenOptions::new()
+                .write(true)
+                .open(&scratch)
+                .map_err(|e| format!("open {}: {}", scratch.display(), e))?;
+            // Skip the VH partition; offset is relative to the payload area.
+            let raw_offset = SCRATCH_PAYLOAD_OFFSET.checked_add(offset)
+                .ok_or_else(|| "offset overflow".to_string())?;
+            f.seek(SeekFrom::Start(raw_offset)).map_err(|e| format!("seek: {}", e))?;
+            f.write_all(&bytes).map_err(|e| format!("write: {}", e))?;
+            f.sync_all().map_err(|e| format!("fsync: {}", e))?;
+            Ok(bytes.len() as u64)
+        })
+    });
+
+    match result {
+        Ok(n) => Response::data(serde_json::json!({
+            "bytes_written": n,
+            "offset": offset,
+            "host_path": host_path.display().to_string(),
+        })),
+        Err(e) => Response::err(format!("scratch-write: {}", e)),
+    }
+}
+
+fn cmd_scratch_read(server: &CiServer, args: &Value) -> Response {
+    let to_path = match args.get("to_path").and_then(|v| v.as_str()) {
+        Some(p) => match validate_host_path(p) {
+            Ok(pb) => pb,
+            Err(e) => return Response::err(format!("scratch-read: {}", e)),
+        },
+        None => return Response::err("scratch-read: missing 'to_path' arg"),
+    };
+    let offset = args.get("offset").and_then(|v| v.as_u64()).unwrap_or(0);
+    let length = args.get("length").and_then(|v| v.as_u64());
+
+    let result = server.with_machine(|m| {
+        let scratch = match m.scratch_path() {
+            Some(p) => p.to_path_buf(),
+            None => return Err("scratch volume not configured (set `scratch = true` on a SCSI device in iris.toml)".to_string()),
+        };
+        m.with_paused(|| -> Result<u64, String> {
+            use std::io::{Read, Seek, SeekFrom};
+            let mut f = std::fs::File::open(&scratch)
+                .map_err(|e| format!("open {}: {}", scratch.display(), e))?;
+            let total = f.metadata().map(|m| m.len()).unwrap_or(0);
+            let payload_total = total.saturating_sub(SCRATCH_PAYLOAD_OFFSET);
+            let raw_offset = SCRATCH_PAYLOAD_OFFSET.checked_add(offset)
+                .ok_or_else(|| "offset overflow".to_string())?;
+            let len = match length {
+                Some(n) => n.min(payload_total.saturating_sub(offset)),
+                None => payload_total.saturating_sub(offset),
+            };
+            f.seek(SeekFrom::Start(raw_offset)).map_err(|e| format!("seek: {}", e))?;
+            let mut buf = vec![0u8; len as usize];
+            f.read_exact(&mut buf).map_err(|e| 
format!("read: {}", e))?;
+            std::fs::write(&to_path, &buf)
+                .map_err(|e| format!("write {}: {}", to_path.display(), e))?;
+            Ok(buf.len() as u64)
+        })
+    });
+
+    match result {
+        Ok(n) => Response::data(serde_json::json!({
+            "bytes_read": n,
+            "offset": offset,
+            "to_path": to_path.display().to_string(),
+        })),
+        Err(e) => Response::err(format!("scratch-read: {}", e)),
+    }
+}
+
+fn cmd_scratch_clear(server: &CiServer) -> Response {
+    let result = server.with_machine(|m| {
+        let scratch = match m.scratch_path() {
+            Some(p) => p.to_path_buf(),
+            None => return Err("scratch volume not configured".to_string()),
+        };
+        m.with_paused(|| -> Result<u64, String> {
+            use std::io::{Seek, SeekFrom, Write};
+            let mut f = std::fs::OpenOptions::new()
+                .write(true)
+                .open(&scratch)
+                .map_err(|e| format!("open {}: {}", scratch.display(), e))?;
+            let size = f.metadata().map(|m| m.len()).unwrap_or(0);
+            // Zero only the payload area (after the VH). Zero in 1 MiB chunks
+            // rather than allocating a buffer the full size of the volume.
+ let chunk = vec![0u8; 1024 * 1024]; + f.seek(SeekFrom::Start(SCRATCH_PAYLOAD_OFFSET)) + .map_err(|e| format!("seek: {}", e))?; + let mut remaining = size.saturating_sub(SCRATCH_PAYLOAD_OFFSET); + while remaining > 0 { + let n = remaining.min(chunk.len() as u64) as usize; + f.write_all(&chunk[..n]).map_err(|e| format!("write: {}", e))?; + remaining -= n as u64; + } + f.sync_all().map_err(|e| format!("fsync: {}", e))?; + Ok(size.saturating_sub(SCRATCH_PAYLOAD_OFFSET)) + }) + }); + + match result { + Ok(n) => Response::data(serde_json::json!({ "bytes_cleared": n })), + Err(e) => Response::err(format!("scratch-clear: {}", e)), + } +} + +fn cmd_scratch_info(server: &CiServer) -> Response { + let path = server.with_machine(|m| m.scratch_path().map(|p| p.to_path_buf())); + let Some(path) = path else { + return Response::err("scratch-info: scratch volume not configured"); + }; + let size = std::fs::metadata(&path).map(|m| m.len()).unwrap_or(0); + Response::data(serde_json::json!({ + "path": path.display().to_string(), + "size_bytes": size, + "payload_offset": SCRATCH_PAYLOAD_OFFSET, + "payload_size_bytes": size.saturating_sub(SCRATCH_PAYLOAD_OFFSET), + })) +} + +// ---------------------------------------------------------------------------- +// Snapshot determinism validator (Phase 3.3) +// ---------------------------------------------------------------------------- + +fn cmd_validate(server: &CiServer, args: &Value) -> Response { + let name = match args.get("name").and_then(|v| v.as_str()) { + Some(n) => n.to_string(), + None => return Response::err("validate: missing 'name' arg"), + }; + let n = args + .get("n_instructions") + .and_then(|v| v.as_u64()) + .unwrap_or(1_000_000); + + let report_result = server.with_machine(|m| { + crate::validate::validate_snapshot_determinism(m, &name, n) + }); + + match report_result { + Ok(report) => Response::data(serde_json::json!({ + "deterministic": report.deterministic, + "instructions_run": report.instructions_run, + "summary": 
report.summary(),
+            "diffs": report.diffs.iter().map(|(f, a, b)| {
+                serde_json::json!({"field": f, "a": a, "b": b})
+            }).collect::<Vec<_>>(),
+            "pc": format!("0x{:016x}", report.state_a.pc),
+        })),
+        Err(e) => Response::err(format!("validate: {}", e)),
+    }
+}
+
+// ----------------------------------------------------------------------------
+// Snapshot library: gc / diff / tree (Phase 3.2)
+// ----------------------------------------------------------------------------
+
+/// Walk every snapshot directory under `saves/`, parse each `chunks.bin`, and
+/// collect the set of referenced chunk hashes. Used by `gc` to figure out
+/// which chunks are still live.
+fn collect_live_chunks() -> std::io::Result<std::collections::HashSet<[u8; 32]>> {
+    use std::collections::HashSet;
+    let mut live: HashSet<[u8; 32]> = HashSet::new();
+    let root = std::path::Path::new("saves");
+    if !root.is_dir() {
+        return Ok(live);
+    }
+    let mut stack: Vec<std::path::PathBuf> = vec![root.to_path_buf()];
+    while let Some(dir) = stack.pop() {
+        for e in std::fs::read_dir(&dir)?.flatten() {
+            let p = e.path();
+            if let Some(name) = p.file_name().and_then(|n| n.to_str()) {
+                if name == ".cas" { continue; }
+            }
+            if p.is_dir() {
+                stack.push(p);
+                continue;
+            }
+            if p.file_name().and_then(|n| n.to_str()) == Some("chunks.bin") {
+                if let Ok(bytes) = std::fs::read(&p) {
+                    if let Ok(m) = postcard::from_bytes::<crate::snapshot::ChunksManifest>(&bytes) {
+                        for h in m.referenced_hashes() {
+                            live.insert(*h);
+                        }
+                    }
+                }
+            }
+        }
+    }
+    Ok(live)
+}
+
+fn cmd_gc() -> Response {
+    let live = match collect_live_chunks() {
+        Ok(l) => l,
+        Err(e) => return Response::err(format!("gc: collect live: {}", e)),
+    };
+    let store = crate::chunk_store::ChunkStore::new("saves");
+    let total_before = store.total_size().unwrap_or(0);
+    match store.gc(&live) {
+        Ok((removed, bytes)) => {
+            // Drop now-empty shard dirs so saves/.cas stays tidy.
+ if let Ok(entries) = std::fs::read_dir(store.root()) { + for e in entries.flatten() { + let p = e.path(); + if p.is_dir() { + let empty = std::fs::read_dir(&p).map(|mut it| it.next().is_none()).unwrap_or(false); + if empty { + let _ = std::fs::remove_dir(&p); + } + } + } + } + Response::data(serde_json::json!({ + "live_chunks": live.len(), + "removed_chunks": removed, + "bytes_freed": bytes, + "bytes_before": total_before, + "bytes_after": total_before.saturating_sub(bytes), + })) + } + Err(e) => Response::err(format!("gc: {}", e)), + } +} + +/// Diff two snapshots: per-device state diffs, RAM chunk-level deltas, COW +/// overlay sector deltas. Heavy lifting reuses BinValue's `PartialEq` (toml +/// equality) for device state and ChunksManifest hashes for RAM/framebuffer +/// regions. +fn cmd_diff(args: &Value) -> Response { + let a = match args.get("a").and_then(|v| v.as_str()) { + Some(s) => s.to_string(), + None => return Response::err("diff: missing 'a' arg"), + }; + let b = match args.get("b").and_then(|v| v.as_str()) { + Some(s) => s.to_string(), + None => return Response::err("diff: missing 'b' arg"), + }; + if a.is_empty() || a.contains("..") || b.is_empty() || b.contains("..") { + return Response::err("diff: invalid name"); + } + + let dir_a = std::path::PathBuf::from("saves").join(&a); + let dir_b = std::path::PathBuf::from("saves").join(&b); + if !dir_a.is_dir() { + return Response::err(format!("diff: snapshot '{}' not found", a)); + } + if !dir_b.is_dir() { + return Response::err(format!("diff: snapshot '{}' not found", b)); + } + + let snap_a = crate::snapshot::Snapshot::new(&dir_a); + let snap_b = crate::snapshot::Snapshot::new(&dir_b); + + let sv_a = snap_a.read_manifest().ok().flatten().map(|m| m.schema_version).unwrap_or(0); + let sv_b = snap_b.read_manifest().ok().flatten().map(|m| m.schema_version).unwrap_or(0); + + // Per-device state. 
The device states compared here are the ones every
+    // configuration writes; rex3 is optional so it's handled separately.
+    let device_bases = [
+        "cpu", "mc", "ioc", "scc", "pit", "ps2", "rtc",
+        "eeprom", "scsi", "seeq", "hpc3",
+    ];
+    let mut devices_changed: Vec<&'static str> = Vec::new();
+    let mut devices_unchanged: Vec<&'static str> = Vec::new();
+    for &base in &device_bases {
+        let va = snap_a.read_state(base, sv_a).ok();
+        let vb = snap_b.read_state(base, sv_b).ok();
+        match (va, vb) {
+            (Some(va), Some(vb)) => {
+                if va == vb { devices_unchanged.push(base); }
+                else { devices_changed.push(base); }
+            }
+            _ => devices_changed.push(base),
+        }
+    }
+    // REX3 separately because it's optional.
+    let rex_a = snap_a.read_state("rex3", sv_a).ok();
+    let rex_b = snap_b.read_state("rex3", sv_b).ok();
+    let rex3_changed = match (rex_a, rex_b) {
+        (Some(va), Some(vb)) => Some(va != vb),
+        (None, None) => None,
+        _ => Some(true),
+    };
+
+    // RAM bank deltas via chunks.bin (v3+ only).
+    let mut bank_changed_chunks = [0u32; 4];
+    let mut bank_total_chunks = [0u32; 4];
+    let mut framebuffer_changed_chunks: Option<(u32, u32)> = None;
+    if sv_a >= 3 && sv_b >= 3 {
+        if let (Ok(ma), Ok(mb)) = (snap_a.read_chunks_manifest(), snap_b.read_chunks_manifest()) {
+            for i in 0..4 {
+                let ah = &ma.bank_chunks[i];
+                let bh = &mb.bank_chunks[i];
+                let n = ah.len().max(bh.len());
+                bank_total_chunks[i] = n as u32;
+                let mut changed = 0u32;
+                for k in 0..n {
+                    let av = ah.get(k);
+                    let bv = bh.get(k);
+                    if av != bv { changed += 1; }
+                }
+                bank_changed_chunks[i] = changed;
+            }
+            if let (Some((rgb_a, aux_a)), Some((rgb_b, aux_b))) =
+                (&ma.framebuffer_chunks, &mb.framebuffer_chunks)
+            {
+                let n = rgb_a.len().max(rgb_b.len()) + aux_a.len().max(aux_b.len());
+                let mut changed = 0u32;
+                for k in 0..rgb_a.len().max(rgb_b.len()) {
+                    if rgb_a.get(k) != rgb_b.get(k) { changed += 1; }
+                }
+                for k in 0..aux_a.len().max(aux_b.len()) {
+                    if aux_a.get(k) != aux_b.get(k) { changed += 1; }
+                }
+                
framebuffer_changed_chunks = Some((changed, n as u32));
+            }
+        }
+    }
+
+    // COW overlay sector deltas from cow.toml.
+    let cow_a = snap_a.read_toml("cow.toml").ok();
+    let cow_b = snap_b.read_toml("cow.toml").ok();
+    let mut cow_diff_per_id: Vec<(usize, u64, u64, u64)> = Vec::new(); // (id, only_a, only_b, both)
+    if let (Some(ca), Some(cb)) = (cow_a, cow_b) {
+        let mut ids: std::collections::BTreeSet<usize> = Default::default();
+        if let Some(t) = ca.as_table() {
+            for k in t.keys() {
+                if let Some(s) = k.strip_prefix("scsi") {
+                    if let Ok(n) = s.parse::<usize>() { ids.insert(n); }
+                }
+            }
+        }
+        if let Some(t) = cb.as_table() {
+            for k in t.keys() {
+                if let Some(s) = k.strip_prefix("scsi") {
+                    if let Ok(n) = s.parse::<usize>() { ids.insert(n); }
+                }
+            }
+        }
+        for id in ids {
+            let key = format!("scsi{}", id);
+            let set_a: std::collections::HashSet<u64> = ca.get(&key)
+                .and_then(|v| v.as_array())
+                .map(|arr| arr.iter().filter_map(|x| x.as_integer().map(|i| i as u64)).collect())
+                .unwrap_or_default();
+            let set_b: std::collections::HashSet<u64> = cb.get(&key)
+                .and_then(|v| v.as_array())
+                .map(|arr| arr.iter().filter_map(|x| x.as_integer().map(|i| i as u64)).collect())
+                .unwrap_or_default();
+            let only_a = set_a.difference(&set_b).count() as u64;
+            let only_b = set_b.difference(&set_a).count() as u64;
+            let both = set_a.intersection(&set_b).count() as u64;
+            cow_diff_per_id.push((id, only_a, only_b, both));
+        }
+    }
+
+    Response::data(serde_json::json!({
+        "a": a,
+        "b": b,
+        "schema_a": sv_a,
+        "schema_b": sv_b,
+        "devices_changed": devices_changed,
+        "devices_unchanged": devices_unchanged,
+        "rex3_changed": rex3_changed,
+        "bank_changed_chunks": bank_changed_chunks,
+        "bank_total_chunks": bank_total_chunks,
+        "framebuffer_changed_chunks": framebuffer_changed_chunks,
+        "cow_diff": cow_diff_per_id.into_iter().map(|(id, only_a, only_b, both)| {
+            serde_json::json!({"scsi_id": id, "only_a": only_a, "only_b": only_b, "both": both})
+        }).collect::<Vec<_>>(),
+    }))
+}
+
+/// Walk every snapshot under 
`saves/`, build a parent → children map, render
+/// indented tree text. Snapshots without a parent (or with a parent that
+/// doesn't exist locally) hang off a synthetic `(none)` root.
+fn cmd_tree() -> Response {
+    use std::collections::BTreeMap;
+    let root = std::path::Path::new("saves");
+    if !root.is_dir() {
+        return Response::data(serde_json::json!({"tree": "(no saves directory)"}));
+    }
+
+    // (name, parent) for each snapshot.
+    let mut entries: Vec<(String, Option<String>)> = Vec::new();
+    let mut stack: Vec<std::path::PathBuf> = vec![root.to_path_buf()];
+    while let Some(dir) = stack.pop() {
+        let Ok(it) = std::fs::read_dir(&dir) else { continue };
+        let mut subdirs: Vec<std::path::PathBuf> = Vec::new();
+        let mut found_manifest = false;
+        let mut found_legacy_cpu = false;
+        for e in it.flatten() {
+            let p = e.path();
+            if let Some(n) = p.file_name().and_then(|n| n.to_str()) {
+                if n == ".cas" { continue; }
+            }
+            if p.is_dir() {
+                subdirs.push(p);
+            } else if let Some(name) = p.file_name().and_then(|n| n.to_str()) {
+                if name == "snapshot.toml" { found_manifest = true; }
+                if name == "cpu.toml" { found_legacy_cpu = true; }
+            }
+        }
+        if found_manifest || found_legacy_cpu {
+            if let Ok(rel) = dir.strip_prefix(root) {
+                let display_name = rel.to_string_lossy().replace('\\', "/");
+                if !display_name.is_empty() {
+                    let snap = crate::snapshot::Snapshot::new(&dir);
+                    let parent = snap.read_manifest().ok().flatten().and_then(|m| m.parent);
+                    entries.push((display_name, parent));
+                }
+            }
+        }
+        for s in subdirs { stack.push(s); }
+    }
+
+    // Build parent → children map (None parent → top-level).
+    let mut by_parent: BTreeMap<Option<String>, Vec<String>> = BTreeMap::new();
+    let names: std::collections::HashSet<String> = entries.iter().map(|(n, _)| n.clone()).collect();
+    for (name, parent) in &entries {
+        let key = match parent {
+            Some(p) if names.contains(p) => Some(p.clone()),
+            _ => None,
+        };
+        by_parent.entry(key).or_default().push(name.clone());
+    }
+    for v in by_parent.values_mut() { v.sort(); }
+
+    fn render(out: &mut String, by_parent: &BTreeMap<Option<String>, Vec<String>>, parent: Option<&str>, depth: usize) {
+        let key = parent.map(String::from);
+        if let Some(children) = by_parent.get(&key) {
+            for child in children {
+                for _ in 0..depth { out.push_str("  "); }
+                out.push_str("- ");
+                out.push_str(child);
+                out.push('\n');
+                render(out, by_parent, Some(child), depth + 1);
+            }
+        }
+    }
+    let mut text = String::new();
+    render(&mut text, &by_parent, None, 0);
+    if text.is_empty() { text.push_str("(no snapshots)\n"); }
+
+    Response::data(serde_json::json!({
+        "snapshots": entries.iter().map(|(n, p)| {
+            serde_json::json!({"name": n, "parent": p})
+        }).collect::<Vec<_>>(),
+        "tree": text.trim_end_matches('\n').to_string(),
+    }))
+}
+
+// ----------------------------------------------------------------------------
+// HTTP snapshot registry (Phase 3.4)
+// ----------------------------------------------------------------------------
+
+fn cmd_pull(args: &Value) -> Response {
+    let url = match args.get("url").and_then(|v| v.as_str()) {
+        Some(s) => s.to_string(),
+        None => return Response::err("pull: missing 'url' arg"),
+    };
+    let name = match args.get("name").and_then(|v| v.as_str()) {
+        Some(s) => s.to_string(),
+        None => return Response::err("pull: missing 'name' arg"),
+    };
+    let saves = std::path::PathBuf::from("saves");
+    if !saves.is_dir() {
+        if let Err(e) = std::fs::create_dir_all(&saves) {
+            return Response::err(format!("pull: create saves/: {}", e));
+        }
+    }
+    match crate::registry::pull(&url, &name, &saves) {
+        Ok(report) => Response::data(serde_json::json!({
+            "name": name,
+            "url": url,
+            
"chunks_fetched": report.chunks_fetched,
+            "chunks_skipped": report.chunks_skipped,
+            "files_transferred": report.files_transferred,
+            "bytes_transferred": report.bytes_transferred,
+        })),
+        Err(e) => Response::err(format!("pull: {}", e)),
+    }
+}
+
+fn cmd_push(args: &Value) -> Response {
+    let url = match args.get("url").and_then(|v| v.as_str()) {
+        Some(s) => s.to_string(),
+        None => return Response::err("push: missing 'url' arg"),
+    };
+    let name = match args.get("name").and_then(|v| v.as_str()) {
+        Some(s) => s.to_string(),
+        None => return Response::err("push: missing 'name' arg"),
+    };
+    match crate::registry::push(&url, &name, std::path::Path::new("saves")) {
+        Ok(report) => Response::data(serde_json::json!({
+            "name": name,
+            "url": url,
+            // Pull and push share one transfer-report struct, so
+            // `chunks_fetched` counts uploads here.
+            "chunks_uploaded": report.chunks_fetched,
+            "chunks_skipped": report.chunks_skipped,
+            "files_transferred": report.files_transferred,
+            "bytes_transferred": report.bytes_transferred,
+        })),
+        Err(e) => Response::err(format!("push: {}", e)),
+    }
+}
diff --git a/src/config.rs b/src/config.rs
index f775f80..09e9994 100644
--- a/src/config.rs
+++ b/src/config.rs
@@ -18,6 +18,19 @@ pub struct ScsiDeviceConfig {
     /// `{path}.overlay`. Delete the overlay file to reset to clean state.
     #[serde(default)]
     pub overlay: bool,
+    /// Scratch volume: a host-controlled raw block device used for file
+    /// injection/extraction without networking. iris auto-creates a zero-filled
+    /// file at `path` if it doesn't exist (size = `size_mb`, default 64). The
+    /// CI socket exposes scratch-write/read/clear/info to mutate it from the
+    /// host side. No filesystem is imposed: callers can write a tar stream and
+    /// the guest reads it with `dd if=/dev/rdsk/dks0dNvol | tar xf -`.
+    /// Implies !cdrom && !overlay (the volume must be host-writable directly).
+    #[serde(default)]
+    pub scratch: bool,
+    /// Size in MB for an auto-created scratch volume. Ignored when the file
+    /// already exists or `scratch=false`.
+    #[serde(default)]
+    pub size_mb: Option<u64>,
 }
 
 /// Protocol for port forwarding.
@@ -115,8 +128,24 @@ pub struct MachineConfig {
     /// If Some(port), start the GDB RSP stub on that TCP port.
     #[serde(default)]
     pub gdb_port: Option<u16>,
+
+    /// CI mode: opens a control socket for automation, applies speed-favoring
+    /// fidelity shortcuts. The host window is skipped unless ci_display is
+    /// also set.
+    #[serde(default)]
+    pub ci: bool,
+
+    /// Unix socket path for CI control. Used only when `ci` is true.
+    #[serde(default = "default_ci_socket")]
+    pub ci_socket: String,
+
+    /// With `ci`, keep the Newport window visible (deferred rendering) for
+    /// interactive test development.
+    #[serde(default)]
+    pub ci_display: bool,
 }
 
+fn default_ci_socket() -> String { "/tmp/iris.sock".to_string() }
+
 fn default_prom() -> String {
     "prom.bin".to_string()
 }
@@ -134,12 +163,16 @@ fn default_scsi() -> std::collections::HashMap<usize, ScsiDeviceConfig> {
         discs: vec![],
         cdrom: false,
         overlay: false,
+        scratch: false,
+        size_mb: None,
     });
     map.insert(4, ScsiDeviceConfig {
         path: "cdrom4.iso".to_string(),
         discs: vec![],
         cdrom: true,
         overlay: false,
+        scratch: false,
+        size_mb: None,
     });
     map
 }
@@ -156,6 +189,9 @@ impl Default for MachineConfig {
             headless: false,
             no_audio: false,
             gdb_port: None,
+            ci: false,
+            ci_socket: default_ci_socket(),
+            ci_display: false,
         }
     }
 }
@@ -178,8 +214,8 @@ impl MachineConfig {
     /// Validate bank sizes, returns a description of any errors.
pub fn validate(&self) -> Result<(), String> {
-        if self.scale != 1 && self.scale != 2 {
-            return Err(format!("scale {} is invalid (valid: 1, 2)", self.scale));
+        if self.scale < 1 || self.scale > 4 {
+            return Err(format!("scale {} is invalid (valid: 1, 2, 3, 4)", self.scale));
         }
         for (i, &sz) in self.banks.iter().enumerate() {
             if !VALID_BANK_SIZES.contains(&sz) {
@@ -310,6 +346,20 @@ pub struct Cli {
     /// Connect with: target remote localhost:<port>
     #[arg(long = "gdb-port", value_name = "PORT")]
     pub gdb_port: Option<u16>,
+
+    /// CI mode: enable the control socket and apply speed-favoring fidelity
+    /// shortcuts. Implies --headless unless --ci-display is also set.
+    #[arg(long, default_value_t = false)]
+    pub ci: bool,
+
+    /// Override the default control-socket path (/tmp/iris.sock).
+    #[arg(long = "ci-socket", value_name = "PATH")]
+    pub ci_socket: Option<String>,
+
+    /// With --ci, keep the Newport window visible for interactive test
+    /// development (deferred rendering at 10–15 fps).
+    #[arg(long = "ci-display", default_value_t = false)]
+    pub ci_display: bool,
 }
 
 impl Cli {
@@ -329,6 +379,8 @@ impl Cli {
             discs: vec![],
             cdrom,
             overlay: false,
+            scratch: false,
+            size_mb: None,
         });
         entry.path = path;
         entry.cdrom = cdrom;
@@ -349,6 +401,12 @@ impl Cli {
         if self.headless { cfg.headless = true; }
         if self.no_audio { cfg.no_audio = true; }
 
+        if self.ci { cfg.ci = true; }
+        if let Some(p) = &self.ci_socket { cfg.ci_socket = p.clone(); }
+        if self.ci_display { cfg.ci_display = true; }
+        // NB: --ci does NOT imply --headless. REX3 stays alive so screenshots
+        // work; main.rs simply skips the host window when ci && !ci_display.
+
         // NFS: --nfs-dir enables NFS; other flags refine an existing [nfs] section or the defaults.
if let Some(dir) = &self.nfs_dir { let base = cfg.nfs.get_or_insert_with(|| NfsConfig { diff --git a/src/cow_disk.rs b/src/cow_disk.rs index 1a2e89c..16cdd59 100644 --- a/src/cow_disk.rs +++ b/src/cow_disk.rs @@ -7,9 +7,95 @@ use std::collections::HashSet; use std::fs::{File, OpenOptions}; use std::io::{self, Read, Seek, SeekFrom, Write}; +use std::path::{Path, PathBuf}; const SECTOR_SIZE: u64 = 512; +/// Clone `src` to `dst` via filesystem-level CoW (APFS clonefile, Linux +/// FICLONE) when supported; fall back to a regular byte copy otherwise. On a +/// reflink-capable filesystem this is metadata-only — sub-millisecond for any +/// size — which makes per-snapshot overlay capture essentially free. +fn reflink_or_copy(src: &Path, dst: &Path) -> io::Result<()> { + let _ = std::fs::remove_file(dst); + if try_reflink(src, dst).is_ok() { + return Ok(()); + } + std::fs::copy(src, dst).map(|_| ()) +} + +#[cfg(target_os = "macos")] +fn try_reflink(src: &Path, dst: &Path) -> io::Result<()> { + use std::ffi::CString; + use std::os::unix::ffi::OsStrExt; + let src_c = CString::new(src.as_os_str().as_bytes()) + .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?; + let dst_c = CString::new(dst.as_os_str().as_bytes()) + .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?; + let rc = unsafe { libc::clonefile(src_c.as_ptr(), dst_c.as_ptr(), 0) }; + if rc == 0 { Ok(()) } else { Err(io::Error::last_os_error()) } +} + +#[cfg(target_os = "linux")] +fn try_reflink(src: &Path, dst: &Path) -> io::Result<()> { + use std::os::unix::io::AsRawFd; + // FICLONE = _IOW(0x94, 9, int); see linux/fs.h. 
+    const FICLONE: libc::c_ulong = 0x40049409;
+    let src_f = File::open(src)?;
+    let dst_f = OpenOptions::new().write(true).create(true).truncate(true).open(dst)?;
+    let rc = unsafe { libc::ioctl(dst_f.as_raw_fd(), FICLONE, src_f.as_raw_fd()) };
+    if rc == 0 {
+        Ok(())
+    } else {
+        let err = io::Error::last_os_error();
+        let _ = std::fs::remove_file(dst);
+        Err(err)
+    }
+}
+
+#[cfg(not(any(target_os = "macos", target_os = "linux")))]
+fn try_reflink(_src: &Path, _dst: &Path) -> io::Result<()> {
+    Err(io::Error::new(io::ErrorKind::Unsupported, "reflink not supported on this OS"))
+}
+
+/// Sidecar file holding the dirty sector list. Written next to the overlay
+/// (e.g. `foo.overlay.dirty`). Format: binary, `u64` little-endian count
+/// followed by that many `u64` sector LBAs, also LE. Compact enough that
+/// flushing it on shutdown or on a periodic schedule is cheap.
+fn dirty_sidecar_path(overlay_path: &str) -> PathBuf {
+    PathBuf::from(format!("{}.dirty", overlay_path))
+}
+
+fn load_dirty_sidecar(path: &Path) -> io::Result<HashSet<u64>> {
+    if !path.exists() { return Ok(HashSet::new()); }
+    let mut f = File::open(path)?;
+    let mut count_buf = [0u8; 8];
+    if f.read_exact(&mut count_buf).is_err() { return Ok(HashSet::new()); }
+    let count = u64::from_le_bytes(count_buf) as usize;
+    let mut set = HashSet::with_capacity(count);
+    let mut buf = [0u8; 8];
+    for _ in 0..count {
+        if f.read_exact(&mut buf).is_err() { break; }
+        set.insert(u64::from_le_bytes(buf));
+    }
+    Ok(set)
+}
+
+fn save_dirty_sidecar(path: &Path, dirty: &HashSet<u64>) -> io::Result<()> {
+    // Write atomically: write to a temp file then rename.
+ let tmp = path.with_extension("dirty.tmp"); + { + let mut f = File::create(&tmp)?; + let count = dirty.len() as u64; + f.write_all(&count.to_le_bytes())?; + for &s in dirty { + f.write_all(&s.to_le_bytes())?; + } + f.sync_all()?; + } + std::fs::rename(&tmp, path)?; + Ok(()) +} + pub struct CowDisk { base: File, overlay: File, @@ -32,16 +118,23 @@ impl CowDisk { .create(true) .open(overlay_path)?; - // Rebuild the dirty set from the overlay file size. - // The overlay is a sparse file with the same layout as the base. - // Any sector that has been written occupies space, but we can't easily - // detect sparse holes portably. Instead, track dirty sectors in memory - // and accept that a fresh start after crash loses the dirty set - // (overlay is deleted on state load anyway). - let dirty = HashSet::new(); + // Recover the dirty set from our sidecar file (written on flush / + // shutdown by previous runs). If the sidecar is missing we start + // empty — any prior writes in the overlay file are effectively + // invisible until the sidecar gets written. This is deliberate: + // "dirty" means "the host finished writing this sector," not "the + // file has some bytes here" (sparse allocation can contain partial + // writes from an interrupted run, which can't be trusted). 
+ let sidecar = dirty_sidecar_path(overlay_path); + let dirty = load_dirty_sidecar(&sidecar).unwrap_or_default(); - eprintln!("iris: COW overlay active (base: {}, overlay: {})", base_path, overlay_path); - eprintln!("iris: to reset disk to clean state, delete {}", overlay_path); + eprintln!("iris: COW overlay active (base: {}, overlay: {}, dirty sectors: {})", + base_path, overlay_path, dirty.len()); + if dirty.is_empty() && std::fs::metadata(overlay_path).map(|m| m.len()).unwrap_or(0) > 0 { + eprintln!("iris: note: overlay file has data but no .dirty sidecar — prior writes are not in use"); + } + eprintln!("iris: to reset disk to clean state, delete {} and {}", + overlay_path, sidecar.display()); Ok(Self { base, @@ -152,11 +245,107 @@ impl CowDisk { self.dirty.clear(); self.overlay.set_len(0)?; self.overlay.seek(SeekFrom::Start(0))?; + // Also clear the sidecar so we don't "remember" sectors that no + // longer exist after the truncation. + let _ = std::fs::remove_file(dirty_sidecar_path(&self.overlay_path)); Ok(()) } + /// Flush the overlay file's data and persist the dirty sector set to + /// the sidecar. Call this on clean shutdown or before snapshot save so + /// a subsequent run can read back what we wrote. + pub fn flush(&mut self) -> io::Result<()> { + self.overlay.sync_all()?; + save_dirty_sidecar(&dirty_sidecar_path(&self.overlay_path), &self.dirty) + } + /// Number of dirty sectors in the overlay. pub fn dirty_count(&self) -> usize { self.dirty.len() } + + /// Copy the current overlay file to `dest` and return the dirty sector + /// list (sorted, ascending). Used by snapshot save so the entire disk + /// state — base + overlay — is captured consistently with RAM. 
+    pub fn export_overlay(&mut self, dest: &Path) -> io::Result<Vec<u64>> {
+        self.overlay.sync_all()?;
+        reflink_or_copy(Path::new(&self.overlay_path), dest)?;
+        let mut dirty: Vec<u64> = self.dirty.iter().copied().collect();
+        dirty.sort_unstable();
+        Ok(dirty)
+    }
+
+    /// Replace the overlay contents with `source` and adopt `dirty` as the
+    /// dirty sector set. Used by snapshot load. If `source` doesn't exist
+    /// the overlay is truncated instead (matches `reset_overlay` behavior —
+    /// handles old snapshots without overlay data).
+    pub fn import_overlay(&mut self, source: &Path, dirty: Vec<u64>) -> io::Result<()> {
+        if source.exists() {
+            reflink_or_copy(source, Path::new(&self.overlay_path))?;
+        } else {
+            // Clear the overlay: nothing saved for this device.
+            std::fs::File::create(&self.overlay_path)?;
+        }
+        // Reopen the file handle — the previous File object points at the
+        // old inode (which std::fs::copy replaced on some platforms).
+        self.overlay = OpenOptions::new()
+            .read(true)
+            .write(true)
+            .create(true)
+            .open(&self.overlay_path)?;
+        self.dirty = dirty.into_iter().collect();
+        Ok(())
+    }
+}
+
+impl Drop for CowDisk {
+    fn drop(&mut self) {
+        if let Err(e) = self.flush() {
+            eprintln!("iris: COW flush on drop failed for {}: {} (writes may be lost)",
+                self.overlay_path, e);
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use std::io::Write as _;
+
+    fn unique_tmp(tag: &str, ext: &str) -> PathBuf {
+        let nanos = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .map(|d| d.as_nanos())
+            .unwrap_or(0);
+        std::env::temp_dir().join(format!("iris-cow-{}-{}.{}", tag, nanos, ext))
+    }
+
+    #[test]
+    fn reflink_or_copy_preserves_bytes() {
+        let src = unique_tmp("reflink-src", "bin");
+        let dst = unique_tmp("reflink-dst", "bin");
+        let payload: Vec<u8> = (0u8..=255).cycle().take(64 * 1024 + 17).collect();
+        {
+            let mut f = File::create(&src).unwrap();
+            f.write_all(&payload).unwrap();
+            f.sync_all().unwrap();
+        }
+        reflink_or_copy(&src, 
&dst).expect("reflink_or_copy"); + let read_back = std::fs::read(&dst).unwrap(); + assert_eq!(read_back, payload); + let _ = std::fs::remove_file(&src); + let _ = std::fs::remove_file(&dst); + } + + #[test] + fn reflink_or_copy_overwrites_existing_dst() { + let src = unique_tmp("reflink-src2", "bin"); + let dst = unique_tmp("reflink-dst2", "bin"); + std::fs::write(&src, b"new content").unwrap(); + std::fs::write(&dst, b"old content that is longer than the new one").unwrap(); + reflink_or_copy(&src, &dst).expect("overwrite"); + assert_eq!(std::fs::read(&dst).unwrap(), b"new content"); + let _ = std::fs::remove_file(&src); + let _ = std::fs::remove_file(&dst); + } } diff --git a/src/ds1x86.rs b/src/ds1x86.rs index 0acd115..c22f619 100644 --- a/src/ds1x86.rs +++ b/src/ds1x86.rs @@ -394,3 +394,39 @@ impl Saveable for Ds1x86 { Ok(()) } } + +#[cfg(test)] +mod tests { + use super::*; + + /// Phase 1.7 round-trip: a fresh RTC loaded from a captured save_state must + /// re-serialize byte-identically. Save_state flushes the live host clock + /// into regs when TE is set, so we clear TE first to make the test stable. + #[test] + fn save_load_round_trip() { + let src = Ds1x86::new(8192); + // Disable transfer-enable so save_state doesn't tick the clock between + // calls; mutate a few NVRAM bytes outside the time-keeping registers. + { + let mut d = src.data.lock(); + d.regs[CMD_REG_OFFSET] &= !TE_BIT; + d.regs[64] = 0xa5; + d.regs[1024] = 0x5a; + d.regs[8190] = 0xff; + } + let v1 = src.save_state(); + + let dst = Ds1x86::new(8192); + dst.load_state(&v1).expect("load_state"); + // Same: clear TE on dst before re-serializing so its save_state path + // matches src's behavior. (load_state preserves the TE bit from v1, so + // it should already be cleared, but be defensive.) 
+        {
+            let mut d = dst.data.lock();
+            d.regs[CMD_REG_OFFSET] &= !TE_BIT;
+        }
+        let v2 = dst.save_state();
+
+        assert_eq!(v1, v2, "Ds1x86 save_state mismatch after load_state round-trip");
+    }
+}
diff --git a/src/eeprom_93c56.rs b/src/eeprom_93c56.rs
index 4aa4ad0..011b701 100644
--- a/src/eeprom_93c56.rs
+++ b/src/eeprom_93c56.rs
@@ -387,4 +387,25 @@ mod tests {
 
         assert_eq!(data, 0xABCD);
     }
+
+    /// Phase 1.7 round-trip: a fresh Eeprom loaded from a captured save_state
+    /// must re-serialize byte-identically. Catches load_state_mut forgetting a
+    /// field that save_state writes.
+    #[test]
+    fn save_load_round_trip() {
+        // Mutate a few words so we're not testing the all-default 0xFFFF value.
+        let mut src = Eeprom93c56::new();
+        src.write_enable = true;
+        src.data[0] = 0xdead;
+        src.data[42] = 0xbeef;
+        src.data[64] = 0x1234;
+        src.data[127] = 0xcafe;
+        let v1 = src.save_state_owned();
+
+        let mut dst = Eeprom93c56::new();
+        dst.load_state_mut(&v1).expect("load_state_mut");
+        let v2 = dst.save_state_owned();
+
+        assert_eq!(v1, v2, "EEPROM save_state mismatch after load_state round-trip");
+    }
 }
\ No newline at end of file
diff --git a/src/hpc3.rs b/src/hpc3.rs
index 5e8bc69..f1b4f5c 100644
--- a/src/hpc3.rs
+++ b/src/hpc3.rs
@@ -1067,7 +1067,15 @@ impl Hpc3 {
     }
 
     pub fn add_scsi_device(&self, id: usize, path: &str, is_cdrom: bool, discs: Vec<String>, overlay: bool) -> std::io::Result<()> {
-        self.scsi_dev.add_device(id, path, is_cdrom, discs, overlay)
+        self.scsi_dev.add_device(id, path, is_cdrom, discs, overlay, None)
+    }
+
+    /// Same as `add_scsi_device` but lets the caller specify where the COW
+    /// overlay file lives. Used by `--ci` mode to keep per-process overlays
+    /// in `/tmp` so parallel `--ci` instances (and an interactive session)
+    /// don't race on the same file.
+    pub fn add_scsi_device_with_overlay(&self, id: usize, path: &str, is_cdrom: bool, discs: Vec<String>, overlay: bool, overlay_path: &str) -> std::io::Result<()> {
+        self.scsi_dev.add_device(id, path, is_cdrom, discs, overlay, Some(overlay_path))
+    }
 
     pub fn ioc(&self) -> &Ioc {
diff --git a/src/ioc.rs b/src/ioc.rs
index a906901..6d24a00 100644
--- a/src/ioc.rs
+++ b/src/ioc.rs
@@ -199,6 +199,18 @@ pub struct Ioc {
 
 impl Ioc {
     pub fn new(guinness: bool) -> Self {
+        Self::new_inner(guinness, false)
+    }
+
+    /// CI-mode constructor: skips TCP serial backend binding on SCC channels
+    /// so multiple instances can run in parallel without port conflicts.
+    /// Caller must install backends via `scc().set_backend_{a,b}` before the
+    /// first `start()`.
+    pub fn new_ci(guinness: bool) -> Self {
+        Self::new_inner(guinness, true)
+    }
+
+    fn new_inner(guinness: bool, ci_mode: bool) -> Self {
         let sys_id = if guinness { 0x26 } else { 0x11 }; // primarily prom looks at bit 1 to detect full house.
         let state = Arc::new(Mutex::new(IocState {
             sys_id,
@@ -241,9 +253,15 @@ impl Ioc {
             source: IocInterrupt::KbMouse,
         });
 
+        let scc = if ci_mode {
+            Z85c30::new_null(Some(serial_irq))
+        } else {
+            Z85c30::new(Some(serial_irq))
+        };
+
         Self {
             state,
-            scc: Z85c30::new(Some(serial_irq)),
+            scc,
             pit: Pit8254::new(1_000_000, Some(timer0_cb), Some(timer1_cb), None),
             ps2: Arc::new(Ps2Controller::new(Some(ps2_cb))),
             guinness,
@@ -673,3 +691,41 @@ impl Saveable for Ioc {
         Ok(())
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    /// Phase 1.7 round-trip: a fresh IOC loaded from a captured save_state must
+    /// re-serialize byte-identically. Catches load_state forgetting any of the
+    /// 16 register fields that save_state writes.
+    #[test]
+    fn save_load_round_trip() {
+        // new_ci uses null serial backends — avoids TCP port binding under
+        // concurrent test runs.
+ let src = Ioc::new_ci(true); + { + let mut s = src.state.lock(); + s.l0_stat = 0x12; s.l0_mask = 0x34; + s.l1_stat = 0x56; s.l1_mask = 0x78; + s.map_stat = 0x9a; s.map_mask0 = 0xbc; s.map_mask1 = 0xde; + s.map_pol = 0xf0; s.err_stat = 0x01; + s.gc_select = 0x0f; s.gen_cntl = 0xa5; s.panel = 0x5a; + s.read_reg = 0xff; s.dma_sel = 0x33; + s.reset_reg = 0x77; s.write_reg = 0xee; + // load_state re-runs update_interrupts, so the saved snapshot must + // already reflect the cascade-derived bits (MAP_INT0/MAP_INT1) for + // v1 to round-trip cleanly. In a real save these are always + // up-to-date because the bus driver runs update_interrupts on + // every register write. + s.update_interrupts(); + } + let v1 = src.save_state(); + + let dst = Ioc::new_ci(true); + dst.load_state(&v1).expect("load_state"); + let v2 = dst.save_state(); + + assert_eq!(v1, v2, "Ioc save_state mismatch after load_state round-trip"); + } +} diff --git a/src/iris_ci_main.rs b/src/iris_ci_main.rs new file mode 100644 index 0000000..6720e05 --- /dev/null +++ b/src/iris_ci_main.rs @@ -0,0 +1,906 @@ +//! `iris-ci` — ergonomic wrapper around the iris CI control socket. +//! +//! Replaces the raw `printf '...' | nc -U /tmp/iris.sock` pattern that's +//! awkward to type, brittle to quote, and tedious to compose. Every +//! socket-level operation gets a typed clap subcommand, plus macros for +//! the recurring multi-step rituals (boot, login, run, put/get). +//! +//! The headline ergonomic wins: +//! - `iris-ci boot` does the full PROM-menu-to-login dance in one command. +//! - `iris-ci run "ls /tmp"` sends the command, waits for the prompt, and +//! returns just the captured stdout + exit status. +//! - `iris-ci put localfile.tar` copies a host file into the guest with +//! the right `dd bs=512 count=N` recipe baked in — no foot-gun. +//! - `iris-ci script tests/scenario.iris` runs a sequence of commands +//! and prints per-step status + duration. +//! +//! 
Returns 0 on success, 1 on socket error, 2 on iris error response,
+//! 3 on local error (file not found etc).
+
+use clap::{Parser, Subcommand};
+use serde_json::{json, Value};
+use std::io::{BufRead, BufReader, Write};
+use std::net::Shutdown;
+use std::os::unix::net::UnixStream;
+use std::path::PathBuf;
+use std::time::{Duration, Instant};
+
+const DEFAULT_SOCKET: &str = "/tmp/iris.sock";
+const PROMPT_RE: &str = "IRIS"; // Match "IRIS N# " — N is a counter that increments
+const RC_MARKER: &str = "IRIS-CI-RC=";
+
+#[derive(Parser, Debug)]
+#[command(
+    name = "iris-ci",
+    about = "Drive the iris CI control socket without raw nc + JSON.",
+    version
+)]
+struct Cli {
+    /// Path to the iris CI Unix socket. Override with $IRIS_SOCKET.
+    #[arg(long, global = true)]
+    socket: Option<PathBuf>,
+
+    /// Print raw JSON responses instead of pretty output.
+    #[arg(long, global = true)]
+    json: bool,
+
+    /// Be silent on success (for use in scripts).
+    #[arg(long, short = 'q', global = true)]
+    quiet: bool,
+
+    #[command(subcommand)]
+    cmd: Cmd,
+}
+
+#[derive(Subcommand, Debug)]
+enum Cmd {
+    /// Liveness check — returns "ok" if the socket is reachable.
+    Ping,
+    /// Start the CPU thread (no-op if already running).
+    Start,
+    /// Cleanly shut down iris.
+    Quit,
+
+    /// Save the current machine state to saves/<name>/.
+    Save {
+        name: String,
+        #[arg(long)]
+        description: Option<String>,
+    },
+    /// Disk-backed full restore (~145 ms cold, sets up rollback checkpoint).
+    Restore { name: String },
+    /// In-memory rewind to the last `restore` checkpoint (~40 ms).
+    Rollback,
+    /// List all saved snapshots.
+    List,
+    /// Show metadata (manifest, schema_version, size) for one snapshot.
+    Info { name: String },
+    /// Delete a snapshot (does NOT free CAS chunks; run `gc` after).
+    Delete { name: String },
+    /// Render the snapshot parent-chain tree.
+    Tree,
+    /// Compare two snapshots: device + RAM-chunk + COW-sector deltas.
+    Diff { a: String, b: String },
+    /// Sweep CAS chunks not referenced by any kept snapshot.
+    Gc,
+    /// Run the snapshot determinism validator.
+    Validate {
+        name: String,
+        /// Number of instructions to step in each pass (default 1_000_000).
+        #[arg(short = 'n', long, default_value_t = 1_000_000)]
+        n: u64,
+    },
+    /// Save the REX3 framebuffer to a PNG.
+    Screenshot { path: PathBuf },
+
+    /// Send keystrokes to the IRIX serial console.
+    SerialSend {
+        text: String,
+        /// Don't append \r to the text.
+        #[arg(long)]
+        no_cr: bool,
+    },
+    /// Drain the serial output buffer and print it.
+    SerialRead,
+    /// Wait until `pattern` appears in serial output (or timeout).
+    SerialWait {
+        pattern: String,
+        #[arg(long, default_value_t = 30)]
+        timeout: u64,
+    },
+
+    /// Boot from PROM menu through to the IRIS console login prompt.
+    Boot {
+        /// Total timeout in seconds for boot to reach the login prompt.
+        #[arg(long, default_value_t = 240)]
+        timeout: u64,
+    },
+    /// Send root login + dismiss the vt100 prompt + wait for the shell.
+    Login {
+        #[arg(default_value = "root")]
+        user: String,
+        /// Optional password (most IRIX root accounts have none).
+        #[arg(long)]
+        password: Option<String>,
+    },
+    /// Send a shell command, wait for the prompt, return stdout + exit code.
+    Run {
+        command: String,
+        /// Guest shell. csh uses $status; sh uses $?.
+        #[arg(long, default_value = "csh")]
+        shell: String,
+        #[arg(long, default_value_t = 60)]
+        timeout: u64,
+    },
+    /// Drain output and wait for the next shell prompt.
+    WaitPrompt {
+        #[arg(long, default_value_t = 30)]
+        timeout: u64,
+    },
+
+    /// Copy a host file into the guest. Handles bs=512 + count automatically.
+    Put {
+        host_path: PathBuf,
+        /// Where to put it inside IRIX. Defaults to /tmp/.
+        #[arg(long)]
+        to: Option<String>,
+        #[arg(long, default_value_t = 120)]
+        timeout: u64,
+    },
+    /// Pull a guest file out to the host. Handles tar + scratch round-trip.
+    Get {
+        guest_path: String,
+        /// Where to write on the host. 
Defaults to ./.
+        #[arg(long)]
+        to: Option<PathBuf>,
+        #[arg(long, default_value_t = 120)]
+        timeout: u64,
+    },
+
+    /// Raw scratch volume operations (bypass guest interaction).
+    #[command(subcommand)]
+    Scratch(ScratchCmd),
+
+    /// Pull a snapshot from a remote registry (e.g. `http://localhost:8765`).
+    Pull { url: String, name: String },
+    /// Push a snapshot to a remote registry.
+    Push { url: String, name: String },
+
+    /// Run a sequence of iris-ci commands from a file (one per line, # comments).
+    Script { path: PathBuf },
+}
+
+#[derive(Subcommand, Debug)]
+enum ScratchCmd {
+    /// Copy raw bytes from a host file into the scratch payload area.
+    Write {
+        path: PathBuf,
+        #[arg(long, default_value_t = 0)]
+        offset: u64,
+    },
+    /// Copy raw bytes from the scratch payload area into a host file.
+    Read {
+        path: PathBuf,
+        #[arg(long, default_value_t = 0)]
+        offset: u64,
+        #[arg(long)]
+        length: Option<u64>,
+    },
+    /// Zero the scratch payload area (preserves the SGI VH at sector 0).
+    Clear,
+    /// Show scratch volume size + payload offset.
+    Info,
+}
+
+// ---- main / dispatch ---------------------------------------------------------
+
+fn main() {
+    let cli = Cli::parse();
+    let socket = cli
+        .socket
+        .clone()
+        .or_else(|| std::env::var_os("IRIS_SOCKET").map(PathBuf::from))
+        .unwrap_or_else(|| PathBuf::from(DEFAULT_SOCKET));
+    let opts = Opts {
+        socket,
+        json: cli.json,
+        quiet: cli.quiet,
+    };
+
+    let exit = match dispatch(&opts, cli.cmd) {
+        Ok(()) => 0,
+        Err(Error::Local(e)) => {
+            eprintln!("iris-ci: {}", e);
+            3
+        }
+        Err(Error::Connection(e)) => {
+            eprintln!("iris-ci: connect {}: {}", opts.socket.display(), e);
+            1
+        }
+        Err(Error::Iris(e)) => {
+            eprintln!("iris-ci: iris error: {}", e);
+            2
+        }
+    };
+    std::process::exit(exit);
+}
+
+fn dispatch(opts: &Opts, cmd: Cmd) -> Result<()> {
+    match cmd {
+        Cmd::Ping => simple(opts, "ping", json!({}), "ok"),
+        Cmd::Start => simple(opts, "start", json!({}), "started"),
+        Cmd::Quit => simple(opts, "quit", json!({}), "quit"),
+        Cmd::Save { name, description } => {
+            // Build the status message before `name` is moved into the args.
+            let msg = format!("saved: {}", name);
+            let mut args = json!({"name": name});
+            if let Some(d) = description { args["description"] = Value::String(d); }
+            simple(opts, "save", args, &msg)
+        }
+        Cmd::Restore { name } => simple(opts, "restore", json!({"name": name}), "restored"),
+        Cmd::Rollback => simple(opts, "rollback", json!({}), "rolled back"),
+        Cmd::List => cmd_list(opts),
+        Cmd::Info { name } => cmd_info(opts, &name),
+        Cmd::Delete { name } => simple(opts, "delete", json!({"name": name}), "deleted"),
+        Cmd::Tree => cmd_tree(opts),
+        Cmd::Diff { a, b } => cmd_diff(opts, &a, &b),
+        Cmd::Gc => cmd_gc(opts),
+        Cmd::Validate { name, n } => cmd_validate(opts, &name, n),
+        Cmd::Screenshot { path } => simple(opts, "screenshot", json!({"path": path.display().to_string()}), "screenshot"),
+
+        Cmd::SerialSend { text, no_cr } => {
+            let data = if no_cr { text } else { format!("{}\r", text) };
+            simple(opts, "serial-send", json!({"data": data}), "sent")
+        }
+        Cmd::SerialRead => 
cmd_serial_read(opts),
+        Cmd::SerialWait { pattern, timeout } => cmd_serial_wait(opts, &pattern, timeout * 1000),
+
+        Cmd::Boot { timeout } => cmd_boot(opts, timeout),
+        Cmd::Login { user, password } => cmd_login(opts, &user, password.as_deref()),
+        Cmd::Run { command, shell, timeout } => cmd_run(opts, &command, &shell, timeout * 1000),
+        Cmd::WaitPrompt { timeout } => cmd_wait_prompt(opts, timeout * 1000),
+
+        Cmd::Put { host_path, to, timeout } => cmd_put(opts, &host_path, to.as_deref(), timeout * 1000),
+        Cmd::Get { guest_path, to, timeout } => cmd_get(opts, &guest_path, to.as_deref(), timeout * 1000),
+
+        Cmd::Scratch(s) => cmd_scratch(opts, s),
+
+        Cmd::Pull { url, name } => cmd_pull(opts, &url, &name),
+        Cmd::Push { url, name } => cmd_push(opts, &url, &name),
+
+        Cmd::Script { path } => cmd_script(opts, &path),
+    }
+}
+
+// ---- error type --------------------------------------------------------------
+
+#[derive(Debug)]
+enum Error {
+    Local(String),
+    Connection(std::io::Error),
+    Iris(String),
+}
+type Result<T> = std::result::Result<T, Error>;
+impl From<std::io::Error> for Error {
+    fn from(e: std::io::Error) -> Self { Error::Connection(e) }
+}
+
+struct Opts {
+    socket: PathBuf,
+    json: bool,
+    quiet: bool,
+}
+
+// ---- socket client -----------------------------------------------------------
+
+/// Send one JSON command, return the parsed response data on `ok:true`,
+/// or an Error on connection failure or `ok:false`.
+///
+/// Protocol detail: the server (`src/ci.rs::handle_client`) keeps the
+/// connection open and reads requests in a loop, expecting the client to
+/// close. We send our single request, then read exactly one newline-
+/// terminated response line, then drop the stream — the server's reader
+/// loop sees EOF and exits cleanly.
+fn send(opts: &Opts, cmd: &str, args: Value) -> Result<Value> {
+    let s = UnixStream::connect(&opts.socket)?;
+    s.set_read_timeout(Some(Duration::from_secs(300))).ok();
+    let req = json!({"cmd": cmd, "args": args});
+    let line = format!("{}\n", serde_json::to_string(&req).expect("json"));
+    {
+        let mut writer = s.try_clone()?;
+        writer.write_all(line.as_bytes())?;
+        writer.flush()?;
+    }
+    // Read exactly one line of response.
+    let mut reader = BufReader::new(s.try_clone()?);
+    let mut buf = String::new();
+    reader.read_line(&mut buf)?;
+    // Tell the server we're done so its read loop exits.
+    let _ = s.shutdown(Shutdown::Both);
+    let trimmed = buf.trim();
+    if trimmed.is_empty() {
+        return Err(Error::Iris("empty response".into()));
+    }
+    let resp: Value = serde_json::from_str(trimmed).map_err(|e| {
+        Error::Iris(format!("bad response: {}: {}", e, trimmed))
+    })?;
+    if resp.get("ok").and_then(|v| v.as_bool()) != Some(true) {
+        let msg = resp
+            .get("error")
+            .and_then(|v| v.as_str())
+            .unwrap_or("unknown error");
+        return Err(Error::Iris(format!("{}: {}", cmd, msg)));
+    }
+    Ok(resp.get("data").cloned().unwrap_or(Value::Null))
+}
+
+// ---- 1:1 commands with pretty output -----------------------------------------
+
+fn simple(opts: &Opts, cmd: &str, args: Value, ok_msg: &str) -> Result<()> {
+    let data = send(opts, cmd, args)?;
+    print_response(opts, ok_msg, &data);
+    Ok(())
+}
+
+fn print_response(opts: &Opts, ok_msg: &str, data: &Value) {
+    if opts.json {
+        println!("{}", serde_json::to_string_pretty(data).unwrap_or_else(|_| data.to_string()));
+        return;
+    }
+    if opts.quiet { return; }
+    if data.is_null() || (data.is_object() && data.as_object().map(|m| m.is_empty()).unwrap_or(true)) {
+        println!("{}", ok_msg);
+    } else {
+        println!("{}: {}", ok_msg, data);
+    }
+}
+
+fn cmd_list(opts: &Opts) -> Result<()> {
+    let data = send(opts, "list", json!({}))?;
+    if opts.json { println!("{}", serde_json::to_string_pretty(&data).unwrap_or_default()); return Ok(()); }
+    if let 
Some(arr) = data.get("snapshots").and_then(|v| v.as_array()) {
+        for s in arr {
+            if let Some(name) = s.as_str() {
+                println!("{}", name);
+            }
+        }
+    }
+    Ok(())
+}
+
+fn cmd_info(opts: &Opts, name: &str) -> Result<()> {
+    let data = send(opts, "info", json!({"name": name}))?;
+    if opts.json { println!("{}", serde_json::to_string_pretty(&data).unwrap_or_default()); return Ok(()); }
+    let f = |k: &str| data.get(k).cloned().unwrap_or(Value::Null);
+    println!("name {}", f("name"));
+    println!("schema_version {}", f("schema_version"));
+    println!("host_arch {}", f("host_arch"));
+    println!("created_at_unix {}", f("created_at_unix"));
+    println!("bytes_on_disk {}", f("bytes_on_disk"));
+    if let Some(p) = data.get("parent") { if !p.is_null() { println!("parent {}", p); } }
+    if let Some(d) = data.get("description") { if !d.is_null() { println!("description {}", d); } }
+    if let Some(b) = data.get("installed_bundles") { println!("installed {}", b); }
+    Ok(())
+}
+
+fn cmd_tree(opts: &Opts) -> Result<()> {
+    let data = send(opts, "tree", json!({}))?;
+    if opts.json { println!("{}", serde_json::to_string_pretty(&data).unwrap_or_default()); return Ok(()); }
+    if let Some(t) = data.get("tree").and_then(|v| v.as_str()) {
+        println!("{}", t);
+    }
+    Ok(())
+}
+
+fn cmd_diff(opts: &Opts, a: &str, b: &str) -> Result<()> {
+    let data = send(opts, "diff", json!({"a": a, "b": b}))?;
+    if opts.json { println!("{}", serde_json::to_string_pretty(&data).unwrap_or_default()); return Ok(()); }
+    println!("diff {} → {}", a, b);
+    if let Some(arr) = data.get("devices_changed").and_then(|v| v.as_array()) {
+        let names: Vec<String> = arr.iter().filter_map(|v| v.as_str().map(str::to_string)).collect();
+        if !names.is_empty() {
+            println!(" devices changed: {}", names.join(", "));
+        }
+    }
+    if let Some(arr) = data.get("devices_unchanged").and_then(|v| v.as_array()) {
+        let names: Vec<String> = arr.iter().filter_map(|v| v.as_str().map(str::to_string)).collect();
+        if !names.is_empty() {
+            println!(" devices 
unchanged: {}", names.join(", ")); + } + } + if let (Some(c), Some(t)) = ( + data.get("bank_changed_chunks").and_then(|v| v.as_array()), + data.get("bank_total_chunks").and_then(|v| v.as_array()), + ) { + for i in 0..c.len().min(t.len()) { + let c = c[i].as_u64().unwrap_or(0); + let t = t[i].as_u64().unwrap_or(0); + if t > 0 { + println!(" bank{}: {}/{} chunks changed", i, c, t); + } + } + } + if let Some(arr) = data.get("cow_diff").and_then(|v| v.as_array()) { + for entry in arr { + let id = entry.get("scsi_id").and_then(|v| v.as_u64()).unwrap_or(0); + let only_a = entry.get("only_a").and_then(|v| v.as_u64()).unwrap_or(0); + let only_b = entry.get("only_b").and_then(|v| v.as_u64()).unwrap_or(0); + let both = entry.get("both").and_then(|v| v.as_u64()).unwrap_or(0); + println!(" scsi{}: only-a={} only-b={} both={}", id, only_a, only_b, both); + } + } + Ok(()) +} + +fn cmd_gc(opts: &Opts) -> Result<()> { + let data = send(opts, "gc", json!({}))?; + if opts.json { println!("{}", serde_json::to_string_pretty(&data).unwrap_or_default()); return Ok(()); } + let removed = data.get("removed_chunks").and_then(|v| v.as_u64()).unwrap_or(0); + let bytes = data.get("bytes_freed").and_then(|v| v.as_u64()).unwrap_or(0); + let live = data.get("live_chunks").and_then(|v| v.as_u64()).unwrap_or(0); + println!("gc: {} chunks removed, {} bytes freed, {} live", removed, bytes, live); + Ok(()) +} + +fn cmd_validate(opts: &Opts, name: &str, n: u64) -> Result<()> { + let data = send(opts, "validate", json!({"name": name, "n_instructions": n}))?; + if opts.json { println!("{}", serde_json::to_string_pretty(&data).unwrap_or_default()); return Ok(()); } + if let Some(s) = data.get("summary").and_then(|v| v.as_str()) { + println!("{}", s); + } + if data.get("deterministic").and_then(|v| v.as_bool()) != Some(true) { + // Validation surfaced a real divergence — exit with iris-error code so + // scripts can branch on it. 
+ return Err(Error::Iris("non-deterministic".into())); + } + Ok(()) +} + +// ---- serial helpers ---------------------------------------------------------- + +fn cmd_serial_read(opts: &Opts) -> Result<()> { + let data = send(opts, "serial-read", json!({}))?; + if let Some(s) = data.as_str() { + if !s.is_empty() { + // Re-render \r\n cleanly — IRIX uses CRLF on the wire. + print!("{}", s.replace("\r\n", "\n").replace('\r', "\n")); + } + } + Ok(()) +} + +fn cmd_serial_wait(opts: &Opts, pattern: &str, timeout_ms: u64) -> Result<()> { + let data = send(opts, "wait-serial", json!({"pattern": pattern, "timeout_ms": timeout_ms}))?; + if let Some(s) = data.as_str() { + if !opts.quiet { + print!("{}", s.replace("\r\n", "\n").replace('\r', "\n")); + } + } + Ok(()) +} + +// ---- boot/login/run macros -------------------------------------------------- + +fn cmd_boot(opts: &Opts, timeout_s: u64) -> Result<()> { + let deadline = Instant::now() + Duration::from_secs(timeout_s); + if !opts.quiet { eprintln!("boot: starting CPU"); } + send(opts, "start", json!({}))?; + if !opts.quiet { eprintln!("boot: waiting for PROM menu"); } + wait_with_deadline(opts, "Option?", deadline)?; + if !opts.quiet { eprintln!("boot: PROM reached, selecting 1) Start System"); } + send(opts, "serial-send", json!({"data": "1\r"}))?; + if !opts.quiet { eprintln!("boot: waiting for kernel boot to login prompt"); } + wait_with_deadline(opts, "IRIS console login", deadline)?; + if !opts.quiet { eprintln!("boot: ready at login"); } + Ok(()) +} + +fn cmd_login(opts: &Opts, user: &str, password: Option<&str>) -> Result<()> { + send(opts, "serial-send", json!({"data": format!("{}\r", user)}))?; + // IRIX presents `TERM = (vt100)` after the username; pressing enter accepts. 
+ std::thread::sleep(Duration::from_millis(2000)); + if let Some(p) = password { + send(opts, "wait-serial", json!({"pattern": "Password:", "timeout_ms": 5000}))?; + send(opts, "serial-send", json!({"data": format!("{}\r", p)}))?; + } + send(opts, "serial-send", json!({"data": "\r"}))?; + let deadline = Instant::now() + Duration::from_secs(15); + wait_with_deadline(opts, "#", deadline)?; + if !opts.quiet { eprintln!("login: shell ready"); } + Ok(()) +} + +fn cmd_wait_prompt(opts: &Opts, timeout_ms: u64) -> Result<()> { + let deadline = Instant::now() + Duration::from_millis(timeout_ms); + wait_with_deadline(opts, PROMPT_RE, deadline)?; + Ok(()) +} + +/// Run a command and return captured stdout + exit code. Internal helper +/// shared by `cmd_run` (which prints stdout) and `cmd_get` (which parses it). +fn run_capture(opts: &Opts, command: &str, shell: &str, timeout_ms: u64) -> Result<(String, i32)> { + let rc_var = match shell { + "csh" | "tcsh" => "$status", + "sh" | "bash" | "ksh" => "$?", + other => return Err(Error::Local(format!("unknown shell {}", other))), + }; + // Drain anything stale before sending. + let _ = send(opts, "serial-read", json!({}))?; + let line = format!("{}; echo {}{}\r", command, RC_MARKER, rc_var); + send(opts, "serial-send", json!({"data": line}))?; + // Single wait: pattern `\nIRIS-CI-RC=` only matches at the start of the + // output line (the typed-input echo line has `IRIS-CI-RC=$status` inline, + // so it has no preceding newline immediately before the marker). + let pat = format!("\n{}", RC_MARKER); + let captured = send( + opts, + "wait-serial", + json!({"pattern": pat, "timeout_ms": timeout_ms}), + )?; + let raw = captured.as_str().unwrap_or("").to_string(); + // Drain trailing chars (rc digits + next prompt). 
+    std::thread::sleep(Duration::from_millis(150));
+    let trailing = send(opts, "serial-read", json!({}))?;
+    let trailing_s = trailing.as_str().unwrap_or("");
+    let rc = parse_rc(&format!("{}{}", RC_MARKER, trailing_s)).unwrap_or(-1);
+    let stdout = extract_run_stdout(&raw);
+    Ok((stdout, rc))
+}
+
+/// Send a command, wait for a sentinel, print stdout, fail on non-zero exit.
+/// csh: appends `; echo IRIS-CI-RC=$status`. sh: appends `; echo IRIS-CI-RC=$?`.
+fn cmd_run(opts: &Opts, command: &str, shell: &str, timeout_ms: u64) -> Result<()> {
+    let (stdout, rc) = run_capture(opts, command, shell, timeout_ms)?;
+    if !stdout.is_empty() {
+        println!("{}", stdout);
+    }
+    if rc != 0 {
+        return Err(Error::Iris(format!("guest exit {}", rc)));
+    }
+    Ok(())
+}
+
+/// `wait-serial` for `\nIRIS-CI-RC=` returns bytes shaped like:
+///
+///     <typed echo>\r\n<stdout>\r?\nIRIS-CI-RC=
+///
+/// Skip the first newline (end of the typed echo), strip the trailing
+/// pattern + its leading newline, normalise CRLF, return.
+fn extract_run_stdout(buf: &str) -> String {
+    // Drop the typed-echo line (everything up through and including its
+    // first \n).
+    let after_echo = match buf.find('\n') {
+        Some(i) => &buf[i + 1..],
+        None => buf,
+    };
+    // Drop the trailing `\nIRIS-CI-RC=` — the exact pattern we waited for.
+    let trimmed = match after_echo.rfind(RC_MARKER) {
+        Some(p) => &after_echo[..p],
+        None => after_echo,
+    };
+    trimmed
+        .trim_end_matches(['\r', '\n'])
+        .replace("\r\n", "\n")
+        .replace('\r', "\n")
+}
+
+/// Pull the digits after IRIS-CI-RC= out of a buffer.
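The digit scan described above can be exercised on its own; a hedged standalone sketch (editor's illustration only — `RC_MARKER` is assumed to be the literal `IRIS-CI-RC=` that the run commands echo):

```rust
// Standalone restatement of the rc-digit extraction contract: find the
// last marker occurrence, then take ASCII digits (plus a leading '-')
// until the first non-digit character.
const RC_MARKER: &str = "IRIS-CI-RC=";

fn parse_rc_sketch(buf: &str) -> Option<i32> {
    let pos = buf.rfind(RC_MARKER)?;
    let tail = &buf[pos + RC_MARKER.len()..];
    let digits: String = tail
        .chars()
        .take_while(|c| c.is_ascii_digit() || *c == '-')
        .collect();
    digits.parse().ok()
}

fn main() {
    // rfind means a repeated marker resolves to the most recent echo.
    assert_eq!(parse_rc_sketch("ls\r\nhello\r\nIRIS-CI-RC=0\r\n# "), Some(0));
    assert_eq!(parse_rc_sketch("IRIS-CI-RC=1\nIRIS-CI-RC=127"), Some(127));
    assert_eq!(parse_rc_sketch("no marker here"), None);
}
```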
+fn parse_rc(buf: &str) -> Option<i32> {
+    let pos = buf.rfind(RC_MARKER)?;
+    let tail = &buf[pos + RC_MARKER.len()..];
+    let digits: String = tail.chars().take_while(|c| c.is_ascii_digit() || *c == '-').collect();
+    digits.parse().ok()
+}
+
+fn wait_with_deadline(opts: &Opts, pattern: &str, deadline: Instant) -> Result<()> {
+    let now = Instant::now();
+    if now >= deadline {
+        return Err(Error::Iris(format!("wait {}: deadline already passed", pattern)));
+    }
+    let timeout_ms = (deadline - now).as_millis() as u64;
+    send(opts, "wait-serial", json!({"pattern": pattern, "timeout_ms": timeout_ms}))
+        .map(|_| ())
+}
+
+// ---- put / get (the bs=512 foot-gun killers) --------------------------------
+
+fn cmd_put(opts: &Opts, host_path: &std::path::Path, to: Option<&str>, timeout_ms: u64) -> Result<()> {
+    let bytes = std::fs::read(host_path)
+        .map_err(|e| Error::Local(format!("read {}: {}", host_path.display(), e)))?;
+    let basename = host_path
+        .file_name()
+        .and_then(|s| s.to_str())
+        .unwrap_or("inject.bin");
+    let guest_path = to
+        .map(String::from)
+        .unwrap_or_else(|| format!("/tmp/{}", basename));
+
+    // 1. Write to scratch volume payload area at offset 0.
+    let scratch_payload = send(opts, "scratch-info", json!({}))?;
+    let payload_size = scratch_payload
+        .get("payload_size_bytes")
+        .and_then(|v| v.as_u64())
+        .ok_or_else(|| Error::Iris("scratch-info: no payload_size_bytes".into()))?;
+    if (bytes.len() as u64) > payload_size {
+        return Err(Error::Local(format!(
+            "{} bytes too large for {} byte scratch payload",
+            bytes.len(),
+            payload_size
+        )));
+    }
+    let host_path_for_socket = host_path.canonicalize()
+        .unwrap_or_else(|_| host_path.to_path_buf());
+    send(
+        opts,
+        "scratch-write",
+        json!({"host_path": host_path_for_socket.display().to_string()}),
+    )?;
+    if !opts.quiet {
+        eprintln!("put: {} bytes staged in scratch", bytes.len());
+    }
+
+    // 2. Drive the guest to read exactly the right number of 512-byte sectors.
+ // Use `>&` for combined stderr+stdout (csh syntax — `2>&1` is sh-only). + // cmd_run wraps with `; echo IRIS-CI-RC=$status` itself. + let sectors = (bytes.len() as u64).div_ceil(512); + let dd_cmd = format!( + "dd if=/dev/rdsk/dks0d2s0 of={} bs=512 count={} >& /dev/null", + guest_path, sectors + ); + cmd_run(opts, &dd_cmd, "csh", timeout_ms)?; + + // 3. Truncate the guest file to the original byte length (dd reads in + // sector multiples, so a 28-byte input becomes 512 bytes on the guest). + // `dd of=FILE bs=1 seek=N count=0` is POSIX and IRIX-clean. + let dd_trunc = format!( + "dd if=/dev/null of={} bs=1 seek={} count=0 >& /dev/null", + guest_path, + bytes.len() + ); + cmd_run(opts, &dd_trunc, "csh", 10_000)?; + + if !opts.quiet { + eprintln!("put: {} → {} ({} bytes)", host_path.display(), guest_path, bytes.len()); + } + Ok(()) +} + +fn cmd_get(opts: &Opts, guest_path: &str, to: Option<&std::path::Path>, timeout_ms: u64) -> Result<()> { + let host_path: PathBuf = match to { + Some(p) => p.to_path_buf(), + None => { + let basename = guest_path + .rsplit('/') + .next() + .filter(|s| !s.is_empty()) + .unwrap_or("captured.bin"); + PathBuf::from(basename) + } + }; + + // 1. Zero scratch payload so trailing zeros after the file are unambiguous. + send(opts, "scratch-clear", json!({}))?; + + // 2. Drive the guest to write the file to scratch with conv=sync padding. + // csh redirect syntax: `>&` for stdout+stderr. cmd_run adds the + // rc-marker echo itself. + let dd_cmd = format!( + "dd if={} of=/dev/rdsk/dks0d2s0 bs=512 conv=sync,notrunc >& /dev/null", + guest_path + ); + cmd_run(opts, &dd_cmd, "csh", timeout_ms)?; + + // 3. Look up the guest file size so we know how much to slice off the + // scratch payload (which is now padded to a 512-byte boundary). Use + // a pure-shell approach: `wc -c` outputs just the byte count. + // `awk` is also available but `wc -c` is simpler to parse. 
+    let stat_cmd = format!("wc -c < {}", guest_path);
+    let (stat_stdout, stat_rc) = run_capture(opts, &stat_cmd, "csh", 10_000)?;
+    if stat_rc != 0 {
+        return Err(Error::Iris(format!(
+            "guest stat of {} failed (exit {})", guest_path, stat_rc
+        )));
+    }
+    let size_bytes = stat_stdout
+        .lines()
+        .filter_map(|l| l.trim().parse::<u64>().ok())
+        .next()
+        .ok_or_else(|| Error::Iris(format!(
+            "couldn't parse byte count from `wc -c < {}`: {:?}",
+            guest_path, stat_stdout
+        )))?;
+
+    // 4. Read the exact number of bytes back from scratch.
+    let host_abs = std::path::absolute(&host_path).unwrap_or_else(|_| host_path.clone());
+    send(
+        opts,
+        "scratch-read",
+        json!({
+            "to_path": host_abs.display().to_string(),
+            "length": size_bytes,
+            "offset": 0,
+        }),
+    )?;
+    if !opts.quiet {
+        eprintln!(
+            "get: {} ({} bytes) → {}",
+            guest_path,
+            size_bytes,
+            host_path.display()
+        );
+    }
+    Ok(())
+}
+
+// ---- scratch raw -------------------------------------------------------------
+
+fn cmd_scratch(opts: &Opts, s: ScratchCmd) -> Result<()> {
+    match s {
+        ScratchCmd::Write { path, offset } => {
+            let abs = path
+                .canonicalize()
+                .map_err(|e| Error::Local(format!("{}: {}", path.display(), e)))?;
+            simple(
+                opts,
+                "scratch-write",
+                json!({"host_path": abs.display().to_string(), "offset": offset}),
+                "wrote",
+            )
+        }
+        ScratchCmd::Read { path, offset, length } => {
+            let abs = std::path::absolute(&path).unwrap_or(path);
+            let mut args = json!({"to_path": abs.display().to_string(), "offset": offset});
+            if let Some(n) = length {
+                args["length"] = json!(n);
+            }
+            simple(opts, "scratch-read", args, "read")
+        }
+        ScratchCmd::Clear => simple(opts, "scratch-clear", json!({}), "cleared"),
+        ScratchCmd::Info => {
+            let data = send(opts, "scratch-info", json!({}))?;
+            if opts.json {
+                println!("{}", serde_json::to_string_pretty(&data).unwrap_or_default());
+            } else {
+                println!("path {}", data.get("path").cloned().unwrap_or(Value::Null));
+                println!("size_bytes {}",
data.get("size_bytes").cloned().unwrap_or(Value::Null)); + println!("payload_offset {}", data.get("payload_offset").cloned().unwrap_or(Value::Null)); + println!("payload_size_bytes {}", data.get("payload_size_bytes").cloned().unwrap_or(Value::Null)); + } + Ok(()) + } + } +} + +// ---- pull / push ------------------------------------------------------------- + +fn cmd_pull(opts: &Opts, url: &str, name: &str) -> Result<()> { + let data = send(opts, "pull", json!({"url": url, "name": name}))?; + if opts.json { println!("{}", serde_json::to_string_pretty(&data).unwrap_or_default()); return Ok(()); } + let f = |k: &str| data.get(k).and_then(|v| v.as_u64()).unwrap_or(0); + println!( + "pull {}: {} chunks fetched, {} skipped, {} files, {} bytes", + name, + f("chunks_fetched"), + f("chunks_skipped"), + f("files_transferred"), + f("bytes_transferred"), + ); + Ok(()) +} + +fn cmd_push(opts: &Opts, url: &str, name: &str) -> Result<()> { + let data = send(opts, "push", json!({"url": url, "name": name}))?; + if opts.json { println!("{}", serde_json::to_string_pretty(&data).unwrap_or_default()); return Ok(()); } + let f = |k: &str| data.get(k).and_then(|v| v.as_u64()).unwrap_or(0); + println!( + "push {}: {} chunks uploaded, {} skipped, {} files, {} bytes", + name, + f("chunks_uploaded"), + f("chunks_skipped"), + f("files_transferred"), + f("bytes_transferred"), + ); + Ok(()) +} + +// ---- script file mode -------------------------------------------------------- + +/// Parse a script line into argv tokens. Supports double-quoted strings with +/// `\"` and `\\` escapes — same surface as a typical shell so users can +/// write `run "echo hello"` without bash being involved. 
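For contrast, the failure mode the script tokenizer's quoting rules exist to avoid — naive whitespace splitting breaks a quoted argument into pieces (a runnable editor's illustration; per the contract above, `tokenize` yields two tokens for this input):

```rust
fn main() {
    // A script line with a quoted argument. The documented contract is
    // that this becomes two tokens: `run` and `echo hello`.
    let line = r#"run "echo hello""#;

    // Naive whitespace splitting instead yields three broken tokens,
    // which is why the script parser carries its own quote-aware lexer.
    let naive: Vec<&str> = line.split_whitespace().collect();
    assert_eq!(naive, vec!["run", "\"echo", "hello\""]);
}
```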
+fn tokenize(line: &str) -> std::result::Result<Vec<String>, String> {
+    let mut out = Vec::new();
+    let mut cur = String::new();
+    let mut in_quote = false;
+    let mut escape = false;
+    let mut started = false;
+    for c in line.chars() {
+        if escape {
+            cur.push(c);
+            escape = false;
+            continue;
+        }
+        if in_quote {
+            match c {
+                '\\' => escape = true,
+                '"' => {
+                    in_quote = false;
+                    out.push(std::mem::take(&mut cur));
+                    started = false;
+                }
+                _ => cur.push(c),
+            }
+            continue;
+        }
+        match c {
+            '"' => { in_quote = true; started = true; }
+            ' ' | '\t' => {
+                if started { out.push(std::mem::take(&mut cur)); started = false; }
+            }
+            _ => { cur.push(c); started = true; }
+        }
+    }
+    if in_quote { return Err("unterminated quote".into()); }
+    if started { out.push(cur); }
+    Ok(out)
+}
+
+fn cmd_script(opts: &Opts, path: &std::path::Path) -> Result<()> {
+    let text = std::fs::read_to_string(path)
+        .map_err(|e| Error::Local(format!("read {}: {}", path.display(), e)))?;
+    let mut overall_failed = false;
+    for (lineno, raw) in text.lines().enumerate() {
+        let line = raw.trim();
+        if line.is_empty() || line.starts_with('#') {
+            continue;
+        }
+        let tokens = tokenize(line).map_err(|e| Error::Local(format!("line {}: {}", lineno + 1, e)))?;
+        if tokens.is_empty() {
+            continue;
+        }
+
+        // Re-parse via clap to dispatch.
+        let mut argv = vec!["iris-ci".to_string()];
+        argv.extend(tokens.iter().cloned());
+        let cli = match Cli::try_parse_from(&argv) {
+            Ok(c) => c,
+            Err(e) => {
+                eprintln!("[line {}] parse error: {}", lineno + 1, e);
+                overall_failed = true;
+                break;
+            }
+        };
+
+        // Inherit our --socket / --json / --quiet from the outer invocation.
+        let sub_opts = Opts {
+            socket: opts.socket.clone(),
+            json: opts.json || cli.json,
+            quiet: opts.quiet || cli.quiet,
+        };
+
+        let pretty = format_step(line);
+        let t = Instant::now();
+        let res = dispatch(&sub_opts, cli.cmd);
+        let elapsed = t.elapsed();
+        match res {
+            Ok(()) => {
+                if !opts.quiet {
+                    println!("[ok {:>6.0?}] {}", elapsed, pretty);
+                }
+            }
+            Err(e) => {
+                eprintln!("[FAIL {:>6.0?}] {}: {:?}", elapsed, pretty, e);
+                overall_failed = true;
+                break;
+            }
+        }
+    }
+    if overall_failed {
+        return Err(Error::Local("script aborted on error".into()));
+    }
+    Ok(())
+}
+
+/// Truncate long script lines for display.
+fn format_step(line: &str) -> String {
+    if line.len() <= 72 {
+        line.to_string()
+    } else {
+        // chars().take keeps the cut safe on multi-byte UTF-8 input, where
+        // a byte-index slice could panic off a char boundary.
+        format!("{}…", line.chars().take(71).collect::<String>())
+    }
+}
diff --git a/src/lib.rs b/src/lib.rs
index 47fe9f6..210d77c 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -43,6 +43,11 @@ pub mod disp;
 pub mod exp;
 pub mod gdb_stub;
 pub mod snapshot;
+pub mod sgi_vh;
+pub mod chunk_store;
+pub mod validate;
+pub mod registry;
+pub mod ci;
 pub mod hptimer;
 pub mod hptimer_tests;
 pub mod vga_font;
diff --git a/src/machine.rs b/src/machine.rs
index 38d17da..1cc9e6f 100644
--- a/src/machine.rs
+++ b/src/machine.rs
@@ -31,7 +31,8 @@ use crate::hpc3::Hpc3;
 use crate::ioc::Ioc;
 use crate::monitor::Monitor;
 use crate::rex3::Rex3;
-use crate::snapshot::Snapshot;
+use crate::snapshot::{Snapshot, Manifest, SCHEMA_VERSION, ChunksManifest};
+use crate::chunk_store::{ChunkStore, get_chunks_as_words, put_words_as_chunks};
 use crate::hptimer::TimerManager;
 
 pub fn emulator_name() -> &'static str {
@@ -69,10 +70,64 @@ pub struct Machine {
     pub event_tx: mpsc::SyncSender,
     event_rx: Option>,
     timer_manager: Arc,
+    /// When `cfg.ci` is set, the channel-B backend is replaced by this
+    /// in-process one so the CI control socket can drive the console.
+    ci_serial: Option<Arc<crate::z85c30::CiSerialBackend>>,
+    /// Most recent snapshot restored via `ci_restore`. `rollback` reuses this
+    /// name as the fallback path if the in-memory checkpoint is absent.
+    last_restore: Option<String>,
+    /// In-memory copy of the just-loaded state, taken at the end of every
+    /// successful `ci_restore`. Lets `ci_rollback` skip disk IO and TOML
+    /// re-parsing — paste back the cached `toml::Value`s and `memcpy` the
+    /// bank/framebuffer buffers. Cleared on any explicit `load_snapshot`
+    /// outside the CI path.
+    last_restore_checkpoint: Option<RollbackCheckpoint>,
+    /// Path of the configured scratch SCSI volume, if any. The CI socket reads
+    /// and writes this file directly (with the machine briefly stopped) to
+    /// inject/exfiltrate files without going through the network. None when no
+    /// SCSI device has `scratch = true` set in the config.
+    scratch_path: Option<std::path::PathBuf>,
+}
+
+/// In-memory snapshot of the just-restored guest state. Populated at the end
+/// of `ci_restore`; consumed by `ci_rollback`. Trades ~270 MB of RSS for
+/// disk-IO-free rollback.
+struct RollbackCheckpoint {
+    /// Snapshot directory (saves/<name>/) — re-used by rollback to reflink
+    /// the COW overlays back into place.
+    overlay_dir: std::path::PathBuf,
+    /// Per-SCSI-id dirty sector lists from cow.toml at the time of restore.
+    overlay_sets: Vec<(usize, Vec<u64>)>,
+
+    /// Native-endian RAM bank words. `bank_words[i].len() ==
+    /// banks[i].size_bytes / 4` for present banks; populated for all four.
+    bank_words: [Vec<u32>; 4],
+
+    /// Framebuffer contents (RGB, aux). `None` when running headless.
+    framebuffers: Option<(Vec<u32>, Vec<u32>)>,
+
+    /// Parsed device save_state TOMLs. Holding `toml::Value` directly skips
+    /// the ~80 ms cpu.toml string-parse cost on every rollback.
+    cpu: toml::Value,
+    mc: toml::Value,
+    ioc: toml::Value,
+    scc: toml::Value,
+    pit: toml::Value,
+    ps2: toml::Value,
+    rtc: toml::Value,
+    eeprom: toml::Value,
+    scsi: toml::Value,
+    seeq: toml::Value,
+    hpc3: toml::Value,
+    rex3: Option<toml::Value>,
+}
 
 impl Machine {
     pub fn new(cfg: MachineConfig) -> Self {
+        // Capture config flags that are needed after the local `cfg` binding
+        // is shadowed later in this function.
+        let ci_enabled = cfg.ci;
+
         // 0. Shared EEPROM
         let eeprom = Arc::new(Mutex::new(Eeprom93c56::new()));
 
@@ -102,8 +157,22 @@ impl Machine {
         let l1i_fetch_count = Arc::new(AtomicU64::new(0)); // L1-I fetch counter
         let uncached_fetch_count = Arc::new(AtomicU64::new(0)); // uncached instruction fetches
 
-        // HPC3 (512KB at 0x1FB80000)
-        let ioc = Ioc::new(true);
+        // HPC3 (512KB at 0x1FB80000). CI mode skips the SCC TCP backend
+        // bindings so multiple `--ci` instances can coexist.
+        let ioc = if ci_enabled { Ioc::new_ci(true) } else { Ioc::new(true) };
+
+        // CI mode replaces the default TCP backend on channel B (tty1, the
+        // SGI serial console) with an in-process backend the control socket
+        // drives directly. Channel A (tty2) keeps its default TCP backend.
+        // Must happen before any peripheral `start()` call (which clones the
+        // current backend Arc into the RX/TX threads).
+        let ci_serial = if ci_enabled {
+            let b = Arc::new(crate::z85c30::CiSerialBackend::new());
+            ioc.scc().set_backend_b(b.clone());
+            Some(b)
+        } else {
+            None
+        };
         let timer_manager = Arc::new(TimerManager::new());
         ioc.set_timer_manager(timer_manager.clone());
         ioc.set_heartbeat(heartbeat.clone());
@@ -113,23 +182,68 @@ impl Machine {
         // Attach SCSI devices from config (IDs 1–7).
         let mut scsi_ids: Vec = cfg.scsi.keys().copied().collect();
         scsi_ids.sort();
+        // CI mode: isolate each COW overlay under /tmp so an interactive
+        // iris holding {base}.overlay can coexist with any number of `--ci`
+        // processes. Files are kept for post-mortem inspection; cleanup
+        // happens on machine drop below.
+        let ci_pid = std::process::id();
+        // Track the on-disk path of any scratch device so the CI socket can
+        // read/write its bytes directly (Phase 2.4).
+        let mut scratch_path: Option<std::path::PathBuf> = None;
         for id in scsi_ids {
             let dev = &cfg.scsi[&id];
-            // For CD-ROMs: build ordered disc list; first entry is mounted now.
-            // For HDDs: disc list is unused (empty).
+ // Scratch volume: pre-create a raw file with a minimal SGI Volume + // Header if it doesn't exist. Refuse cdrom/overlay combinations — + // scratch must be a host-writable raw file. Default size 64 MB. + // + // The VH lays out partition 7 ("vol") spanning sectors 8..end and + // partition 8 ("vh") spanning sectors 0..7 (the VH itself). + // Without a VH, IRIX recognises the device but returns I/O error + // on every read because /dev/rdsk/dks0dNvh and /dev/rdsk/dks0dNvol + // both consult the partition table at sector 0. + // + // Convention: host writes payload via scratch-write at offset >= + // SCRATCH_PAYLOAD_OFFSET (4096). Guest reads from offset 0 of + // /dev/rdsk/dks0dNvol (which maps to sector 8 of the disk by + // partition 7's first_block=8). + if dev.scratch { + if dev.cdrom || dev.overlay { + println!("Note: SCSI ID {}: scratch=true is incompatible with cdrom/overlay; ignoring scratch flag", id); + } else { + let path = std::path::Path::new(&dev.path); + if !path.exists() { + let size_mb = dev.size_mb.unwrap_or(64) as u64; + let bytes = size_mb * 1024 * 1024; + match crate::sgi_vh::create_scratch_image(path, bytes) { + Ok(()) => println!("iris: created scratch volume {} ({} MB, with SGI VH)", dev.path, size_mb), + Err(e) => println!("Note: could not create scratch volume {}: {}", dev.path, e), + } + } + if scratch_path.is_some() { + println!("Note: multiple scratch SCSI devices configured; CI socket will use the lowest-id one"); + } else { + scratch_path = Some(path.to_path_buf()); + } + } + } let (path, discs) = if dev.cdrom { let mut list = dev.discs.clone(); if list.is_empty() { list.push(dev.path.clone()); } else if list[0] != dev.path { - // Ensure path is front of list if explicitly set list.insert(0, dev.path.clone()); } (list[0].clone(), list) } else { (dev.path.clone(), vec![]) }; - if let Err(e) = hpc3.add_scsi_device(id as usize, &path, dev.cdrom, discs, dev.overlay) { + let result = if ci_enabled && dev.overlay && !dev.cdrom { + let 
ci_overlay = format!("/tmp/iris-ci-{}-scsi{}.overlay", ci_pid, id);
+                hpc3.add_scsi_device_with_overlay(id as usize, &path, dev.cdrom, discs, dev.overlay, &ci_overlay)
+            } else {
+                hpc3.add_scsi_device(id as usize, &path, dev.cdrom, discs, dev.overlay)
+            };
+            if let Err(e) = result {
                 println!("Note: Could not attach {} to SCSI ID {}: {}", path, id, e);
             }
         }
@@ -292,7 +406,35 @@ impl Machine {
             event_tx,
             event_rx: Some(event_rx),
             timer_manager,
+            ci_serial,
+            last_restore: None,
+            last_restore_checkpoint: None,
+            scratch_path,
+        }
+    }
+
+    /// Path of the configured scratch SCSI volume, if any. Used by the CI
+    /// socket scratch-{write,read,clear,info} commands to act on the file
+    /// directly while the machine is briefly stopped.
+    pub fn scratch_path(&self) -> Option<&std::path::Path> {
+        self.scratch_path.as_deref()
+    }
+
+    /// Briefly stop the machine, run `work`, then restart peripherals and the
+    /// CPU only if it was running before. Used by the scratch-write/read/clear
+    /// CI commands to mutate the scratch file without racing the SCSI device's
+    /// in-flight reads. CPU stays stopped if the harness hasn't called `start`
+    /// yet — a file injected before boot stays injected, the CPU doesn't get
+    /// auto-started.
+    pub fn with_paused<R>(&mut self, work: impl FnOnce() -> R) -> R {
+        let was_running = self.cpu.is_running();
+        self.stop();
+        let r = work();
+        self.restart_peripherals();
+        if was_running {
+            self.cpu.start();
         }
+        r
     }
 
     pub fn start(&mut self) {
@@ -301,10 +443,19 @@ impl Machine {
         self.hpc3.start();
         if let Some(rex3) = &self._phys.rex3 { rex3.start(); }
 
-        // Start monitor server on localhost:8888
-        self.monitor.clone().start_server("127.0.0.1:8888".to_string());
+        // Monitor server on localhost:8888. Skipped in CI mode — the control
+        // socket replaces it, and binding a fixed port would prevent parallel
+        // `--ci` instances.
+        if self.ci_serial.is_none() {
+            self.monitor.clone().start_server("127.0.0.1:8888".to_string());
+        }
+
+        // CI mode: the harness drives startup via `restore` / `start`. Don't
+        // autostart the CPU so the first command finds a quiet machine.
         #[cfg(not(any(debug_assertions, feature = "developer")))]
-        self.cpu.start();
+        if self.ci_serial.is_none() {
+            self.cpu.start();
+        }
     }
 
     /// Register a SystemController with the monitor so that `reset`, `save`,
@@ -424,6 +575,181 @@ impl Machine {
         MipsCpuDebugAdapter::new(self.cpu.clone())
     }
 
+    /// The in-process serial backend used by `--ci` mode. `None` in
+    /// interactive mode.
+    pub fn get_ci_serial(&self) -> Option<Arc<crate::z85c30::CiSerialBackend>> {
+        self.ci_serial.clone()
+    }
+
+    /// Start the CPU thread. Called explicitly by the CI `start` command or
+    /// by `ci_restore`; in `--ci` mode the CPU is not autostarted in
+    /// `start()` — the harness drives startup via `restore`.
+    pub fn cpu_start(&self) {
+        self.cpu.start();
+    }
+
+    /// Step the CPU `n` instructions in-line on the calling thread, with all
+    /// peripheral threads stopped so the CPU sees no external interrupts.
+    /// Used by the Phase 3.3 snapshot determinism validator.
+    /// Caller must arrange `load_snapshot_paused` first.
+    pub fn cpu_step_n_inline(&self, n: u64) -> Result {
+        self.cpu.step_n_inline(n)
+    }
+
+    /// Snapshot the deterministic-from-state CPU registers.
+    pub fn cpu_state_digest(&self) -> Result {
+        self.cpu.state_digest()
+    }
+
+    /// Full rewind: load the named snapshot, which now captures the COW
+    /// overlay too so the filesystem state is deterministic per snapshot.
+    /// The CPU resumes automatically (load_snapshot restarts it). After the
+    /// load, an in-memory checkpoint of the just-restored state is taken so
+    /// the next `ci_rollback` can run without touching disk.
+    pub fn ci_restore(&mut self, name: &str) -> Result<(), String> {
+        // Clear any leftover serial bytes from the previous run so the
+        // next command doesn't see stale output.
+ if let Some(ci) = &self.ci_serial { + ci.reset(); + } + + self.load_snapshot(name)?; + self.last_restore = Some(name.to_string()); + // Capture the rollback checkpoint. If this fails, the restore still + // succeeded — rollback will fall back to the disk path. + match self.capture_rollback_checkpoint(name) { + Ok(cp) => self.last_restore_checkpoint = Some(cp), + Err(e) => { + eprintln!("ci_restore: rollback checkpoint capture failed: {} — rollback will use the disk path", e); + self.last_restore_checkpoint = None; + } + } + Ok(()) + } + + /// Roll back to the state captured at the last `ci_restore`. Uses the + /// in-memory checkpoint when present; falls back to a disk reload if it's + /// absent (legacy snapshot loaded outside CI, or capture failed). + pub fn ci_rollback(&mut self) -> Result<(), String> { + if let Some(ci) = &self.ci_serial { + ci.reset(); + } + + // Take the checkpoint out so the apply path can hold &cp without + // borrowing self at the same time. Restored after apply so repeated + // rollbacks work. + let cp = match self.last_restore_checkpoint.take() { + Some(cp) => cp, + None => { + let name = self.last_restore.clone() + .ok_or_else(|| "no previous restore to roll back to".to_string())?; + eprintln!("ci_rollback: no in-memory checkpoint — falling back to disk reload"); + return self.ci_restore(&name); + } + }; + let result = self.apply_rollback_checkpoint(&cp); + self.last_restore_checkpoint = Some(cp); + result + } + + /// Capture in-memory state for fast rollback. Stops the CPU briefly. 
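The take-then-put-back move in `ci_rollback` above — lifting the checkpoint out of `self` so applying it doesn't alias a mutable borrow — is a general `Option::take` pattern. A minimal standalone sketch (the `Holder` type is hypothetical, not part of iris):

```rust
// Sketch of the Option::take pattern: move the cached value out of
// self, operate while &mut self is borrowable again, then restore the
// value so repeated calls keep working.
struct Holder {
    cached: Option<String>,
    applied: usize,
}

impl Holder {
    fn apply(&mut self, v: &str) {
        self.applied += v.len();
    }

    fn rollback(&mut self) -> Result<(), String> {
        let cp = self.cached.take().ok_or("no checkpoint")?;
        self.apply(&cp);        // &mut self is free: cp no longer lives in self
        self.cached = Some(cp); // put back for the next rollback
        Ok(())
    }
}

fn main() {
    let mut h = Holder { cached: Some("abcd".into()), applied: 0 };
    assert!(h.rollback().is_ok());
    assert!(h.rollback().is_ok()); // checkpoint was restored, so it works twice
    assert_eq!(h.applied, 8);

    let mut empty = Holder { cached: None, applied: 0 };
    assert!(empty.rollback().is_err());
}
```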
+    fn capture_rollback_checkpoint(&mut self, name: &str) -> Result<RollbackCheckpoint, String> {
+        self.stop();
+
+        let cpu = self.cpu.save_state();
+        let mc = self.mc.save_state();
+        let ioc = self.hpc3.ioc().save_state();
+        let scc = self.hpc3.ioc().scc().save_state();
+        let pit = self.hpc3.ioc().pit().save_state();
+        let ps2 = self.hpc3.ioc().ps2().save_state();
+        let rtc = self.hpc3.rtc().save_state();
+        let eeprom = self.hpc3.eeprom().lock().save_state_owned();
+        let scsi = self.hpc3.scsi().save_state();
+        let seeq = self.hpc3.seeq().save_state();
+        let hpc3 = self.hpc3.save_state();
+        let rex3 = self._phys.rex3.as_ref().map(|r| r.save_state());
+
+        let bank_words: [Vec<u32>; 4] = [
+            self._phys.snapshot_bank_inmem(0),
+            self._phys.snapshot_bank_inmem(1),
+            self._phys.snapshot_bank_inmem(2),
+            self._phys.snapshot_bank_inmem(3),
+        ];
+
+        let framebuffers = self._phys.rex3.as_ref()
+            .map(|r| r.snapshot_framebuffers_inmem());
+
+        // Re-read cow.toml so rollback knows which dirty sectors to import
+        // back. The file was just consumed by load_snapshot but it's tiny and
+        // re-reading from page cache is cheap (~µs).
+        let overlay_dir = std::path::PathBuf::from("saves").join(name);
+        let snap = Snapshot::new(&overlay_dir);
+        let mut overlay_sets: Vec<(usize, Vec<u64>)> = Vec::new();
+        if let Ok(cow_toml) = snap.read_toml("cow.toml") {
+            if let Some(tbl) = cow_toml.as_table() {
+                for (k, v) in tbl {
+                    let Some(id_str) = k.strip_prefix("scsi") else { continue };
+                    let Ok(id) = id_str.parse::<usize>() else { continue };
+                    let Some(arr) = v.as_array() else { continue };
+                    let dirty: Vec<u64> = arr.iter()
+                        .filter_map(|x| x.as_integer().map(|i| i as u64))
+                        .collect();
+                    overlay_sets.push((id, dirty));
+                }
+            }
+        }
+
+        self.restart_peripherals();
+        self.cpu.start();
+
+        Ok(RollbackCheckpoint {
+            overlay_dir,
+            overlay_sets,
+            bank_words,
+            framebuffers,
+            cpu, mc, ioc, scc, pit, ps2, rtc, eeprom, scsi, seeq, hpc3, rex3,
+        })
+    }
+
+    /// Apply an in-memory checkpoint, restoring the guest to the state at
+    /// the moment of capture. Skips disk IO and TOML string-parsing.
+    fn apply_rollback_checkpoint(&mut self, cp: &RollbackCheckpoint) -> Result<(), String> {
+        self.stop();
+        self.power_on_devices();
+
+        self.cpu.load_state(&cp.cpu)?;
+        self.mc.load_state(&cp.mc)?;
+        self.hpc3.ioc().load_state(&cp.ioc)?;
+        self.hpc3.ioc().scc().load_state(&cp.scc)?;
+        self.hpc3.ioc().pit().load_state(&cp.pit)?;
+        self.hpc3.ioc().ps2().load_state(&cp.ps2)?;
+        self.hpc3.rtc().load_state(&cp.rtc)?;
+        self.hpc3.eeprom().lock().load_state_mut(&cp.eeprom)?;
+        self.hpc3.scsi().load_state(&cp.scsi)?;
+        self.hpc3.seeq().load_state(&cp.seeq)?;
+        self.hpc3.load_state(&cp.hpc3)?;
+        if let (Some(rex3), Some(rex3_toml)) = (&self._phys.rex3, &cp.rex3) {
+            rex3.load_state(rex3_toml)?;
+        }
+
+        for (i, words) in cp.bank_words.iter().enumerate() {
+            self._phys.restore_bank_inmem(i, words);
+        }
+        if let (Some(rex3), Some((rgb, aux))) = (&self._phys.rex3, &cp.framebuffers) {
+            rex3.restore_framebuffers_inmem(rgb, aux);
+        }
+
+        // Reflink the overlay back into place.
saves//scsi*.overlay is + // unchanged by guest writes (writes go to the live overlay), so this + // can re-import directly. + self.hpc3.scsi().import_overlays(&cp.overlay_dir, &cp.overlay_sets) + .map_err(|e| format!("rollback: COW overlay import: {}", e))?; + + self.restart_peripherals(); + self.cpu.start(); + Ok(()) + } + /// Restart peripherals (MC, HPC3, REX3) without restarting the monitor server. fn restart_peripherals(&mut self) { self.mc.start(); @@ -477,135 +803,258 @@ impl Machine { let snap = Snapshot::new(&dir); snap.ensure_dir().map_err(|e| e.to_string())?; - // CPU + TLB - let cpu_toml = self.cpu.save_state(); - snap.write_toml("cpu.toml", &cpu_toml).map_err(|e| e.to_string())?; - - // Memory Controller - let mc_toml = self.mc.save_state(); - snap.write_toml("mc.toml", &mc_toml).map_err(|e| e.to_string())?; - - // IOC - let ioc_toml = self.hpc3.ioc().save_state(); - snap.write_toml("ioc.toml", &ioc_toml).map_err(|e| e.to_string())?; - - // SCC (Z85C30 serial) - let scc_toml = self.hpc3.ioc().scc().save_state(); - snap.write_toml("scc.toml", &scc_toml).map_err(|e| e.to_string())?; - - // PIT (8254 timer) - let pit_toml = self.hpc3.ioc().pit().save_state(); - snap.write_toml("pit.toml", &pit_toml).map_err(|e| e.to_string())?; - - // PS2 - let ps2_toml = self.hpc3.ioc().ps2().save_state(); - snap.write_toml("ps2.toml", &ps2_toml).map_err(|e| e.to_string())?; - - // RTC (DS1x86) - let rtc_toml = self.hpc3.rtc().save_state(); - snap.write_toml("rtc.toml", &rtc_toml).map_err(|e| e.to_string())?; - - // EEPROM (93C56) - let eeprom_toml = self.hpc3.eeprom().lock().save_state_owned(); - snap.write_toml("eeprom.toml", &eeprom_toml).map_err(|e| e.to_string())?; - - // SCSI (WD33C93A) - let scsi_toml = self.hpc3.scsi().save_state(); - snap.write_toml("scsi.toml", &scsi_toml).map_err(|e| e.to_string())?; - - // Seeq8003 (Ethernet) - let seeq_toml = self.hpc3.seeq().save_state(); - snap.write_toml("seeq.toml", &seeq_toml).map_err(|e| e.to_string())?; - - // HPC3 - 
let hpc3_toml = self.hpc3.save_state(); - snap.write_toml("hpc3.toml", &hpc3_toml).map_err(|e| e.to_string())?; - - // REX3 + // Write the manifest first so `read_manifest` succeeds even if a later + // step crashes — the partial snapshot is at least diagnosable. + let mut manifest = Manifest::for_current_save(); + manifest.parent = self.last_restore.clone(); + snap.write_manifest(&manifest).map_err(|e| e.to_string())?; + let sv = manifest.schema_version; + + // Device state — schema_version=2 writes *.bin (postcard-encoded + // BinValue tree); legacy writes *.toml. write_state encapsulates the + // choice so this orchestrator stays format-agnostic. + snap.write_state("cpu", &self.cpu.save_state(), sv).map_err(|e| e.to_string())?; + snap.write_state("mc", &self.mc.save_state(), sv).map_err(|e| e.to_string())?; + snap.write_state("ioc", &self.hpc3.ioc().save_state(), sv).map_err(|e| e.to_string())?; + snap.write_state("scc", &self.hpc3.ioc().scc().save_state(), sv).map_err(|e| e.to_string())?; + snap.write_state("pit", &self.hpc3.ioc().pit().save_state(), sv).map_err(|e| e.to_string())?; + snap.write_state("ps2", &self.hpc3.ioc().ps2().save_state(), sv).map_err(|e| e.to_string())?; + snap.write_state("rtc", &self.hpc3.rtc().save_state(), sv).map_err(|e| e.to_string())?; + snap.write_state("eeprom", &self.hpc3.eeprom().lock().save_state_owned(), sv).map_err(|e| e.to_string())?; + snap.write_state("scsi", &self.hpc3.scsi().save_state(), sv).map_err(|e| e.to_string())?; + snap.write_state("seeq", &self.hpc3.seeq().save_state(), sv).map_err(|e| e.to_string())?; + snap.write_state("hpc3", &self.hpc3.save_state(), sv).map_err(|e| e.to_string())?; + + // REX3 (optional — absent in headless config). Framebuffers are + // included in the chunks manifest below for v3+; v2 wrote them as + // standalone .bin files. 
if let Some(rex3) = &self._phys.rex3 { - let rex3_toml = rex3.save_state(); - snap.write_toml("rex3.toml", &rex3_toml).map_err(|e| e.to_string())?; - rex3.save_framebuffers(&snap.dir).map_err(|e| e.to_string())?; + snap.write_state("rex3", &rex3.save_state(), sv).map_err(|e| e.to_string())?; + if sv < 3 { + rex3.save_framebuffers(&snap.dir).map_err(|e| e.to_string())?; + } + } + + // Bulk memory: v3+ goes to the content-addressable chunk store + // shared across all snapshots in `saves/.cas/`. v2 (legacy) writes + // raw bank{N}.bin files. Chunk hashes go in chunks.bin so load can + // walk the right chunks back out. + if sv >= 3 { + let store = ChunkStore::new("saves"); + let mut chunks = ChunksManifest::default(); + for i in 0..4 { + let words = self._phys.snapshot_bank_inmem(i); + chunks.bank_chunks[i] = put_words_as_chunks(&store, &words) + .map_err(|e| format!("CAS bank{} put: {}", i, e))?; + } + if let Some(rex3) = &self._phys.rex3 { + let (rgb, aux) = rex3.snapshot_framebuffers_inmem(); + let rgb_chunks = put_words_as_chunks(&store, &rgb) + .map_err(|e| format!("CAS rex3 rgb put: {}", e))?; + let aux_chunks = put_words_as_chunks(&store, &aux) + .map_err(|e| format!("CAS rex3 aux put: {}", e))?; + chunks.framebuffer_chunks = Some((rgb_chunks, aux_chunks)); + } + snap.write_chunks_manifest(&chunks).map_err(|e| e.to_string())?; + } else { + for i in 0..4 { + self._phys.save_bank(i, dir.join(format!("bank{}.bin", i))).map_err(|e| e.to_string())?; + } } - // Bulk memory (raw binary, big-endian word layout) — 4 × 128MB banks - for i in 0..4 { - self._phys.save_bank(i, dir.join(format!("bank{}.bin", i))).map_err(|e| e.to_string())?; + // COW overlays per SCSI device, plus a `cow.toml` with the dirty + // sector set for each one. Keeps the on-disk filesystem state + // consistent with the captured RAM. 
+        let overlays = self.hpc3.scsi().export_overlays(&snap.dir)
+            .map_err(|e| format!("COW overlay export: {}", e))?;
+        let mut cow_tbl = toml::map::Map::new();
+        for (id, dirty) in overlays {
+            let arr: Vec<toml::Value> = dirty.into_iter()
+                .map(|v| toml::Value::Integer(v as i64))
+                .collect();
+            cow_tbl.insert(format!("scsi{}", id), toml::Value::Array(arr));
         }
+        snap.write_toml("cow.toml", &toml::Value::Table(cow_tbl))
+            .map_err(|e| e.to_string())?;

         self.restart_peripherals();
+        // Resume execution so the session feels like it never paused.
+        // Without this the user sees JIT shutdown stats and a dead prompt
+        // after `save` — the CPU would otherwise stay stopped.
+        self.cpu.start();
         println!("Snapshot saved to saves/{}", name);
         Ok(())
     }

-    /// Restore full machine snapshot from `saves/<name>/`.
+    /// Restore full machine snapshot from `saves/<name>/`. CPU is auto-started
+    /// at the end so the guest resumes from the snapshotted PC.
+    /// For determinism validation use `load_snapshot_paused` instead.
     pub fn load_snapshot(&mut self, name: &str) -> Result<(), String> {
+        self.load_snapshot_inner(name)?;
+        self.cpu.start();
+        println!("Snapshot loaded from saves/{}", name);
+        Ok(())
+    }
+
+    /// Same body as `load_snapshot` but leaves CPU and peripheral threads
+    /// stopped on return. Used by the Phase 3.3 determinism validator which
+    /// must prevent any thread from running between load and digest, since
+    /// thread scheduling jitter would mask CPU determinism issues.
+    pub fn load_snapshot_paused(&mut self, name: &str) -> Result<(), String> {
+        self.load_snapshot_inner(name)?;
+        // load_snapshot_inner restarted peripherals; stop them again.
+        self.hpc3.stop();
+        self.mc.stop();
+        if let Some(rex3) = &self._phys.rex3 { rex3.stop(); }
+        Ok(())
+    }
+
+    /// Restore full machine snapshot from `saves/<name>/`.
+    ///
+    /// JIT-cache invariant: `self.stop()` exits the CPU thread, which drops
+    /// the `CodeCache` owned by `run_jit_dispatch`. 
Subsequent `cpu.start()` + /// (in the public `load_snapshot` wrapper) builds a fresh cache. So no + /// explicit invalidation is needed here as long as that ownership + /// pattern holds. The persistent JIT profile uses content_hash to skip + /// stale entries (see `profile_stale` in dispatch.rs). + fn load_snapshot_inner(&mut self, name: &str) -> Result<(), String> { self.stop(); + // Any prior in-memory rollback checkpoint is now stale (it described + // a different snapshot). ci_restore will recapture if reached via + // that path; the monitor `load` command leaves it cleared. + self.last_restore_checkpoint = None; + // Reset to clean state before loading self.power_on_devices(); let dir = std::path::PathBuf::from("saves").join(name); let snap = Snapshot::new(&dir); - // CPU + TLB - let cpu_toml = snap.read_toml("cpu.toml").map_err(|e| e.to_string())?; - self.cpu.load_state(&cpu_toml)?; + // Validate the manifest before reading anything else. Legacy snapshots + // (no snapshot.toml) are accepted with a warning. Cross-arch loads are + // refused — FPU bit-layout differs between aarch64 and x86_64 and we + // don't have migration plumbing yet. + let schema_version = match snap.read_manifest()? 
{ + Some(m) => { + if m.host_arch != std::env::consts::ARCH { + return Err(format!( + "snapshot host_arch '{}' does not match current host '{}'; cross-arch load is not supported", + m.host_arch, std::env::consts::ARCH + )); + } + if m.schema_version > SCHEMA_VERSION { + return Err(format!( + "snapshot schema_version {} is newer than this iris build supports ({})", + m.schema_version, SCHEMA_VERSION + )); + } + if let Some(rev) = &m.iris_git_rev { + if let Some(my_rev) = option_env!("IRIS_GIT_REV") { + if rev != my_rev { + eprintln!("load_snapshot: snapshot was captured at iris {} but current build is {}", rev, my_rev); + } + } + } + m.schema_version + } + None => { + eprintln!("load_snapshot: no snapshot.toml in {} — treating as legacy v0 (no manifest)", dir.display()); + 0 + } + }; + + // Device state — read_state picks .bin (v2+) or .toml + // (legacy). v2 also falls back to .toml if .bin is absent. + let cpu = snap.read_state("cpu", schema_version).map_err(|e| e.to_string())?; + self.cpu.load_state(&cpu)?; - // Memory Controller - let mc_toml = snap.read_toml("mc.toml").map_err(|e| e.to_string())?; - self.mc.load_state(&mc_toml)?; + let mc = snap.read_state("mc", schema_version).map_err(|e| e.to_string())?; + self.mc.load_state(&mc)?; - // IOC - let ioc_toml = snap.read_toml("ioc.toml").map_err(|e| e.to_string())?; - self.hpc3.ioc().load_state(&ioc_toml)?; + let ioc = snap.read_state("ioc", schema_version).map_err(|e| e.to_string())?; + self.hpc3.ioc().load_state(&ioc)?; - // SCC (Z85C30 serial) - let scc_toml = snap.read_toml("scc.toml").map_err(|e| e.to_string())?; - self.hpc3.ioc().scc().load_state(&scc_toml)?; + let scc = snap.read_state("scc", schema_version).map_err(|e| e.to_string())?; + self.hpc3.ioc().scc().load_state(&scc)?; - // PIT (8254 timer) - let pit_toml = snap.read_toml("pit.toml").map_err(|e| e.to_string())?; - self.hpc3.ioc().pit().load_state(&pit_toml)?; + let pit = snap.read_state("pit", schema_version).map_err(|e| e.to_string())?; + 
self.hpc3.ioc().pit().load_state(&pit)?; - // PS2 - let ps2_toml = snap.read_toml("ps2.toml").map_err(|e| e.to_string())?; - self.hpc3.ioc().ps2().load_state(&ps2_toml)?; + let ps2 = snap.read_state("ps2", schema_version).map_err(|e| e.to_string())?; + self.hpc3.ioc().ps2().load_state(&ps2)?; - // RTC (DS1x86) - let rtc_toml = snap.read_toml("rtc.toml").map_err(|e| e.to_string())?; - self.hpc3.rtc().load_state(&rtc_toml)?; + let rtc = snap.read_state("rtc", schema_version).map_err(|e| e.to_string())?; + self.hpc3.rtc().load_state(&rtc)?; - // EEPROM (93C56) - let eeprom_toml = snap.read_toml("eeprom.toml").map_err(|e| e.to_string())?; - self.hpc3.eeprom().lock().load_state_mut(&eeprom_toml)?; + let eeprom = snap.read_state("eeprom", schema_version).map_err(|e| e.to_string())?; + self.hpc3.eeprom().lock().load_state_mut(&eeprom)?; - // SCSI (WD33C93A) - let scsi_toml = snap.read_toml("scsi.toml").map_err(|e| e.to_string())?; - self.hpc3.scsi().load_state(&scsi_toml)?; + let scsi = snap.read_state("scsi", schema_version).map_err(|e| e.to_string())?; + self.hpc3.scsi().load_state(&scsi)?; - // Seeq8003 (Ethernet) - let seeq_toml = snap.read_toml("seeq.toml").map_err(|e| e.to_string())?; - self.hpc3.seeq().load_state(&seeq_toml)?; + let seeq = snap.read_state("seeq", schema_version).map_err(|e| e.to_string())?; + self.hpc3.seeq().load_state(&seeq)?; - // HPC3 - let hpc3_toml = snap.read_toml("hpc3.toml").map_err(|e| e.to_string())?; - self.hpc3.load_state(&hpc3_toml)?; + let hpc3 = snap.read_state("hpc3", schema_version).map_err(|e| e.to_string())?; + self.hpc3.load_state(&hpc3)?; - // REX3 if let Some(rex3) = &self._phys.rex3 { - let rex3_toml = snap.read_toml("rex3.toml").map_err(|e| e.to_string())?; - rex3.load_state(&rex3_toml)?; - rex3.load_framebuffers(&snap.dir).map_err(|e| e.to_string())?; + let rex3_v = snap.read_state("rex3", schema_version).map_err(|e| e.to_string())?; + rex3.load_state(&rex3_v)?; + // v3+ stores framebuffers in the chunk store; v2 used .bin 
files. + if schema_version < 3 { + rex3.load_framebuffers(&snap.dir).map_err(|e| e.to_string())?; + } + } + + // Bulk memory: v3+ comes from the content-addressable chunk store + // shared across snapshots; v2 reads raw bank{N}.bin files. + if schema_version >= 3 { + let store = ChunkStore::new("saves"); + let chunks = snap.read_chunks_manifest() + .map_err(|e| format!("read chunks.bin: {}", e))?; + for (i, hashes) in chunks.bank_chunks.iter().enumerate() { + if hashes.is_empty() { continue; } + let words = get_chunks_as_words(&store, hashes) + .map_err(|e| format!("CAS bank{} get: {}", i, e))?; + self._phys.restore_bank_inmem(i, &words); + } + if let (Some(rex3), Some((rgb_h, aux_h))) = (&self._phys.rex3, &chunks.framebuffer_chunks) { + let rgb = get_chunks_as_words(&store, rgb_h) + .map_err(|e| format!("CAS rex3 rgb get: {}", e))?; + let aux = get_chunks_as_words(&store, aux_h) + .map_err(|e| format!("CAS rex3 aux get: {}", e))?; + rex3.restore_framebuffers_inmem(&rgb, &aux); + } + } else { + for i in 0..4 { + self._phys.load_bank(i, dir.join(format!("bank{}.bin", i))).map_err(|e| e.to_string())?; + } } - // Bulk memory — 4 × 128MB banks - for i in 0..4 { - self._phys.load_bank(i, dir.join(format!("bank{}.bin", i))).map_err(|e| e.to_string())?; + // COW overlays — best-effort for backward compatibility with + // snapshots saved before overlay capture was added. 
+        if let Ok(cow_toml) = snap.read_toml("cow.toml") {
+            let mut sets: Vec<(usize, Vec<u64>)> = Vec::new();
+            if let Some(tbl) = cow_toml.as_table() {
+                for (k, v) in tbl {
+                    let Some(id_str) = k.strip_prefix("scsi") else { continue };
+                    let Ok(id) = id_str.parse::<usize>() else { continue };
+                    let Some(arr) = v.as_array() else { continue };
+                    let dirty: Vec<u64> = arr.iter()
+                        .filter_map(|x| x.as_integer().map(|i| i as u64))
+                        .collect();
+                    sets.push((id, dirty));
+                }
+            }
+            self.hpc3.scsi().import_overlays(&snap.dir, &sets)
+                .map_err(|e| format!("COW overlay import: {}", e))?;
+        } else {
+            eprintln!("load_snapshot: no cow.toml in snapshot — overlays left unchanged");
         }

         self.restart_peripherals();
-        println!("Snapshot loaded from saves/{}", name);
         Ok(())
     }
 }
diff --git a/src/main.rs b/src/main.rs
index 28721b1..cf6a76d 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -5,6 +5,12 @@ fn main() {
     let (mut cfg, scale) = load_config();
     let headless = cfg.headless;
     let gdb_port = cfg.gdb_port;
+    let ci_enabled = cfg.ci;
+    let ci_display = cfg.ci_display;
+    let ci_socket_path = cfg.ci_socket.clone();
+
+    // CI control socket will be started after Machine::new below (it needs a
+    // pointer into the constructed Machine).

     // Start unfsd before the machine so NFS is ready when IRIX boots.
     // If start_unfsd returns None (directory missing/uncreatable, or binary not found),
@@ -24,6 +30,22 @@ fn main() {
         .unwrap();
     machine.register_system_controller();

+    // CI control socket: started after Machine::new so it can hand out the
+    // machine pointer + CiSerialBackend to command handlers.
+    #[cfg(unix)]
+    let _ci_server = if ci_enabled {
+        let mptr: *mut iris::machine::Machine = &mut *machine;
+        match iris::ci::start_server(mptr, &ci_socket_path) {
+            Ok(s) => Some(s),
+            Err(e) => {
+                eprintln!("iris: failed to start CI server: {}", e);
+                std::process::exit(1);
+            }
+        }
+    } else {
+        None
+    };
+
     // DIAG: optionally enable verbose logging from startup via IRIS_DEBUG_LOG. 
// IRIS_DEBUG_LOG="mc,mips" enables those modules. "all" enables everything. // Output is broadcast to a stderr sink so jit-diag.sh's tee captures it inline. @@ -57,14 +79,22 @@ fn main() { } machine.start(); - std::thread::spawn(|| { - Machine::run_console_client(); - }); - - if headless { - // Headless mode: no window, no graphics, no audio. - // Park the main thread and let the machine run until killed. - eprintln!("iris: running headless (no window)"); + if !ci_enabled { + std::thread::spawn(|| { + Machine::run_console_client(); + }); + } + + let show_window = !headless && !(ci_enabled && !ci_display); + if !show_window { + if headless { + eprintln!("iris: running headless (no REX3, no window)"); + } else if ci_enabled { + eprintln!("iris: --ci mode (REX3 rendering to offscreen buffer, no window)"); + } + // Park the main thread so background threads (CPU, REX3 refresh, + // CI socket) keep running. `quit` via the CI socket calls + // std::process::exit. std::thread::park(); } else { use iris::ui::Ui; diff --git a/src/mc.rs b/src/mc.rs index fa67513..a641f67 100644 --- a/src/mc.rs +++ b/src/mc.rs @@ -1332,4 +1332,48 @@ mod tests { // Read back via BusDevice { let _r = mc.read32(MC_BASE + REG_CPUCTRL0); assert!(_r.is_ok(), "Failed to read CPUCTRL0"); assert_eq!(_r.data, val); } } + + /// Phase 1.7 round-trip: a fresh MC loaded from a captured save_state must + /// re-serialize byte-identically. Catches load_state forgetting a field + /// (regs, semaphores, GIO DMA registers) that save_state writes. + #[test] + fn save_load_round_trip() { + let eeprom = Arc::new(Mutex::new(Eeprom93c56::new())); + let src = MemoryController::new(eeprom.clone(), true, [128, 128, 0, 0]); + + // Mutate registers and DMA state so we're not testing all-default state. 
+        let _ = src.write32(MC_BASE + REG_CPUCTRL0, 0xdead_beef);
+        {
+            let mut s = src.state.lock();
+            s.sys_semaphore = true;
+            s.user_semaphores[0] = true;
+            s.user_semaphores[7] = true;
+            s.user_semaphores[15] = true;
+        }
+        {
+            let mut d = src.giodma.state.lock();
+            d.gio_mask = 0x0000_00ff;
+            d.gio_sub = 0x0000_0001;
+            d.cause = 0x0000_0010;
+            d.ctl = 0x4000_0000;
+            d.memadr = 0x0800_0000;
+            d.size = 0x0000_1000;
+            d.stride = 0x0000_0040;
+            d.gio_adr = 0x1f00_0000;
+            d.mode = 0x0000_0007;
+            d.count = 0x0000_0040;
+            d.run = 0x0000_0001;
+            d.stdma = 0x0000_0002;
+            d.tlb_hi[0] = 0xa5a5_a5a5;
+            d.tlb_lo[1] = 0x5a5a_5a5a;
+            d.run_real = true;
+        }
+        let v1 = src.save_state();
+
+        let dst = MemoryController::new(eeprom, true, [128, 128, 0, 0]);
+        dst.load_state(&v1).expect("load_state");
+        let v2 = dst.save_state();
+
+        assert_eq!(v1, v2, "MemoryController save_state mismatch after load_state round-trip");
+    }
 }
diff --git a/src/mem.rs b/src/mem.rs
index 07cef6d..5a1d659 100644
--- a/src/mem.rs
+++ b/src/mem.rs
@@ -82,6 +82,24 @@ impl Memory {
         }
         Ok(())
     }
+
+    /// Clone the bank's word buffer in native endian. Used by the in-memory
+    /// rollback checkpoint to capture state without touching disk. Caller
+    /// should ensure the CPU/peripheral threads are stopped to avoid a
+    /// torn read.
+    pub fn snapshot_words(&self) -> Vec<u32> {
+        let data = unsafe { self.data() };
+        data.to_vec()
+    }
+
+    /// Overwrite the bank's word buffer from `src`. Length is clamped to the
+    /// bank's word count; extra source words are dropped, missing tail words
+    /// are left untouched. Pair with `snapshot_words` for rollback. 
+    pub fn restore_words(&self, src: &[u32]) {
+        let data = unsafe { self.data() };
+        let n = src.len().min(data.len());
+        data[..n].copy_from_slice(&src[..n]);
+    }
 }

 impl Resettable for Memory {
@@ -321,3 +339,50 @@ impl BusDevice for UnmappedRam {
     fn read64(&self, _addr: u32) -> BusRead64 { BusRead64::ok(0) }
     fn write64(&self, _addr: u32, _v: u64) -> u32 { BUS_OK }
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::traits::BusDevice;
+
+    #[test]
+    fn snapshot_and_restore_words_roundtrip() {
+        let m = Memory::new(1); // 1 MB = 256 K words
+        // Seed unique data via the bus interface so storage layout matches
+        // production access patterns.
+        for i in 0..32 {
+            let addr = (i * 4) as u32;
+            m.write32(addr, 0xDEAD0000 + i as u32);
+        }
+        let snap = m.snapshot_words();
+        // Mutate in place.
+        for i in 0..32 {
+            let addr = (i * 4) as u32;
+            m.write32(addr, 0xCAFEBABE);
+        }
+        // Restore from snapshot, verify original data returns.
+        m.restore_words(&snap);
+        for i in 0..32 {
+            let addr = (i * 4) as u32;
+            let v = m.read32(addr).data;
+            assert_eq!(v, 0xDEAD0000 + i as u32, "word {} mismatch after restore", i);
+        }
+    }
+
+    #[test]
+    fn restore_words_clamps_to_bank_size() {
+        let m = Memory::new(1);
+        let words = m.snapshot_words();
+        let bank_words = words.len();
+        // Source buffer larger than bank — should not panic, should not write
+        // past end.
+        let mut larger = vec![0xAAAAAAAAu32; bank_words + 100];
+        for (i, w) in larger.iter_mut().enumerate().take(bank_words + 100) {
+            *w = (i as u32).wrapping_mul(7);
+        }
+        m.restore_words(&larger);
+        // Spot-check a word inside the bank.
+        let v = m.read32(0).data;
+        assert_eq!(v, 0);
+    }
+}
diff --git a/src/mips_core.rs b/src/mips_core.rs
index 6225efb..c2f5054 100644
--- a/src/mips_core.rs
+++ b/src/mips_core.rs
@@ -80,9 +80,13 @@ pub struct MipsCore {
     /// Shared with the display refresh thread for status bar display.
     pub count_step_atomic: Arc<AtomicU64>,
     /// Cycle count when cp0_compare was last written (0 = never written yet). 
- compare_last_cycles: u64, - /// Wall-clock instant when cp0_compare was last written. - compare_last_instant: std::time::Instant, + /// `pub(crate)` so snapshot load in `mips_exec.rs` can re-anchor the + /// calibration after restoring CP0 fields. + pub(crate) compare_last_cycles: u64, + /// Wall-clock instant when cp0_compare was last written. Reset to + /// `Instant::now()` on snapshot load — Instants from a previous run are + /// meaningless across a restore. + pub(crate) compare_last_instant: std::time::Instant, /// Frequency map of CP0 Compare delta values (hardware counts, rounded to nearest 100). /// Key = `(delta >> 16) / 100 * 100`, value = number of occurrences. #[cfg(feature = "developer_ip7")] @@ -542,10 +546,23 @@ impl MipsCore { // Formula: count_step = delta * dt_ns / (dc * 1_000_000) // delta = count units to next compare (what the kernel programmed) // dc = instructions executed in last interval - // dt_ns = wall-clock ns elapsed in last interval - // = (count units per instruction) * (wall-clock stretch factor) + // dt_ns = ns elapsed in last interval + // = (count units per instruction) * (rate-stretch factor) // Only calibrate for ~1ms timer intervals (IRIX 1000 Hz scheduler); // leave count_step unchanged for other timer uses (one-shot, low-freq). + // + // Two clock sources, gated by `ci_clock`: + // default (interactive desktop): dt_ns from host `Instant::now()`, + // so the guest timer tracks real wall-clock. Sensitive to host + // scheduling jitter, but that's the price of a real-time desktop. + // --features ci_clock: dt_ns = dc * 10 ns (synthetic R4400 ~100 MIPS). + // Decouples guest-perceived time from host scheduling so the Phase + // 3.3 snapshot determinism validator passes at any N. Tradeoff: a + // CI run that takes 5 host minutes may present as 30 guest minutes + // depending on host MIPS — exactly what reproducible CI wants. 
+ #[cfg(feature = "ci_clock")] + const NS_PER_GUEST_CYCLE: u64 = 10; + #[cfg(not(feature = "ci_clock"))] let now = std::time::Instant::now(); let cycles_now = self.local_cycles; // Compute new_delta before the calibration block so we can guard on it. @@ -554,9 +571,13 @@ impl MipsCore { let new_delta = self.cp0_compare.wrapping_sub(self.cp0_count); if new_delta >> 63 != 0 { self.compare_last_cycles = cycles_now; - self.compare_last_instant = now; + #[cfg(not(feature = "ci_clock"))] + { self.compare_last_instant = now; } } else if self.compare_last_cycles != 0 { let dc = cycles_now.wrapping_sub(self.compare_last_cycles); + #[cfg(feature = "ci_clock")] + let dt_ns = dc.saturating_mul(NS_PER_GUEST_CYCLE); + #[cfg(not(feature = "ci_clock"))] let dt_ns = now.duration_since(self.compare_last_instant).as_nanos() as u64; // new_delta: what the *next* interval will fire at, stored as 32.32 fp. #[cfg(feature = "developer_ip7")] @@ -598,7 +619,8 @@ impl MipsCore { } // First write: keep default count_step (1<<15), just record state. self.compare_last_cycles = cycles_now; - self.compare_last_instant = now; + #[cfg(not(feature = "ci_clock"))] + { self.compare_last_instant = now; } } 12 => { let old = self.cp0_status; diff --git a/src/mips_exec.rs b/src/mips_exec.rs index 6baaf57..a3674ff 100644 --- a/src/mips_exec.rs +++ b/src/mips_exec.rs @@ -4820,6 +4820,106 @@ impl MipsCpu { fn try_lock_executor(&self) -> Result>, String> { self.executor.try_lock().ok_or_else(|| "CPU thread holds the executor lock; try 'cpu stop' first".to_string()) } + + /// Step the executor `n` times in-line on the calling thread. Caller must + /// have stopped the runtime CPU thread first (otherwise we deadlock on + /// the executor mutex). Returns the number of steps actually executed — + /// will be `< n` only if the CPU stops itself (e.g. soft-reset). + /// + /// Used by Phase 3.3 snapshot determinism validator. 
Single-threaded,
+    /// no thread scheduling jitter, so two runs from identical state should
+    /// reach identical state after the same number of steps.
+    pub fn step_n_inline(&self, n: u64) -> Result<u64, String> {
+        let mut exec = self.try_lock_executor()?;
+        let mut executed = 0u64;
+        for _ in 0..n {
+            let _status = exec.step();
+            executed += 1;
+            // Don't break on exceptions — they're part of normal CPU
+            // operation and a deterministic run should re-enter and continue.
+        }
+        exec.flush_cycles();
+        Ok(executed)
+    }
+
+    /// Snapshot the deterministic-from-state CPU registers. Excludes host
+    /// wallclock anchors like `compare_last_instant` (they're meaningless
+    /// across runs) but includes their calibrated equivalents (count_step,
+    /// compare_delta_*).
+    pub fn state_digest(&self) -> Result<CpuStateDigest, String> {
+        let exec = self.try_lock_executor()?;
+        let c = &exec.core;
+        Ok(CpuStateDigest {
+            gpr: c.gpr,
+            pc: c.pc,
+            hi: c.hi,
+            lo: c.lo,
+            cp0_count: c.cp0_count,
+            cp0_compare: c.cp0_compare,
+            cp0_status: c.cp0_status,
+            cp0_cause: c.cp0_cause,
+            cp0_epc: c.cp0_epc,
+            cp0_badvaddr: c.cp0_badvaddr,
+            cp0_entryhi: c.cp0_entryhi,
+            count_step: c.count_step,
+            in_delay_slot: exec.in_delay_slot,
+        })
+    }
+}
+
+/// Deterministic-from-state CPU register snapshot. Excludes host wallclock
+/// anchors so two runs from the same starting state can be diffed cleanly.
+/// `local_cycles` is intentionally not included — it's a runtime perf counter
+/// that's not part of save_state and stays stale across `load_snapshot`.
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub struct CpuStateDigest {
+    pub gpr: [u64; 32],
+    pub pc: u64,
+    pub hi: u64,
+    pub lo: u64,
+    pub cp0_count: u64,
+    pub cp0_compare: u64,
+    pub cp0_status: u32,
+    pub cp0_cause: u32,
+    pub cp0_epc: u64,
+    pub cp0_badvaddr: u64,
+    pub cp0_entryhi: u64,
+    pub count_step: u64,
+    pub in_delay_slot: bool,
+}
+
+impl CpuStateDigest {
+    /// Return a list of (field_name, lhs_repr, rhs_repr) for every field that 
Empty if states are bit-identical. For arrays, only diverging + /// indices are reported. + pub fn diff(&self, other: &CpuStateDigest) -> Vec<(String, String, String)> { + let mut out = Vec::new(); + for (i, (a, b)) in self.gpr.iter().zip(other.gpr.iter()).enumerate() { + if a != b { + out.push((format!("gpr[{}]", i), format!("0x{:016x}", a), format!("0x{:016x}", b))); + } + } + macro_rules! cmp { + ($name:ident, $fmt:expr) => { + if self.$name != other.$name { + out.push((stringify!($name).to_string(), format!($fmt, self.$name), format!($fmt, other.$name))); + } + }; + } + cmp!(pc, "0x{:016x}"); + cmp!(hi, "0x{:016x}"); + cmp!(lo, "0x{:016x}"); + cmp!(cp0_count, "0x{:016x}"); + cmp!(cp0_compare, "0x{:016x}"); + cmp!(cp0_status, "0x{:08x}"); + cmp!(cp0_cause, "0x{:08x}"); + cmp!(cp0_epc, "0x{:016x}"); + cmp!(cp0_badvaddr, "0x{:016x}"); + cmp!(cp0_entryhi, "0x{:016x}"); + cmp!(count_step, "{}"); + cmp!(in_delay_slot, "{}"); + out + } } fn is_call_instruction(instr: u32) -> bool { @@ -5951,7 +6051,16 @@ impl Saveable for MipsCp ($f:ident) => { cp0.insert(stringify!($f).into(), hex_u64(c.$f)); } } cp0u32!(cp0_index); cp0u32!(cp0_random); cp0u32!(cp0_wired); - cp0u64!(cp0_count); cp0u64!(cp0_compare); cp0u32!(cp0_status); cp0u32!(cp0_cause); + cp0u64!(cp0_count); cp0u64!(cp0_compare); + // Timer calibration state. Without these, restore loses the kernel's + // learned tick rate and runs at the default count_step until IRIX + // touches Compare again — guest scheduler drifts noticeably for the + // first few seconds after every restore. compare_last_cycles and + // compare_last_instant are intentionally not saved: they're host-wall + // anchors, not calibrated state, and must be reset on load. 
+ cp0u64!(count_step); cp0u64!(compare_delta_prev); + cp0u64!(compare_delta_slow); cp0u64!(compare_delta_fast); + cp0u32!(cp0_status); cp0u32!(cp0_cause); cp0u32!(cp0_prid); cp0u32!(cp0_config); cp0u32!(cp0_lladdr); cp0u32!(cp0_watchlo); cp0u32!(cp0_watchhi); cp0u32!(cp0_ecc); cp0u32!(cp0_cacheerr); cp0u32!(cp0_taglo); cp0u32!(cp0_taghi); @@ -6005,6 +6114,19 @@ impl Saveable for MipsCp }} ld32!(cp0_index); ld32!(cp0_random); ld32!(cp0_wired); ld64!(cp0_count); ld64!(cp0_compare); + ld64!(count_step); ld64!(compare_delta_prev); + ld64!(compare_delta_slow); ld64!(compare_delta_fast); + // Mirror count_step into its atomic shadow (read by the display + // thread) so the live UI matches the restored core state. + c.count_step_atomic.store(c.count_step, std::sync::atomic::Ordering::Relaxed); + // Re-anchor the host-wall calibration timer. Setting cycles to 0 + // forces the next CP0 Compare write to take the "first write" + // path (no calibration), which is what we want — dt_ns measured + // against an Instant from the previous run would be garbage. The + // saved count_step keeps the rate steady until calibration catches + // up over the next few Compare writes. + c.compare_last_cycles = 0; + c.compare_last_instant = std::time::Instant::now(); ld32!(cp0_status); ld32!(cp0_cause); ld32!(cp0_prid); ld32!(cp0_config); ld32!(cp0_lladdr); ld32!(cp0_watchlo); ld32!(cp0_watchhi); ld32!(cp0_ecc); ld32!(cp0_cacheerr); ld32!(cp0_taglo); ld32!(cp0_taghi); diff --git a/src/mips_tlb.rs b/src/mips_tlb.rs index 56ffa2c..7e7c96c 100644 --- a/src/mips_tlb.rs +++ b/src/mips_tlb.rs @@ -630,6 +630,16 @@ impl Tlb for MipsTlb { self.vmap_fill(i); } } + // Reset MRU lists to canonical post-power-on order. Without this, two + // restores of the same snapshot can have different `tlbwr` victims if + // the prior session's MRU history leaks in. 
+ for list in 0..MRU_LISTS { + self.mru_head[list] = 0; + for i in 0..TLB_NUM_ENTRIES - 1 { + self.mru_next[list][i] = (i + 1) as u8; + } + self.mru_next[list][TLB_NUM_ENTRIES - 1] = MRU_NONE; + } Ok(()) } diff --git a/src/mips_tlb_test.rs b/src/mips_tlb_test.rs index 2948d55..6eb7eab 100644 --- a/src/mips_tlb_test.rs +++ b/src/mips_tlb_test.rs @@ -283,4 +283,29 @@ mod tests { let result = tlb.probe(va_wrong_r, asid, true); assert_eq!(result & 0x80000000, 0x80000000, "Expected probe to miss with wrong R field"); } + + /// Phase 1.7 round-trip: a fresh TLB loaded from a captured save_state must + /// re-serialize byte-identically. Catches load_state forgetting a field + /// that save_state writes. + #[test] + fn save_load_round_trip() { + let mut src = MipsTlb::new(TLB_NUM_ENTRIES); + // Write a few entries with varied bit patterns so we're not just + // testing all-zero defaults. + for (slot, vpn2) in [(0usize, 0x100u64), (5, 0x800), (17, 0x4000), (47, 0xffff)].iter().copied() { + let mut e = TlbEntry::new(); + e.page_mask = (slot as u64) << 13; + e.entry_hi = (2u64 << 62) | (vpn2 << 13) | (slot as u64 & 0xff); + e.entry_lo0 = ((slot as u64) << 6) | (3 << 3) | 0x6; + e.entry_lo1 = ((slot as u64 + 1) << 6) | (3 << 3) | 0x6; + src.write(slot, e); + } + let v1 = src.save_state(); + + let mut dst = MipsTlb::new(TLB_NUM_ENTRIES); + dst.load_state(&v1).expect("load_state"); + let v2 = dst.save_state(); + + assert_eq!(v1, v2, "MipsTlb save_state mismatch after load_state round-trip"); + } } diff --git a/src/net.rs b/src/net.rs index 09f31a4..52e980b 100644 --- a/src/net.rs +++ b/src/net.rs @@ -1057,26 +1057,39 @@ impl NatEngine { // ── NFS destination remapping ───────────────────────────────────────────── // - // IRIX talks to 192.168.0.1 on VM-visible NFS/mountd ports. Rewrite the - // destination to 127.0.0.1 on the high host-side ports where unfsd listens. + // Rewrite guest outbound destination to a host-reachable address. 
+    //
+    // IRIX sees the gateway at 192.168.0.1 but that's a virtual address iris
+    // doesn't actually bind to, so unmodified TcpStream::connect() fails. We
+    // rewrite any gateway-destined packet to 127.0.0.1. NFS ports additionally
+    // shift to the high host ports where unfsd listens.
     fn nfs_remap_dst(&self, dst_ip: Ipv4Addr, dport: u16) -> (Ipv4Addr, u16) {
-        let Some(nfs) = &self.config.nfs else { return (dst_ip, dport); };
         if dst_ip != self.config.gateway_ip { return (dst_ip, dport); }
-        match dport {
-            NFS_VM_PORT => (Ipv4Addr::LOCALHOST, nfs.nfs_host_port),
-            MOUNTD_VM_PORT => (Ipv4Addr::LOCALHOST, nfs.mountd_host_port),
-            _ => (dst_ip, dport),
+        if let Some(nfs) = &self.config.nfs {
+            match dport {
+                NFS_VM_PORT => return (Ipv4Addr::LOCALHOST, nfs.nfs_host_port),
+                MOUNTD_VM_PORT => return (Ipv4Addr::LOCALHOST, nfs.mountd_host_port),
+                _ => {}
+            }
         }
+        // Generic outbound: guest→gateway becomes guest→host loopback on
+        // the same port. Lets the guest reach any service the host is
+        // running on 127.0.0.1:<port> (pyftpdlib on 2121, python -m
+        // http.server, etc.).
+        (Ipv4Addr::LOCALHOST, dport)
     }

-    // Reverse: translate (127.0.0.1, host_port) back to (192.168.0.1, vm_port)
-    // so replies to IRIX appear to come from the gateway on the standard NFS ports.
+    // Reverse: translate (127.0.0.1, host_port) back to the address the guest
+    // dialed, so replies look like they came from the gateway. 
fn nfs_unmap_src(&self, src_ip: Ipv4Addr, sport: u16) -> (Ipv4Addr, u16) {
-        let Some(nfs) = &self.config.nfs else { return (src_ip, sport); };
         if src_ip != Ipv4Addr::LOCALHOST { return (src_ip, sport); }
-        if sport == nfs.nfs_host_port { return (self.config.gateway_ip, NFS_VM_PORT); }
-        if sport == nfs.mountd_host_port { return (self.config.gateway_ip, MOUNTD_VM_PORT); }
-        (src_ip, sport)
+        if let Some(nfs) = &self.config.nfs {
+            if sport == nfs.nfs_host_port { return (self.config.gateway_ip, NFS_VM_PORT); }
+            if sport == nfs.mountd_host_port { return (self.config.gateway_ip, MOUNTD_VM_PORT); }
+        }
+        // Generic outbound: reply from host-side dport becomes gateway:dport
+        // to the guest.
+        (self.config.gateway_ip, sport)
     }

     // ── Portmap (port 111) — tiny inline RPC GETPORT responder ───────────────
diff --git a/src/physical.rs b/src/physical.rs
index fcb87fd..0dbc4cf 100644
--- a/src/physical.rs
+++ b/src/physical.rs
@@ -238,6 +238,17 @@ impl Physical {
             bank.power_on();
         }
     }
+
+    /// Snapshot bank `bank` into a native-endian Vec<u32>. Used by the
+    /// in-memory rollback checkpoint to skip the disk byte-shuffle.
+    pub fn snapshot_bank_inmem(&self, bank: usize) -> Vec<u32> {
+        self.banks[bank].snapshot_words()
+    }
+
+    /// Restore bank `bank` from a buffer produced by `snapshot_bank_inmem`.
+    pub fn restore_bank_inmem(&self, bank: usize, src: &[u32]) {
+        self.banks[bank].restore_words(src);
+    }
 }

 impl Physical {
diff --git a/src/pit8254.rs b/src/pit8254.rs
index 24bcfe8..519168c 100644
--- a/src/pit8254.rs
+++ b/src/pit8254.rs
@@ -593,6 +593,25 @@ mod tests {
         assert!(delta < 1000, "count={} expected ~{} (delta={})", count, expected, delta);
     }
+
+    /// Phase 1.7 round-trip: program a few channels with non-default values,
+    /// save, load into a fresh PIT, save again, assert the two save_states are
+    /// byte-identical. Catches load_state forgetting a channel field. 
+    #[test]
+    fn save_load_round_trip() {
+        let _lock = SERIAL.lock().unwrap();
+        let src = make_pit(1_000_000);
+        program_mode2(&src, 0, 0x1234);
+        program_mode2(&src, 1, 0x5678);
+        program_mode2(&src, 2, 0xabcd);
+        let v1 = src.save_state();
+
+        let dst = make_pit(1_000_000);
+        dst.load_state(&v1).expect("load_state");
+        let v2 = dst.save_state();
+
+        assert_eq!(v1, v2, "Pit8254 save_state mismatch after load_state round-trip");
+    }
 }

 impl Saveable for Pit8254 {
diff --git a/src/ps2.rs b/src/ps2.rs
index 908d918..b881e6a 100644
--- a/src/ps2.rs
+++ b/src/ps2.rs
@@ -910,6 +910,42 @@ impl Saveable for Ps2Controller {
     }
 }

+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    /// Phase 1.7 round-trip: a fresh PS/2 controller loaded from a captured
+    /// save_state must re-serialize byte-identically. Mutates rx_queue,
+    /// command_state, and assorted flag/byte fields so the test exercises every
+    /// branch of load_state.
+    #[test]
+    fn save_load_round_trip() {
+        let src = Ps2Controller::new(None);
+        {
+            let mut s = src.state.lock();
+            s.rx_queue.push_back((0xfa, Ps2Source::Keyboard));
+            s.rx_queue.push_back((0x42, Ps2Source::Mouse));
+            s.rx_queue.push_back((0xee, Ps2Source::MouseCmd));
+            s.mouse_queue_bytes = 1;
+            s.next_write_is_mouse = true;
+            s.led_state = 0x07;
+            s.scancode_set = 1;
+            s.config = 0x65;
+            s.command_state = CommandState::SetTypematic;
+            s.scanning_enabled = true;
+            s.mouse_enabled = true;
+            s.last_read = 0x12;
+        }
+        let v1 = src.save_state();
+
+        let dst = Ps2Controller::new(None);
+        dst.load_state(&v1).expect("load_state");
+        let v2 = dst.save_state();
+
+        assert_eq!(v1, v2, "Ps2Controller save_state mismatch after load_state round-trip");
+    }
+}
+
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 #[repr(u16)]
 pub enum ScancodeSet1 {
diff --git a/src/registry.rs b/src/registry.rs
new file mode 100644
index 0000000..1df270b
--- /dev/null
+++ b/src/registry.rs
@@ -0,0 +1,526 @@
+//! Phase 3.4: HTTP snapshot registry.
+//!
+//!
+//! Pull/push iris snapshots between machines, Docker-layer-style. The CAS
+//! chunk store from Phase 3.1 makes this nearly free at the wire level: only
+//! chunks that the receiving side doesn't already have are transferred.
+//!
+//! ## URL layout
+//!
+//! Mirrors the on-disk layout, so any static file server pointing at
+//! `saves/` (e.g. `python3 -m http.server` running in your saves directory)
+//! works as a read-only pull source. Push needs a server that accepts PUT.
+//!
+//! ```text
+//! GET /snapshots/<name>/snapshot.toml   ← manifest, schema_version
+//! GET /snapshots/<name>/cpu.bin         ← v2+ device state
+//! GET /snapshots/<name>/chunks.bin      ← v3+ CAS hash list
+//! GET /snapshots/<name>/cow.toml        ← per-SCSI dirty sectors
+//! GET /snapshots/<name>/scsi1.overlay   ← per-SCSI overlay bytes
+//! GET /cas/<hh>/<rest-of-hash>.chunk    ← content-addressed RAM chunk
+//! ```
+//!
+//! ## Wire format
+//!
+//! Hand-rolled HTTP/1.1 over `std::net::TcpStream` — no new dependency. HTTP
+//! only (no TLS); use behind a tunnel or trusted network. Single-request,
+//! single-connection — no keep-alive. Plenty for snapshot transfers because
+//! the per-request overhead is dwarfed by the chunk payload.
+//!
+//! ## Commit ordering
+//!
+//! Push uploads chunks first, then `snapshot.toml` LAST. An interrupted push
+//! leaves orphan chunks (which `gc` on the server side will sweep) but never
+//! a half-published snapshot manifest pointing at missing chunks. Pull
+//! validates `chunks.bin` against fetched chunks at the end so a torn pull
+//! is detectable.

+use std::io::{Read, Write};
+use std::net::{TcpStream, ToSocketAddrs};
+use std::path::{Path, PathBuf};
+use std::time::Duration;
+
+use crate::chunk_store::{ChunkHash, ChunkStore};
+use crate::snapshot::{ChunksManifest, Snapshot};
+
+const HTTP_TIMEOUT: Duration = Duration::from_secs(60);
+
+/// Outcome of a `pull` or `push` operation. JSON-serializable for the CI socket.
+#[derive(Debug, Clone, Default)]
+pub struct TransferReport {
+    pub chunks_fetched: u64,
+    pub chunks_skipped: u64,
+    pub bytes_transferred: u64,
+    pub files_transferred: u64,
+}
+
+/// Pull a snapshot from `base_url` into the local `saves_dir`. Idempotent —
+/// chunks already in the local store are not re-downloaded. Returns a
+/// transfer report.
+pub fn pull(base_url: &str, name: &str, saves_dir: &Path) -> Result<TransferReport, String> {
+    if name.is_empty() || name.contains("..") {
+        return Err("pull: invalid snapshot name".into());
+    }
+    let base = base_url.trim_end_matches('/');
+    let mut report = TransferReport::default();
+
+    let snap_dir = saves_dir.join(name);
+    std::fs::create_dir_all(&snap_dir).map_err(|e| format!("create {}: {}", snap_dir.display(), e))?;
+
+    // 1. Manifest first — tells us schema_version, which gates which other
+    //    files exist.
+    let manifest_url = format!("{}/snapshots/{}/snapshot.toml", base, name);
+    let manifest_bytes = http_get(&manifest_url).map_err(|e| format!("fetch manifest: {}", e))?;
+    std::fs::write(snap_dir.join("snapshot.toml"), &manifest_bytes)
+        .map_err(|e| format!("write snapshot.toml: {}", e))?;
+    report.files_transferred += 1;
+    report.bytes_transferred += manifest_bytes.len() as u64;
+
+    let snap_local = Snapshot::new(&snap_dir);
+    let manifest = snap_local
+        .read_manifest()
+        .map_err(|e| format!("parse manifest: {}", e))?
+        .ok_or_else(|| "manifest missing or unparseable".to_string())?;
+    let sv = manifest.schema_version;
+
+    // 2. Per-device state. v2+ uses .bin (postcard); v1 used .toml. v0 has no
+    //    manifest so we don't get here.
+    let device_bases = [
+        "cpu", "mc", "ioc", "scc", "pit", "ps2", "rtc",
+        "eeprom", "scsi", "seeq", "hpc3", "rex3",
+    ];
+    let suffix = if sv >= 2 { "bin" } else { "toml" };
+    for base_name in device_bases {
+        let url = format!("{}/snapshots/{}/{}.{}", base, name, base_name, suffix);
+        match http_get(&url) {
+            Ok(bytes) => {
+                std::fs::write(snap_dir.join(format!("{}.{}", base_name, suffix)), &bytes)
+                    .map_err(|e| format!("write {}.{}: {}", base_name, suffix, e))?;
+                report.files_transferred += 1;
+                report.bytes_transferred += bytes.len() as u64;
+            }
+            Err(e) if e.contains("404") => {
+                // rex3 is optional (headless configs skip it); other devices
+                // could in principle be absent in a future config.
+                continue;
+            }
+            Err(e) => return Err(format!("fetch {}.{}: {}", base_name, suffix, e)),
+        }
+    }
+
+    // 3. cow.toml (overlay dirty sector lists) and the scsi*.overlay files.
+    if let Ok(bytes) = http_get(&format!("{}/snapshots/{}/cow.toml", base, name)) {
+        std::fs::write(snap_dir.join("cow.toml"), &bytes)
+            .map_err(|e| format!("write cow.toml: {}", e))?;
+        report.files_transferred += 1;
+        report.bytes_transferred += bytes.len() as u64;
+
+        if let Ok(text) = std::str::from_utf8(&bytes) {
+            if let Ok(toml::Value::Table(t)) = text.parse::<toml::Value>() {
+                for (key, _) in t {
+                    if let Some(id_str) = key.strip_prefix("scsi") {
+                        if id_str.parse::<u8>().is_ok() {
+                            let fname = format!("{}.overlay", key);
+                            let url = format!("{}/snapshots/{}/{}", base, name, fname);
+                            if let Ok(b) = http_get(&url) {
+                                std::fs::write(snap_dir.join(&fname), &b)
+                                    .map_err(|e| format!("write {}: {}", fname, e))?;
+                                report.files_transferred += 1;
+                                report.bytes_transferred += b.len() as u64;
+                            }
+                        }
+                    }
+                }
+            }
+        }
+    }
+
+    // 4. v3+: fetch chunks.bin, then any chunk hashes the local store doesn't
+    //    already have.
+    if sv >= 3 {
+        let chunks_url = format!("{}/snapshots/{}/chunks.bin", base, name);
+        let chunks_bytes = http_get(&chunks_url).map_err(|e| format!("fetch chunks.bin: {}", e))?;
+        let chunks: ChunksManifest = postcard::from_bytes(&chunks_bytes)
+            .map_err(|e| format!("parse chunks.bin: {}", e))?;
+        std::fs::write(snap_dir.join("chunks.bin"), &chunks_bytes)
+            .map_err(|e| format!("write chunks.bin: {}", e))?;
+        report.files_transferred += 1;
+        report.bytes_transferred += chunks_bytes.len() as u64;
+
+        let store = ChunkStore::new(saves_dir);
+        let mut seen: std::collections::HashSet<ChunkHash> = std::collections::HashSet::new();
+        for hash in chunks.referenced_hashes() {
+            if !seen.insert(*hash) {
+                continue;
+            }
+            if store.has(hash) {
+                report.chunks_skipped += 1;
+                continue;
+            }
+            let url = format!("{}/cas/{}/{}.chunk", base, hex2_of(hash), hex62_of(hash));
+            let bytes = http_get(&url).map_err(|e| format!("fetch chunk {}: {}", hex_of(hash), e))?;
+            // Validate the server gave us the right content.
+            let actual: ChunkHash = blake3::hash(&bytes).into();
+            if &actual != hash {
+                return Err(format!(
+                    "chunk hash mismatch for {}: got {}",
+                    hex_of(hash),
+                    hex_of(&actual)
+                ));
+            }
+            store
+                .put(&bytes)
+                .map_err(|e| format!("store chunk {}: {}", hex_of(hash), e))?;
+            report.chunks_fetched += 1;
+            report.bytes_transferred += bytes.len() as u64;
+        }
+    }
+
+    Ok(report)
+}
+
+/// Push a local snapshot to `base_url`. Uploads only chunks the server
+/// doesn't already have. Manifest goes LAST so an interrupted push never
+/// leaves a half-committed snapshot.
+pub fn push(base_url: &str, name: &str, saves_dir: &Path) -> Result<TransferReport, String> {
+    if name.is_empty() || name.contains("..") {
+        return Err("push: invalid snapshot name".into());
+    }
+    let base = base_url.trim_end_matches('/');
+    let mut report = TransferReport::default();
+
+    let snap_dir = saves_dir.join(name);
+    if !snap_dir.is_dir() {
+        return Err(format!("push: snapshot '{}' not found", name));
+    }
+    let snap_local = Snapshot::new(&snap_dir);
+    let manifest = snap_local
+        .read_manifest()
+        .map_err(|e| format!("read manifest: {}", e))?
+        .ok_or_else(|| "manifest missing — only v1+ snapshots can be pushed".to_string())?;
+    let sv = manifest.schema_version;
+
+    // 1. Chunks first (v3+). Manifest goes last so the snapshot only becomes
+    //    visible to pullers once all its chunks are in place.
+    if sv >= 3 {
+        let chunks_path = snap_dir.join("chunks.bin");
+        let chunks_bytes = std::fs::read(&chunks_path).map_err(|e| format!("read chunks.bin: {}", e))?;
+        let chunks: ChunksManifest = postcard::from_bytes(&chunks_bytes)
+            .map_err(|e| format!("parse chunks.bin: {}", e))?;
+        let store = ChunkStore::new(saves_dir);
+        let mut seen: std::collections::HashSet<ChunkHash> = std::collections::HashSet::new();
+        for hash in chunks.referenced_hashes() {
+            if !seen.insert(*hash) {
+                continue;
+            }
+            let url = format!("{}/cas/{}/{}.chunk", base, hex2_of(hash), hex62_of(hash));
+            if http_head(&url).unwrap_or(false) {
+                report.chunks_skipped += 1;
+                continue;
+            }
+            let bytes = store
+                .get(hash)
+                .map_err(|e| format!("read chunk {}: {}", hex_of(hash), e))?;
+            http_put(&url, &bytes).map_err(|e| format!("PUT chunk {}: {}", hex_of(hash), e))?;
+            report.chunks_fetched += 1;
+            report.bytes_transferred += bytes.len() as u64;
+        }
+    }
+
+    // 2. Per-device state.
+    let device_bases = [
+        "cpu", "mc", "ioc", "scc", "pit", "ps2", "rtc",
+        "eeprom", "scsi", "seeq", "hpc3", "rex3",
+    ];
+    let suffix = if sv >= 2 { "bin" } else { "toml" };
+    for base_name in device_bases {
+        let p = snap_dir.join(format!("{}.{}", base_name, suffix));
+        if !p.exists() {
+            continue; // rex3 may legitimately be absent
+        }
+        let bytes = std::fs::read(&p).map_err(|e| format!("read {}: {}", p.display(), e))?;
+        let url = format!("{}/snapshots/{}/{}.{}", base, name, base_name, suffix);
+        http_put(&url, &bytes).map_err(|e| format!("PUT {}.{}: {}", base_name, suffix, e))?;
+        report.files_transferred += 1;
+        report.bytes_transferred += bytes.len() as u64;
+    }
+
+    // 3. cow.toml + scsi*.overlay (each overlay file is a sector-image, can be MB).
+    let cow_path = snap_dir.join("cow.toml");
+    if cow_path.exists() {
+        let cow_bytes = std::fs::read(&cow_path).map_err(|e| format!("read cow.toml: {}", e))?;
+        // Push overlay binaries first, cow.toml last (the index that lists them).
+        if let Ok(text) = std::str::from_utf8(&cow_bytes) {
+            if let Ok(toml::Value::Table(t)) = text.parse::<toml::Value>() {
+                for (key, _) in t {
+                    if let Some(id_str) = key.strip_prefix("scsi") {
+                        if id_str.parse::<u8>().is_ok() {
+                            let fname = format!("{}.overlay", key);
+                            let p = snap_dir.join(&fname);
+                            if p.exists() {
+                                let bytes = std::fs::read(&p)
+                                    .map_err(|e| format!("read {}: {}", fname, e))?;
+                                let url = format!("{}/snapshots/{}/{}", base, name, fname);
+                                http_put(&url, &bytes)
+                                    .map_err(|e| format!("PUT {}: {}", fname, e))?;
+                                report.files_transferred += 1;
+                                report.bytes_transferred += bytes.len() as u64;
+                            }
+                        }
+                    }
+                }
+            }
+        }
+        let url = format!("{}/snapshots/{}/cow.toml", base, name);
+        http_put(&url, &cow_bytes).map_err(|e| format!("PUT cow.toml: {}", e))?;
+        report.files_transferred += 1;
+        report.bytes_transferred += cow_bytes.len() as u64;
+    }
+
+    // 4. chunks.bin (v3+) — uploaded BEFORE manifest because pullers fetch
+    //    it after manifest and would otherwise race a concurrent push.
+    if sv >= 3 {
+        let chunks_bytes = std::fs::read(snap_dir.join("chunks.bin"))
+            .map_err(|e| format!("read chunks.bin: {}", e))?;
+        let url = format!("{}/snapshots/{}/chunks.bin", base, name);
+        http_put(&url, &chunks_bytes).map_err(|e| format!("PUT chunks.bin: {}", e))?;
+        report.files_transferred += 1;
+        report.bytes_transferred += chunks_bytes.len() as u64;
+    }
+
+    // 5. Manifest LAST (commit point).
+    let manifest_bytes = std::fs::read(snap_dir.join("snapshot.toml"))
+        .map_err(|e| format!("read snapshot.toml: {}", e))?;
+    let url = format!("{}/snapshots/{}/snapshot.toml", base, name);
+    http_put(&url, &manifest_bytes).map_err(|e| format!("PUT snapshot.toml: {}", e))?;
+    report.files_transferred += 1;
+    report.bytes_transferred += manifest_bytes.len() as u64;
+
+    Ok(report)
+}
+
+// ---- minimal HTTP/1.1 client over std::net ----
+
+struct ParsedUrl {
+    host: String,
+    port: u16,
+    path: String,
+}
+
+fn parse_url(url: &str) -> Result<ParsedUrl, String> {
+    let rest = url
+        .strip_prefix("http://")
+        .ok_or_else(|| format!("only http:// URLs supported, got {}", url))?;
+    let (host_port, path) = match rest.find('/') {
+        Some(i) => (&rest[..i], &rest[i..]),
+        None => (rest, "/"),
+    };
+    let (host, port) = match host_port.rsplit_once(':') {
+        Some((h, p)) => (h.to_string(), p.parse().map_err(|e| format!("bad port: {}", e))?),
+        None => (host_port.to_string(), 80u16),
+    };
+    Ok(ParsedUrl {
+        host,
+        port,
+        path: path.to_string(),
+    })
+}
+
+fn http_send(method: &str, url: &str, body: Option<&[u8]>) -> Result<(u16, Vec<u8>), String> {
+    let p = parse_url(url)?;
+    let addr = (p.host.as_str(), p.port)
+        .to_socket_addrs()
+        .map_err(|e| format!("resolve {}: {}", p.host, e))?
+        .next()
+        .ok_or_else(|| format!("no addresses for {}", p.host))?;
+    let mut s = TcpStream::connect_timeout(&addr, HTTP_TIMEOUT).map_err(|e| format!("connect: {}", e))?;
+    s.set_read_timeout(Some(HTTP_TIMEOUT)).ok();
+    s.set_write_timeout(Some(HTTP_TIMEOUT)).ok();
+
+    let mut req = Vec::with_capacity(256);
+    write!(req, "{} {} HTTP/1.1\r\nHost: {}:{}\r\nConnection: close\r\n",
+           method, p.path, p.host, p.port).map_err(|e| e.to_string())?;
+    if let Some(b) = body {
+        write!(req, "Content-Length: {}\r\nContent-Type: application/octet-stream\r\n", b.len())
+            .map_err(|e| e.to_string())?;
+    }
+    req.extend_from_slice(b"\r\n");
+    if let Some(b) = body {
+        req.extend_from_slice(b);
+    }
+    s.write_all(&req).map_err(|e| format!("write request: {}", e))?;
+
+    let mut buf = Vec::new();
+    s.read_to_end(&mut buf).map_err(|e| format!("read response: {}", e))?;
+
+    // Parse status + headers + body.
+    let split = buf
+        .windows(4)
+        .position(|w| w == b"\r\n\r\n")
+        .ok_or_else(|| "malformed response: no header terminator".to_string())?;
+    let header = std::str::from_utf8(&buf[..split])
+        .map_err(|e| format!("non-utf8 header: {}", e))?;
+    let body_start = split + 4;
+    let mut lines = header.lines();
+    let status_line = lines.next().ok_or_else(|| "empty response".to_string())?;
+    let status: u16 = status_line
+        .split_whitespace()
+        .nth(1)
+        .and_then(|s| s.parse().ok())
+        .ok_or_else(|| format!("bad status line: {}", status_line))?;
+
+    // Handle Content-Length and Transfer-Encoding: chunked.
+    let mut content_length: Option<usize> = None;
+    let mut chunked = false;
+    for line in lines {
+        if let Some(v) = line.strip_prefix_ignore_case("Content-Length: ") {
+            content_length = v.trim().parse().ok();
+        }
+        if line.eq_ignore_ascii_case("Transfer-Encoding: chunked") {
+            chunked = true;
+        }
+    }
+    let body = if chunked {
+        decode_chunked(&buf[body_start..])?
+    } else if let Some(n) = content_length {
+        buf[body_start..body_start + n.min(buf.len() - body_start)].to_vec()
+    } else {
+        buf[body_start..].to_vec()
+    };
+    Ok((status, body))
+}
+
+trait StripPrefixIgnoreCase {
+    fn strip_prefix_ignore_case(&self, prefix: &str) -> Option<&str>;
+}
+impl StripPrefixIgnoreCase for str {
+    fn strip_prefix_ignore_case(&self, prefix: &str) -> Option<&str> {
+        if self.len() >= prefix.len() && self[..prefix.len()].eq_ignore_ascii_case(prefix) {
+            Some(&self[prefix.len()..])
+        } else {
+            None
+        }
+    }
+}
+
+fn decode_chunked(data: &[u8]) -> Result<Vec<u8>, String> {
+    let mut out = Vec::with_capacity(data.len());
+    let mut i = 0;
+    while i < data.len() {
+        // Read the size line up to \r\n.
+        let crlf = data[i..]
+            .windows(2)
+            .position(|w| w == b"\r\n")
+            .ok_or_else(|| "chunked: missing size CRLF".to_string())?;
+        let size_str = std::str::from_utf8(&data[i..i + crlf])
+            .map_err(|_| "chunked: non-utf8 size line".to_string())?;
+        let size = usize::from_str_radix(size_str.split(';').next().unwrap_or(size_str).trim(), 16)
+            .map_err(|e| format!("chunked: bad size {}: {}", size_str, e))?;
+        i += crlf + 2;
+        if size == 0 {
+            break;
+        }
+        if i + size > data.len() {
+            return Err("chunked: short chunk data".into());
+        }
+        out.extend_from_slice(&data[i..i + size]);
+        i += size + 2; // skip trailing \r\n
+    }
+    Ok(out)
+}
+
+fn http_get(url: &str) -> Result<Vec<u8>, String> {
+    let (status, body) = http_send("GET", url, None)?;
+    if status == 200 {
+        Ok(body)
+    } else {
+        Err(format!("HTTP {} for {}", status, url))
+    }
+}
+
+fn http_head(url: &str) -> Result<bool, String> {
+    let (status, _) = http_send("HEAD", url, None)?;
+    Ok(status == 200)
+}
+
+fn http_put(url: &str, body: &[u8]) -> Result<(), String> {
+    let (status, resp) = http_send("PUT", url, Some(body))?;
+    if (200..300).contains(&status) {
+        Ok(())
+    } else {
+        Err(format!(
+            "HTTP {} for PUT {}: {}",
+            status,
+            url,
+            String::from_utf8_lossy(&resp)
+        ))
+    }
+}
+
+// ---- hex helpers (mirror chunk_store layout) ----
+
+fn hex_of(h: &ChunkHash) -> String {
+    const HEX: &[u8; 16] = b"0123456789abcdef";
+    let mut s = String::with_capacity(64);
+    for &b in h.iter() {
+        s.push(HEX[(b >> 4) as usize] as char);
+        s.push(HEX[(b & 0x0f) as usize] as char);
+    }
+    s
+}
+fn hex2_of(h: &ChunkHash) -> String {
+    let s = hex_of(h);
+    s[..2].to_string()
+}
+fn hex62_of(h: &ChunkHash) -> String {
+    let s = hex_of(h);
+    s[2..].to_string()
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn parse_url_default_port() {
+        let p = parse_url("http://example.com/foo/bar").unwrap();
+        assert_eq!(p.host, "example.com");
+        assert_eq!(p.port, 80);
+        assert_eq!(p.path, "/foo/bar");
+    }
+
+    #[test]
+    fn parse_url_explicit_port() {
+        let p = parse_url("http://localhost:8080/").unwrap();
+        assert_eq!(p.host, "localhost");
+        assert_eq!(p.port, 8080);
+        assert_eq!(p.path, "/");
+    }
+
+    #[test]
+    fn parse_url_no_path() {
+        let p = parse_url("http://localhost:8080").unwrap();
+        assert_eq!(p.path, "/");
+    }
+
+    #[test]
+    fn parse_url_rejects_https() {
+        assert!(parse_url("https://example.com/").is_err());
+    }
+
+    #[test]
+    fn decode_chunked_happy() {
+        let data = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n";
+        let out = decode_chunked(data).unwrap();
+        assert_eq!(out, b"Wikipedia");
+    }
+
+    #[test]
+    fn hex_round_trip() {
+        let h: ChunkHash = blake3::hash(b"hello").into();
+        let s = hex_of(&h);
+        assert_eq!(s.len(), 64);
+        assert_eq!(hex2_of(&h).len(), 2);
+        assert_eq!(hex62_of(&h).len(), 62);
+        assert_eq!(format!("{}{}", hex2_of(&h), hex62_of(&h)), s);
+    }
+}
diff --git a/src/rex3.rs b/src/rex3.rs
index db1399c..20ea6c8 100644
--- a/src/rex3.rs
+++ b/src/rex3.rs
@@ -5,7 +5,7 @@
 use std::thread;
 use crossbeam_utils::CachePadded;
 use crate::traits::{BusRead8, BusRead16, BusRead32, BusRead64, BUS_OK, BUS_ERR, BusDevice, Device, Resettable, Saveable};
 use crate::devlog::{LogModule, devlog_is_active, devlog};
-use crate::snapshot::{get_field, u32_slice_to_toml, u16_slice_to_toml, load_u32_slice, load_u16_slice, toml_u32, toml_u64, hex_u32, hex_u64};
+use crate::snapshot::{get_field, u32_slice_to_toml, u16_slice_to_toml, u8_slice_to_toml, load_u32_slice, load_u16_slice, load_u8_slice, toml_u32, toml_u64, toml_u8, hex_u32, hex_u64, hex_u8};
 use std::cell::{Cell, UnsafeCell};
 use crate::vc2::Vc2;
 use crate::xmap9::Xmap9;
@@ -3671,6 +3671,28 @@ impl Rex3 {
         Ok(())
     }

+    /// Clone the framebuffers (RGB and aux) into native-endian `Vec<u32>`
+    /// buffers. Pair with `restore_framebuffers_inmem` for the in-memory
+    /// rollback checkpoint; bypasses the byte-shuffle the disk path needs.
+    pub fn snapshot_framebuffers_inmem(&self) -> (Vec<u32>, Vec<u32>) {
+        let rgb = unsafe { &*self.fb_rgb.get() };
+        let aux = unsafe { &*self.fb_aux.get() };
+        (rgb.to_vec(), aux.to_vec())
+    }
+
+    /// Restore framebuffers from buffers captured by
+    /// `snapshot_framebuffers_inmem`. Lengths are clamped to the actual
+    /// framebuffer size.
+    pub fn restore_framebuffers_inmem(&self, rgb: &[u32], aux: &[u32]) {
+        let dst_rgb = unsafe { &mut *self.fb_rgb.get() };
+        let n = rgb.len().min(dst_rgb.len());
+        dst_rgb[..n].copy_from_slice(&rgb[..n]);
+
+        let dst_aux = unsafe { &mut *self.fb_aux.get() };
+        let n = aux.len().min(dst_aux.len());
+        dst_aux[..n].copy_from_slice(&aux[..n]);
+    }
+
     pub fn load_framebuffers(&self, dir: &std::path::Path) -> std::io::Result<()> {
         let path_rgb = dir.join("rex3_rgb.bin");
         if path_rgb.exists() {
@@ -4657,6 +4679,13 @@ impl Saveable for Rex3 {
             tbl.insert("cmap1".into(), save_cmap(&cmap));
         }

+        // Bt445 RAMDAC (palette + registers) — missing this makes every
+        // pixel decode to black after restore.
+        {
+            let dac = self.bt445.lock();
+            tbl.insert("bt445".into(), save_bt445(&dac));
+        }
+
         toml::Value::Table(tbl)
     }
@@ -4691,6 +4720,7 @@ impl Saveable for Rex3 {
         if let Some(xv) = get_field(v, "xmap1") { load_xmap9(&mut self.xmap1.lock(), xv); }
         if let Some(cv) = get_field(v, "cmap0") { load_cmap(&mut self.cmap0.lock(), cv); }
         if let Some(cv) = get_field(v, "cmap1") { load_cmap(&mut self.cmap1.lock(), cv); }
+        if let Some(dv) = get_field(v, "bt445") { load_bt445(&mut self.bt445.lock(), dv); }

         Ok(())
     }
@@ -4732,6 +4762,63 @@ fn load_cmap(cmap: &mut crate::cmap::Cmap, v: &toml::Value) {
     cmap.dirty = true;
 }

+// Bt445 RAMDAC: palette + control registers. Critical for snapshot restore
+// because `power_on` wipes the palette to all-zero, which makes every pixel
+// decode to black after the gamma lookup in disp.rs::refresh.
+fn save_bt445(dac: &crate::bt445::Bt445) -> toml::Value {
+    let flatten = |rgb: &[[u8; 3]]| -> Vec<u8> {
+        let mut v = Vec::with_capacity(rgb.len() * 3);
+        for e in rgb { v.extend_from_slice(e); }
+        v
+    };
+    let mut tbl = toml::map::Map::new();
+    tbl.insert("palette".into(), u8_slice_to_toml(&flatten(&dac.palette)));
+    tbl.insert("overlay".into(), u8_slice_to_toml(&flatten(&dac.overlay)));
+    tbl.insert("cursor_color".into(), u8_slice_to_toml(&flatten(&dac.cursor_color)));
+    tbl.insert("addr".into(), hex_u8(dac.addr));
+    tbl.insert("rgb_counter".into(), hex_u8(dac.rgb_counter));
+    tbl.insert("read_enable".into(), hex_u8(dac.read_enable));
+    tbl.insert("blink_enable".into(), hex_u8(dac.blink_enable));
+    tbl.insert("cmd0".into(), hex_u8(dac.cmd0));
+    tbl.insert("rgb_ctrl".into(), u8_slice_to_toml(&dac.rgb_ctrl));
+    tbl.insert("setup".into(), u8_slice_to_toml(&dac.setup));
+    toml::Value::Table(tbl)
+}
+
+fn load_bt445(dac: &mut crate::bt445::Bt445, v: &toml::Value) {
+    let unflatten = |bytes: &[u8], dest: &mut [[u8; 3]]| {
+        for (i, chunk) in bytes.chunks(3).enumerate() {
+            if i >= dest.len() { break; }
+            if chunk.len() == 3 {
+                dest[i] = [chunk[0], chunk[1], chunk[2]];
+            }
+        }
+    };
+    if let Some(r) = get_field(v, "palette") {
+        let mut buf = vec![0u8; dac.palette.len() * 3];
+        load_u8_slice(r, &mut buf);
+        unflatten(&buf, &mut dac.palette);
+    }
+    if let Some(r) = get_field(v, "overlay") {
+        let mut buf = vec![0u8; dac.overlay.len() * 3];
+        load_u8_slice(r, &mut buf);
+        unflatten(&buf, &mut dac.overlay);
+    }
+    if let Some(r) = get_field(v, "cursor_color") {
+        let mut buf = vec![0u8; dac.cursor_color.len() * 3];
+        load_u8_slice(r, &mut buf);
+        unflatten(&buf, &mut dac.cursor_color);
+    }
+    if let Some(x) = get_field(v, "addr") { if let Some(n) = toml_u8(x) { dac.addr = n; } }
+    if let Some(x) = get_field(v, "rgb_counter") { if let Some(n) = toml_u8(x) { dac.rgb_counter = n; } }
+    if let Some(x) = get_field(v, "read_enable") { if let Some(n) = toml_u8(x) { dac.read_enable = n; } }
+    if let Some(x) = get_field(v, "blink_enable") { if let Some(n) = toml_u8(x) { dac.blink_enable = n; } }
+    if let Some(x) = get_field(v, "cmd0") { if let Some(n) = toml_u8(x) { dac.cmd0 = n; } }
+    if let Some(r) = get_field(v, "rgb_ctrl") { load_u8_slice(r, &mut dac.rgb_ctrl); }
+    if let Some(r) = get_field(v, "setup") { load_u8_slice(r, &mut dac.setup); }
+    dac.dirty = true;
+}
+
 #[cfg(test)]
 #[path = "rex3_tests.rs"]
 mod tests;
diff --git a/src/scsi.rs b/src/scsi.rs
index f92f8ce..a70dbc7 100644
--- a/src/scsi.rs
+++ b/src/scsi.rs
@@ -167,6 +167,24 @@ impl ScsiDevice {
         }
     }

+    /// Copy the COW overlay into `dest` and return its dirty sector set.
+    /// Direct-mode devices return an empty list and create no file.
+    pub fn cow_export(&mut self, dest: &std::path::Path) -> io::Result<Vec<u64>> {
+        match &mut self.backend {
+            DiskBackend::Cow(cow) => cow.export_overlay(dest),
+            DiskBackend::Direct(_) => Ok(Vec::new()),
+        }
+    }
+
+    /// Replace the COW overlay with the contents of `source` and adopt
+    /// `dirty` as the dirty sector set. No-op on direct-mode devices.
+    pub fn cow_import(&mut self, source: &std::path::Path, dirty: Vec<u64>) -> io::Result<()> {
+        match &mut self.backend {
+            DiskBackend::Cow(cow) => cow.import_overlay(source, dirty),
+            DiskBackend::Direct(_) => Ok(()),
+        }
+    }
+
     /// Number of dirty sectors in the COW overlay, or 0 if direct mode.
     pub fn cow_dirty_count(&self) -> usize {
         match &self.backend {
diff --git a/src/seeq8003.rs b/src/seeq8003.rs
index df7d503..84ce9b2 100644
--- a/src/seeq8003.rs
+++ b/src/seeq8003.rs
@@ -805,3 +805,31 @@ impl Saveable for Seeq8003 {
         Ok(())
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    /// Phase 1.7 round-trip: a fresh Seeq loaded from a captured save_state
+    /// must re-serialize byte-identically. Mutates station_addr and the four
+    /// rx/tx command/status registers.
+    #[test]
+    fn save_load_round_trip() {
+        let src = Seeq8003::new(None, None, None, Arc::new(AtomicU64::new(0)));
+        {
+            let mut st = src.state.lock();
+            st.station_addr = [0x08, 0x00, 0x69, 0x12, 0x34, 0x56];
+            st.rx_cmd = 0x18;
+            st.rx_stat = 0xa1;
+            st.tx_cmd = 0x40;
+            st.tx_stat = 0x82;
+        }
+        let v1 = src.save_state();
+
+        let dst = Seeq8003::new(None, None, None, Arc::new(AtomicU64::new(0)));
+        dst.load_state(&v1).expect("load_state");
+        let v2 = dst.save_state();
+
+        assert_eq!(v1, v2, "Seeq8003 save_state mismatch after load_state round-trip");
+    }
+}
diff --git a/src/sgi_vh.rs b/src/sgi_vh.rs
new file mode 100644
index 0000000..8e472f8
--- /dev/null
+++ b/src/sgi_vh.rs
@@ -0,0 +1,221 @@
+//! Minimal SGI Volume Header writer for the Phase 2.4 scratch volume.
+//!
+//! IRIX requires a recognisable partition table at sector 0 before the
+//! `/dev/rdsk/dks0dNvol` and `/dev/rdsk/dks0dNvh` device nodes return real
+//! data. Without one, IRIX enumerates the SCSI target in `hinv` but every
+//! read returns "I/O error". This module writes a 512-byte SGI Volume Header
+//! into sector 0 of a freshly-created scratch image with three partition
+//! entries:
+//!
+//! - **slot 0 ("payload")**: type 3 (`PT_RAW`), spans sectors 8..end. IRIX
+//!   surfaces this as `/dev/rdsk/dks0dNs0`. This is the partition the host
+//!   injects payload bytes into and that the guest reads — `first_block` is
+//!   honoured, so reads from offset 0 of `s0` map to byte 4096 of the disk
+//!   (right after the VH).
+//! - **slot 8 ("vh")**: type 0 (`PT_VOLHDR`), spans sectors 0..7. IRIX
+//!   surfaces this as `/dev/rdsk/dks0dNvh`. Present only so IRIX's standard
+//!   convention is satisfied; the host-side `scratch-write` never touches it.
+//! - **slot 10 ("vol")**: type 6 (`PT_VOLUME`), spans the entire disk. IRIX
+//!   surfaces this as `/dev/rdsk/dks0dNvol`. The `vol` partition by SGI
+//!   convention always covers sector 0 onwards regardless of `first_block`,
+//!   so reading it returns the VH first — use `s0` for payload reads.
+//!
+//! NB: IRIX raw block-device reads must be sector-aligned (multiples of 512
+//! bytes). `dd if=/dev/rdsk/dks0dNs0 bs=512 count=N` works; `bs=64` returns
+//! "Read error: I/O error" with no SCSI-level error.
+//!
+//! Convention: the host writes payload at offset `SCRATCH_PAYLOAD_OFFSET`
+//! (4096 = sector 8); the guest reads payload from offset 0 of the `s0`
+//! partition, which the kernel maps to sector 8 of the underlying disk.
+//!
+//! All values are big-endian per SGI convention.
+
+use std::fs::File;
+use std::io::{self, Write};
+use std::path::Path;
+
+/// First payload byte. Reserved bytes 0..4095 hold the 8-sector VH partition.
+pub const SCRATCH_PAYLOAD_OFFSET: u64 = 4096;
+
+const SECTOR_SIZE: u64 = 512;
+const VH_SECTORS: u64 = 8;
+const SGI_MAGIC: u32 = 0x0BE5_A941;
+
+const PT_VOLHDR: u32 = 0;
+const PT_RAW: u32 = 3;
+const PT_VOLUME: u32 = 6;
+
+const PT_TABLE_OFFSET: usize = 0x138;
+const PT_ENTRY_SIZE: usize = 12;
+const CSUM_OFFSET: usize = 0x1F8;
+
+/// Create a fresh scratch image at `path` of `total_bytes` size, with a
+/// minimal SGI Volume Header at sector 0. Overwrites any existing file.
+pub fn create_scratch_image(path: &Path, total_bytes: u64) -> io::Result<()> {
+    if total_bytes < SCRATCH_PAYLOAD_OFFSET + SECTOR_SIZE {
+        return Err(io::Error::new(
+            io::ErrorKind::InvalidInput,
+            format!(
+                "scratch size {} bytes is too small (minimum {} bytes)",
+                total_bytes,
+                SCRATCH_PAYLOAD_OFFSET + SECTOR_SIZE
+            ),
+        ));
+    }
+    if total_bytes % SECTOR_SIZE != 0 {
+        return Err(io::Error::new(
+            io::ErrorKind::InvalidInput,
+            format!("scratch size {} is not a multiple of {} bytes", total_bytes, SECTOR_SIZE),
+        ));
+    }
+
+    let total_sectors = total_bytes / SECTOR_SIZE;
+    let vol_sectors = total_sectors - VH_SECTORS;
+
+    let mut vh = build_vh(vol_sectors);
+    fix_csum(&mut vh);
+
+    let mut f = File::create(path)?;
+    f.set_len(total_bytes)?;
+    f.write_all(&vh)?;
+    f.sync_all()?;
+    Ok(())
+}
+
+fn build_vh(vol_sectors: u64) -> [u8; SECTOR_SIZE as usize] {
+    let mut vh = [0u8; SECTOR_SIZE as usize];
+
+    // Magic.
+    vh[0..4].copy_from_slice(&SGI_MAGIC.to_be_bytes());
+
+    // root_partnum / swap_partnum / bootfile / device_parameters all stay 0.
+
+    // Partition table at PT_TABLE_OFFSET (0x138).
+    // Slot 0 ("payload"): type PT_RAW, sectors 8..end. IRIX maps this to
+    // /dev/rdsk/dks0dNs0 with first_block honoured — reads at offset 0 of
+    // s0 land at byte 4096 of the disk (right after the VH).
+    write_pt_entry(&mut vh, 0, vol_sectors as u32, VH_SECTORS as u32, PT_RAW);
+    // Slot 8 ("vh"): type PT_VOLHDR, sectors 0..7. IRIX maps this to
+    // /dev/rdsk/dks0dNvh.
+    write_pt_entry(&mut vh, 8, VH_SECTORS as u32, 0, PT_VOLHDR);
+    // Slot 10 ("vol"): type PT_VOLUME, whole disk. IRIX maps this to
+    // /dev/rdsk/dks0dNvol — convenient for raw whole-disk dumps but always
+    // starts at sector 0 (the VH), so use s0 for payload reads.
+    let total_sectors_u32 = (vol_sectors + VH_SECTORS) as u32;
+    write_pt_entry(&mut vh, 10, total_sectors_u32, 0, PT_VOLUME);
+
+    vh
+}
+
+fn write_pt_entry(vh: &mut [u8; SECTOR_SIZE as usize], slot: usize, nblks: u32, first: u32, ty: u32) {
+    let off = PT_TABLE_OFFSET + slot * PT_ENTRY_SIZE;
+    vh[off..off + 4].copy_from_slice(&nblks.to_be_bytes());
+    vh[off + 4..off + 8].copy_from_slice(&first.to_be_bytes());
+    vh[off + 8..off + 12].copy_from_slice(&ty.to_be_bytes());
+}
+
+/// Set csum so the 32-bit two's-complement sum of all 128 big-endian words
+/// equals zero. fx, prtvtoc, and the IRIX kernel all check this.
+fn fix_csum(vh: &mut [u8; SECTOR_SIZE as usize]) {
+    // Zero the existing csum first, then sum, then store -sum.
+    vh[CSUM_OFFSET..CSUM_OFFSET + 4].fill(0);
+    let mut sum: u32 = 0;
+    for chunk in vh.chunks_exact(4) {
+        let w = u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
+        sum = sum.wrapping_add(w);
+    }
+    let csum = (!sum).wrapping_add(1); // -sum
+    vh[CSUM_OFFSET..CSUM_OFFSET + 4].copy_from_slice(&csum.to_be_bytes());
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn unique_tmp_path(tag: &str) -> std::path::PathBuf {
+        let nanos = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .map(|d| d.as_nanos())
+            .unwrap_or(0);
+        std::env::temp_dir().join(format!("iris-vh-{}-{}.raw", tag, nanos))
+    }
+
+    #[test]
+    fn scratch_image_has_correct_size_and_magic() {
+        let p = unique_tmp_path("size");
+        let size: u64 = 4 * 1024 * 1024; // 4 MB
+        create_scratch_image(&p, size).expect("create");
+        let meta = std::fs::metadata(&p).unwrap();
+        assert_eq!(meta.len(), size, "image size must match request");
+        let bytes = std::fs::read(&p).unwrap();
+        assert_eq!(&bytes[0..4], &SGI_MAGIC.to_be_bytes(), "missing SGI magic");
+        let _ = std::fs::remove_file(&p);
+    }
+
+    #[test]
+    fn partition_table_describes_vol_and_vh() {
+        let p = unique_tmp_path("pt");
+        let size: u64 = 64 * 1024 * 1024;
+        create_scratch_image(&p, size).expect("create");
+        let bytes = std::fs::read(&p).unwrap();
+
+        // Slot 0 (payload): nblks = total - 8, first = 8, type = PT_RAW.
+        let off0 = PT_TABLE_OFFSET;
+        let nblks = u32::from_be_bytes(bytes[off0..off0 + 4].try_into().unwrap());
+        let first = u32::from_be_bytes(bytes[off0 + 4..off0 + 8].try_into().unwrap());
+        let ty = u32::from_be_bytes(bytes[off0 + 8..off0 + 12].try_into().unwrap());
+        assert_eq!(nblks, (size / SECTOR_SIZE - VH_SECTORS) as u32);
+        assert_eq!(first, VH_SECTORS as u32);
+        assert_eq!(ty, PT_RAW);
+
+        // Slot 8 (vh): nblks = 8, first = 0, type = PT_VOLHDR.
+        let off8 = PT_TABLE_OFFSET + 8 * PT_ENTRY_SIZE;
+        let nblks = u32::from_be_bytes(bytes[off8..off8 + 4].try_into().unwrap());
+        let first = u32::from_be_bytes(bytes[off8 + 4..off8 + 8].try_into().unwrap());
+        let ty = u32::from_be_bytes(bytes[off8 + 8..off8 + 12].try_into().unwrap());
+        assert_eq!(nblks, VH_SECTORS as u32);
+        assert_eq!(first, 0);
+        assert_eq!(ty, PT_VOLHDR);
+
+        // Slot 10 (vol): nblks = total, first = 0, type = PT_VOLUME (whole disk).
+ let off10 = PT_TABLE_OFFSET + 10 * PT_ENTRY_SIZE; + let nblks = u32::from_be_bytes(bytes[off10..off10 + 4].try_into().unwrap()); + let first = u32::from_be_bytes(bytes[off10 + 4..off10 + 8].try_into().unwrap()); + let ty = u32::from_be_bytes(bytes[off10 + 8..off10 + 12].try_into().unwrap()); + assert_eq!(nblks, (size / SECTOR_SIZE) as u32); + assert_eq!(first, 0); + assert_eq!(ty, PT_VOLUME); + + let _ = std::fs::remove_file(&p); + } + + #[test] + fn checksum_sums_to_zero() { + let p = unique_tmp_path("csum"); + let size: u64 = 64 * 1024 * 1024; + create_scratch_image(&p, size).expect("create"); + let bytes = std::fs::read(&p).unwrap(); + let mut sum: u32 = 0; + for chunk in bytes[..512].chunks_exact(4) { + let w = u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]); + sum = sum.wrapping_add(w); + } + assert_eq!(sum, 0, "VH csum must make 32-bit sum of 128 BE words == 0"); + let _ = std::fs::remove_file(&p); + } + + #[test] + fn rejects_too_small_image() { + let p = unique_tmp_path("small"); + let r = create_scratch_image(&p, 4096); // exactly VH size, no payload + assert!(r.is_err()); + } + + #[test] + fn rejects_non_sector_aligned_size() { + let p = unique_tmp_path("misaligned"); + let r = create_scratch_image(&p, 4096 + 100); + assert!(r.is_err()); + } +} diff --git a/src/snapshot.rs b/src/snapshot.rs index 537ab82..15030b8 100644 --- a/src/snapshot.rs +++ b/src/snapshot.rs @@ -1,25 +1,134 @@ // System Snapshot — save and restore full machine state to/from a directory. 
// -// Layout of saves/<name>/: -// cpu.toml — CPU core (GPRs, CP0, FPU), TLB entries -// mc.toml — Memory Controller registers + GIO DMA state -// ioc.toml — IOC interrupt registers -// hpc3.toml — HPC3 state register, PBUS PIO, DMA channel registers -// rex3.toml — REX3 drawing registers, VC2, XMAP9, CMAP palette +// Layout of saves/<name>/ (schema_version = 2): +// snapshot.toml — manifest (always TOML, human-readable) +// cpu.bin — CPU core (GPRs, CP0, FPU), TLB entries (postcard BinValue) +// mc.bin — Memory Controller registers + GIO DMA state +// ioc.bin — IOC interrupt registers +// hpc3.bin — HPC3 state register, PBUS PIO, DMA channel registers +// rex3.bin — REX3 drawing registers, VC2, XMAP9, CMAP palette +// {scc,pit,ps2,rtc,eeprom,scsi,seeq}.bin — peripheral device state +// cow.toml — COW overlay dirty sectors per SCSI device (stays TOML) // bank0.bin — 128 MB RAM bank A (raw u8, big-endian word layout) // bank1.bin — 128 MB RAM bank B // bank2.bin — 128 MB RAM bank C // bank3.bin — 128 MB RAM bank D +// +// schema_version = 1: same layout but device state is *.toml (hex strings). +// schema_version = 0 (no manifest): legacy, also *.toml. +use serde::{Deserialize, Serialize}; use std::fs; use std::io::{Read, Write}; use std::path::PathBuf; use toml::Value; +/// On-disk schema version for the snapshot directory layout. Bumped when a +/// device's save_state format changes incompatibly. Old snapshots without a +/// manifest are treated as v0 (legacy, best-effort load). +/// +/// v1 → v2: device state moved from *.toml (hex strings, ~80 ms cpu.toml +/// parse) to *.bin (postcard-encoded BinValue tree, sub-millisecond). Manifest +/// and cow.toml stay TOML. +/// +/// v2 → v3: RAM banks and framebuffers moved from raw `bank{N}.bin`/`rex3_*.bin` +/// files to a content-addressable chunk store at `saves/.cas/`. Each snapshot +/// writes a tiny `chunks.bin` manifest of per-bank/per-framebuffer chunk +/// hashes. 
Two snapshots from the same parent share 95–99% of chunks, so a +/// fresh save-after-bundle-install costs only the bytes that changed. +pub const SCHEMA_VERSION: u32 = 3; + +const MANIFEST_FILE: &str = "snapshot.toml"; + pub struct Snapshot { pub dir: PathBuf, } +/// Top-level snapshot manifest. Lives at `saves/<name>/snapshot.toml`. Written +/// first on save and read first on load so the rest of the pipeline can fail +/// fast with a clear error before reading half a snapshot. +#[derive(Debug, Clone)] +pub struct Manifest { + pub schema_version: u32, + pub iris_git_rev: Option<String>, + pub host_arch: String, + pub created_at_unix: u64, + pub parent: Option<String>, + pub description: Option<String>, + pub installed_bundles: Vec<String>, +} + +impl Manifest { + /// Build a manifest describing the current build/host, with no parent or + /// description. Caller can mutate fields before writing. + pub fn for_current_save() -> Self { + let created_at_unix = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .map(|d| d.as_secs()) + .unwrap_or(0); + Self { + schema_version: SCHEMA_VERSION, + iris_git_rev: option_env!("IRIS_GIT_REV").map(String::from), + host_arch: std::env::consts::ARCH.to_string(), + created_at_unix, + parent: None, + description: None, + installed_bundles: Vec::new(), + } + } + + pub fn to_toml(&self) -> Value { + let mut tbl = toml::map::Map::new(); + tbl.insert("schema_version".into(), Value::Integer(self.schema_version as i64)); + if let Some(rev) = &self.iris_git_rev { + tbl.insert("iris_git_rev".into(), Value::String(rev.clone())); + } + tbl.insert("host_arch".into(), Value::String(self.host_arch.clone())); + tbl.insert("created_at_unix".into(), Value::Integer(self.created_at_unix as i64)); + if let Some(parent) = &self.parent { + tbl.insert("parent".into(), Value::String(parent.clone())); + } + if let Some(d) = &self.description { + tbl.insert("description".into(), Value::String(d.clone())); + } + let bundles: Vec<Value> = self.installed_bundles.iter() + .map(|s| 
Value::String(s.clone())).collect(); + tbl.insert("installed_bundles".into(), Value::Array(bundles)); + Value::Table(tbl) + } + + pub fn from_toml(v: &Value) -> Result<Self, String> { + let tbl = v.as_table().ok_or("manifest: not a table")?; + let schema_version = tbl.get("schema_version") + .and_then(|x| x.as_integer()) + .ok_or("manifest: missing schema_version")? as u32; + let host_arch = tbl.get("host_arch") + .and_then(|x| x.as_str()) + .ok_or("manifest: missing host_arch")? + .to_string(); + let created_at_unix = tbl.get("created_at_unix") + .and_then(|x| x.as_integer()) + .map(|i| i as u64) + .unwrap_or(0); + let iris_git_rev = tbl.get("iris_git_rev").and_then(|x| x.as_str()).map(String::from); + let parent = tbl.get("parent").and_then(|x| x.as_str()).map(String::from); + let description = tbl.get("description").and_then(|x| x.as_str()).map(String::from); + let installed_bundles = tbl.get("installed_bundles") + .and_then(|x| x.as_array()) + .map(|arr| arr.iter().filter_map(|x| x.as_str().map(String::from)).collect()) + .unwrap_or_default(); + Ok(Self { + schema_version, + iris_git_rev, + host_arch, + created_at_unix, + parent, + description, + installed_bundles, + }) + } +} + impl Snapshot { pub fn new(dir: impl Into<PathBuf>) -> Self { Self { dir: dir.into() } } @@ -56,9 +165,183 @@ impl Snapshot { fs::read(path) } + /// Postcard-encode a `toml::Value` (via the tagged `BinValue` mirror) and + /// write it as `<name>`. Sub-millisecond for typical device tables vs ~80 + /// ms TOML parse on cpu.toml. + pub fn write_value_bin(&self, name: &str, v: &Value) -> std::io::Result<()> { + let bv = BinValue::from_toml(v); + let bytes = postcard::to_allocvec(&bv) + .map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?; + self.write_bin(name, &bytes) + } + + /// Inverse of `write_value_bin`. Returns the reconstructed `toml::Value`. 
+ pub fn read_value_bin(&self, name: &str) -> std::io::Result<Value> { + let bytes = self.read_bin(name)?; + let bv: BinValue = postcard::from_bytes(&bytes) + .map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?; + Ok(bv.into_toml()) + } + + /// Postcard-encode a `ChunksManifest` (v3+ snapshots). + pub fn write_chunks_manifest(&self, m: &ChunksManifest) -> std::io::Result<()> { + let bytes = postcard::to_allocvec(m) + .map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?; + self.write_bin("chunks.bin", &bytes) + } + + pub fn read_chunks_manifest(&self) -> std::io::Result<ChunksManifest> { + let bytes = self.read_bin("chunks.bin")?; + postcard::from_bytes(&bytes) + .map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e)) + } + + /// Write a device save_state value, picking `<base>.bin` for v2+ and + /// `<base>.toml` for legacy schemas. Centralizes the per-call branching + /// in machine.rs. + pub fn write_state(&self, base: &str, v: &Value, schema_version: u32) -> std::io::Result<()> { + if schema_version >= 2 { + self.write_value_bin(&format!("{}.bin", base), v) + } else { + self.write_toml(&format!("{}.toml", base), v) + } + } + + /// Read a device save_state value. For v2+ tries `<base>.bin` first and + /// falls back to `<base>.toml` for snapshots half-migrated by external + /// tooling. For legacy schemas reads `<base>.toml` directly. + pub fn read_state(&self, base: &str, schema_version: u32) -> std::io::Result<Value> { + if schema_version >= 2 { + match self.read_value_bin(&format!("{}.bin", base)) { + Ok(v) => Ok(v), + Err(_) => self.read_toml(&format!("{}.toml", base)), + } + } else { + self.read_toml(&format!("{}.toml", base)) + } + } + pub fn ensure_dir(&self) -> std::io::Result<()> { fs::create_dir_all(&self.dir) } + + /// Write the manifest to `snapshot.toml`. Always called first on save. + pub fn write_manifest(&self, m: &Manifest) -> std::io::Result<()> { + self.write_toml(MANIFEST_FILE, &m.to_toml()) + } + + /// Read the manifest. 
Returns `Ok(None)` if `snapshot.toml` is absent + /// (legacy snapshots taken before this format was introduced). + pub fn read_manifest(&self) -> Result<Option<Manifest>, String> { + let path = self.dir.join(MANIFEST_FILE); + if !path.exists() { + return Ok(None); + } + let v = self.read_toml(MANIFEST_FILE).map_err(|e| e.to_string())?; + Manifest::from_toml(&v).map(Some) + } +} + +// ---- ChunksManifest: per-bank / per-framebuffer chunk hash lists (v3+) ---- + +use crate::chunk_store::ChunkHash; + +/// Per-snapshot pointer into the content-addressable chunk store. Every bank +/// and (optionally) each framebuffer is split into 64 KB chunks; this +/// manifest records the BLAKE3 hash of each chunk in order. Loading a +/// snapshot fetches the chunks and concatenates them back into the bank's +/// big-endian byte stream. +/// +/// Stored as `chunks.bin` in the snapshot dir, postcard-encoded. +#[derive(Debug, Clone, Serialize, Deserialize, Default)] +pub struct ChunksManifest { + /// One entry per RAM bank (0..3). Empty inner Vec means the bank wasn't + /// captured (e.g. zero-sized in this configuration). + pub bank_chunks: [Vec<ChunkHash>; 4], + /// REX3 framebuffer chunks: (rgb, aux). `None` when running headless. + pub framebuffer_chunks: Option<(Vec<ChunkHash>, Vec<ChunkHash>)>, +} + +impl ChunksManifest { + /// Iterate every chunk hash referenced by this manifest. Used by `gc` + /// to build the live set across all kept snapshots. + pub fn referenced_hashes(&self) -> impl Iterator<Item = &ChunkHash> { + self.bank_chunks.iter().flatten().chain( + self.framebuffer_chunks + .iter() + .flat_map(|(rgb, aux)| rgb.iter().chain(aux.iter())), + ) + } +} + +// ---- BinValue: tagged binary mirror of toml::Value ---- +// +// Postcard is non-self-describing — it cannot deserialize directly into the +// untagged `toml::Value` enum (which relies on `deserialize_any`). BinValue +// carries an explicit variant tag so postcard can round-trip it. 
The +// conversion to/from `toml::Value` is a single tree walk and runs in low +// milliseconds even for the largest device tables. +// +// Datetime is rare in our save_state output — encode it as an ISO-8601 string +// and reparse on the way back. If parsing fails the value falls back to a +// plain `toml::Value::String` so a malformed datetime never panics a load. + +/// Tagged binary mirror of `toml::Value`. Order-preserving for tables (matches +/// `toml::Value::Table` which uses an `IndexMap` under the hood). +#[derive(Debug, Clone, Serialize, Deserialize)] +pub enum BinValue { + String(String), + Integer(i64), + Float(f64), + Boolean(bool), + Array(Vec<BinValue>), + Table(Vec<(String, BinValue)>), + Datetime(String), +} + +impl BinValue { + pub fn from_toml(v: &Value) -> Self { + match v { + Value::String(s) => BinValue::String(s.clone()), + Value::Integer(i) => BinValue::Integer(*i), + Value::Float(f) => BinValue::Float(*f), + Value::Boolean(b) => BinValue::Boolean(*b), + Value::Array(arr) => { + BinValue::Array(arr.iter().map(BinValue::from_toml).collect()) + } + Value::Table(tbl) => { + let mut out = Vec::with_capacity(tbl.len()); + for (k, v) in tbl { + out.push((k.clone(), BinValue::from_toml(v))); + } + BinValue::Table(out) + } + Value::Datetime(dt) => BinValue::Datetime(dt.to_string()), + } + } + + pub fn into_toml(self) -> Value { + match self { + BinValue::String(s) => Value::String(s), + BinValue::Integer(i) => Value::Integer(i), + BinValue::Float(f) => Value::Float(f), + BinValue::Boolean(b) => Value::Boolean(b), + BinValue::Array(arr) => { + Value::Array(arr.into_iter().map(BinValue::into_toml).collect()) + } + BinValue::Table(entries) => { + let mut tbl = toml::map::Map::new(); + for (k, v) in entries { + tbl.insert(k, v.into_toml()); + } + Value::Table(tbl) + } + BinValue::Datetime(s) => match s.parse::<toml::value::Datetime>() { + Ok(dt) => Value::Datetime(dt), + Err(_) => Value::String(s), + }, + } + } } // ---- scalar hex helpers ---- @@ -182,3 +465,243 @@ pub fn 
load_u8_slice(v: &Value, dst: &mut [u8]) { pub fn get_field<'a>(table: &'a Value, key: &str) -> Option<&'a Value> { table.as_table()?.get(key) } + +#[cfg(test)] +mod tests { + use super::*; + + fn unique_tmp_dir(tag: &str) -> PathBuf { + let nanos = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .map(|d| d.as_nanos()) + .unwrap_or(0); + let p = std::env::temp_dir().join(format!("iris-snap-test-{}-{}", tag, nanos)); + fs::create_dir_all(&p).unwrap(); + p + } + + #[test] + fn manifest_round_trip_full() { + let m = Manifest { + schema_version: 1, + iris_git_rev: Some("abc123".into()), + host_arch: "aarch64".into(), + created_at_unix: 1_700_000_000, + parent: Some("base/desktop".into()), + description: Some("post mogrix install".into()), + installed_bundles: vec!["grep-2.5.4".into(), "sed-4.2.2".into()], + }; + let v = m.to_toml(); + let m2 = Manifest::from_toml(&v).expect("parse"); + assert_eq!(m2.schema_version, m.schema_version); + assert_eq!(m2.iris_git_rev, m.iris_git_rev); + assert_eq!(m2.host_arch, m.host_arch); + assert_eq!(m2.created_at_unix, m.created_at_unix); + assert_eq!(m2.parent, m.parent); + assert_eq!(m2.description, m.description); + assert_eq!(m2.installed_bundles, m.installed_bundles); + } + + #[test] + fn manifest_round_trip_minimal() { + let m = Manifest { + schema_version: 1, + iris_git_rev: None, + host_arch: "x86_64".into(), + created_at_unix: 0, + parent: None, + description: None, + installed_bundles: vec![], + }; + let v = m.to_toml(); + let m2 = Manifest::from_toml(&v).expect("parse"); + assert!(m2.iris_git_rev.is_none()); + assert!(m2.parent.is_none()); + assert!(m2.description.is_none()); + assert!(m2.installed_bundles.is_empty()); + } + + #[test] + fn manifest_rejects_missing_schema_version() { + let mut tbl = toml::map::Map::new(); + tbl.insert("host_arch".into(), Value::String("aarch64".into())); + let v = Value::Table(tbl); + assert!(Manifest::from_toml(&v).is_err()); + } + + #[test] + fn 
manifest_disk_round_trip() { + let dir = unique_tmp_dir("manifest"); + let snap = Snapshot::new(&dir); + let m = Manifest::for_current_save(); + snap.write_manifest(&m).expect("write"); + let loaded = snap.read_manifest().expect("read").expect("present"); + assert_eq!(loaded.schema_version, SCHEMA_VERSION); + assert_eq!(loaded.host_arch, std::env::consts::ARCH); + // cleanup + let _ = fs::remove_dir_all(&dir); + } + + #[test] + fn manifest_absent_returns_none() { + let dir = unique_tmp_dir("missing"); + let snap = Snapshot::new(&dir); + let loaded = snap.read_manifest().expect("read"); + assert!(loaded.is_none()); + let _ = fs::remove_dir_all(&dir); + } + + #[test] + fn for_current_save_uses_runtime_arch() { + let m = Manifest::for_current_save(); + assert_eq!(m.schema_version, SCHEMA_VERSION); + assert_eq!(m.host_arch, std::env::consts::ARCH); + assert!(m.parent.is_none()); + } + + fn sample_value() -> Value { + // Mirrors a slice of cpu.toml: top-level scalars + a sub-table with + // mixed integer/string/array entries. Order matters for the table + // round-trip assertion. 
+ let mut cp0 = toml::map::Map::new(); + cp0.insert("cp0_index".into(), Value::String("0x00000001".into())); + cp0.insert("cp0_count".into(), Value::String("0x000000000badf00d".into())); + cp0.insert("cp0_status".into(), Value::Integer(0x4040_0000)); + let mut tbl = toml::map::Map::new(); + tbl.insert("pc".into(), Value::String("0x9fc00000".into())); + tbl.insert( + "gpr".into(), + Value::Array(vec![ + Value::String("0x0000000000000000".into()), + Value::String("0x0000000000000001".into()), + Value::String("0xffffffff80001234".into()), + ]), + ); + tbl.insert("cp0".into(), Value::Table(cp0)); + tbl.insert("running".into(), Value::Boolean(true)); + tbl.insert("ratio".into(), Value::Float(1.5)); + Value::Table(tbl) + } + + #[test] + fn binvalue_round_trip_matches_toml() { + let v = sample_value(); + let bv = BinValue::from_toml(&v); + let back = bv.into_toml(); + assert_eq!(back, v); + } + + #[test] + fn binvalue_postcard_round_trip() { + let v = sample_value(); + let bv = BinValue::from_toml(&v); + let bytes = postcard::to_allocvec(&bv).expect("encode"); + let bv2: BinValue = postcard::from_bytes(&bytes).expect("decode"); + assert_eq!(bv2.into_toml(), v); + } + + #[test] + fn write_state_v2_writes_bin_and_reads_back() { + let dir = unique_tmp_dir("state-v2"); + let snap = Snapshot::new(&dir); + let v = sample_value(); + snap.write_state("cpu", &v, 2).expect("write v2"); + assert!(dir.join("cpu.bin").exists(), "expected cpu.bin to be written"); + assert!(!dir.join("cpu.toml").exists(), "v2 must not write cpu.toml"); + let back = snap.read_state("cpu", 2).expect("read v2"); + assert_eq!(back, v); + let _ = fs::remove_dir_all(&dir); + } + + #[test] + fn write_state_v1_writes_toml_and_reads_back() { + let dir = unique_tmp_dir("state-v1"); + let snap = Snapshot::new(&dir); + let v = sample_value(); + snap.write_state("cpu", &v, 1).expect("write v1"); + assert!(dir.join("cpu.toml").exists(), "expected cpu.toml to be written"); + assert!(!dir.join("cpu.bin").exists(), "v1 
must not write cpu.bin"); + let back = snap.read_state("cpu", 1).expect("read v1"); + assert_eq!(back, v); + let _ = fs::remove_dir_all(&dir); + } + + #[test] + fn read_state_v2_falls_back_to_toml_when_bin_missing() { + // External tooling may legitimately produce a v2 manifest with .toml + // device files (e.g. dump-and-edit workflow). Loader must be tolerant. + let dir = unique_tmp_dir("state-fallback"); + let snap = Snapshot::new(&dir); + let v = sample_value(); + snap.write_toml("cpu.toml", &v).expect("write toml"); + let back = snap.read_state("cpu", 2).expect("read with fallback"); + assert_eq!(back, v); + let _ = fs::remove_dir_all(&dir); + } + + /// Hand-runnable bench: `cargo test --release --features lightning -- --ignored bench_cpu_toml_vs_bin --nocapture`. + /// Reads saves/working/cpu.toml (3.6 MB legacy snapshot) and prints the + /// parse-time delta between toml::from_str and postcard::from_bytes. + #[test] + #[ignore] + fn bench_cpu_toml_vs_bin() { + let path = "saves/working/cpu.toml"; + let s = match std::fs::read_to_string(path) { + Ok(s) => s, + Err(e) => { + eprintln!("skipping: cannot read {}: {}", path, e); + return; + } + }; + println!("cpu.toml: {} bytes", s.len()); + + let runs = 5; + let mut toml_total_us = 0u128; + let mut toml_v: Option<Value> = None; + for _ in 0..runs { + let t = std::time::Instant::now(); + toml_v = Some(toml::from_str::<Value>(&s).unwrap()); + toml_total_us += t.elapsed().as_micros(); + } + println!("toml::from_str avg over {} runs: {:.2} ms", + runs, toml_total_us as f64 / runs as f64 / 1000.0); + + let v = toml_v.take().unwrap(); + let bv = BinValue::from_toml(&v); + let bytes = postcard::to_allocvec(&bv).unwrap(); + println!("postcard encoded: {} bytes (vs toml {} bytes, ratio {:.2}x)", + bytes.len(), s.len(), s.len() as f64 / bytes.len() as f64); + + let mut bin_total_us = 0u128; + for _ in 0..runs { + let t = std::time::Instant::now(); + let bv: BinValue = postcard::from_bytes(&bytes).unwrap(); + let _ = bv.into_toml(); + 
bin_total_us += t.elapsed().as_micros(); + } + println!("postcard decode + into_toml avg over {} runs: {:.2} ms", + runs, bin_total_us as f64 / runs as f64 / 1000.0); + println!("speedup: {:.1}x", + toml_total_us as f64 / bin_total_us as f64); + } + + #[test] + fn binvalue_payload_is_smaller_than_toml() { + // Sanity check the size win on a representative-ish payload. + let mut tbl = toml::map::Map::new(); + let big_arr: Vec<Value> = (0..1024) + .map(|i| Value::String(format!("0x{:016x}", i as u64))) + .collect(); + tbl.insert("gpr_big".into(), Value::Array(big_arr)); + let v = Value::Table(tbl); + let toml_bytes = toml::to_string(&v).unwrap().into_bytes(); + let bv = BinValue::from_toml(&v); + let bin_bytes = postcard::to_allocvec(&bv).unwrap(); + assert!( + bin_bytes.len() < toml_bytes.len(), + "bin {} bytes should be smaller than toml {} bytes", + bin_bytes.len(), + toml_bytes.len() + ); + } +} diff --git a/src/validate.rs b/src/validate.rs new file mode 100644 index 0000000..45287f6 --- /dev/null +++ b/src/validate.rs @@ -0,0 +1,88 @@ +//! Phase 3.3: snapshot determinism validator. +//! +//! Loads a saved snapshot twice, runs the CPU `n` instructions inline (with +//! all peripheral threads stopped to eliminate scheduling jitter), and +//! diffs the resulting CPU state digests. Two passes over the same starting +//! state should produce bit-identical CPU registers — any divergence points +//! at non-determinism in `load_snapshot` (host wallclock leakage, missing +//! `load_state` field, uninitialised structure) that would silently corrupt +//! mogrix CI replays. +//! +//! With JIT descoped (Phase 2.5), the original "JIT vs interp lockstep" +//! framing is gone; this is the snapshot-determinism portion. Peripheral +//! threads are stopped during the test so device-side timing variance +//! doesn't leak into the result. + +use crate::machine::Machine; +use crate::mips_exec::CpuStateDigest; + +/// Result of `validate_snapshot_determinism`. 
+#[derive(Debug)] +pub struct DeterminismReport { + pub instructions_run: u64, + pub deterministic: bool, + /// Per-field divergence list. Empty when `deterministic` is true. + pub diffs: Vec<(String, String, String)>, + pub state_a: CpuStateDigest, + pub state_b: CpuStateDigest, +} + +impl DeterminismReport { + pub fn summary(&self) -> String { + if self.deterministic { + format!( + "deterministic for {} instructions (PC=0x{:016x})", + self.instructions_run, self.state_a.pc + ) + } else { + let mut s = format!( + "DIVERGED after {} instructions ({} field(s)):", + self.instructions_run, + self.diffs.len() + ); + for (field, a, b) in &self.diffs { + s.push_str(&format!("\n {}: A={} B={}", field, a, b)); + } + s + } + } +} + +/// Run two passes of `load_snapshot(name); step n; capture` and diff the +/// resulting CPU state digests. Side effects: leaves the machine stopped +/// after both passes, with the second-pass state loaded. Caller is +/// responsible for any subsequent `start`/`restart_peripherals`. +pub fn validate_snapshot_determinism( + machine: &mut Machine, + name: &str, + n_instructions: u64, +) -> Result<DeterminismReport, String> { + // Pass A: load with everything paused → step inline → capture. + // load_snapshot_paused leaves CPU and peripheral threads stopped, so no + // thread runs between load and digest. This is the key to surfacing + // genuine load_state determinism issues vs. thread-scheduling jitter. + machine.load_snapshot_paused(name)?; + let executed_a = machine.cpu_step_n_inline(n_instructions)?; + let state_a = machine.cpu_state_digest()?; + + // Pass B: same starting snapshot, fresh load. 
+ machine.load_snapshot_paused(name)?; + let executed_b = machine.cpu_step_n_inline(n_instructions)?; + let state_b = machine.cpu_state_digest()?; + + if executed_a != executed_b { + return Err(format!( + "step counts disagree: A ran {}, B ran {} (CPU stopped itself differently)", + executed_a, executed_b + )); + } + + let diffs = state_a.diff(&state_b); + Ok(DeterminismReport { + instructions_run: executed_a, + deterministic: diffs.is_empty(), + diffs, + state_a, + state_b, + }) +} diff --git a/src/wd33c93a.rs b/src/wd33c93a.rs index c4a6a8c..bf3e23b 100644 --- a/src/wd33c93a.rs +++ b/src/wd33c93a.rs @@ -234,12 +234,27 @@ impl Wd33c93a { /// For CD-ROMs, `discs` is the full ordered list of ISO paths; the first /// entry is mounted immediately. For HDDs `discs` is ignored — only /// `path` is used. - pub fn add_device(&self, id: usize, path: &str, is_cdrom: bool, discs: Vec<String>, overlay: bool) -> std::io::Result<()> { + /// + /// If `overlay_path_override` is `Some`, it specifies where the COW + /// overlay file lives. This lets CI mode isolate its overlay from an + /// interactive session sharing the same base image. Ignored when + /// `overlay` is false. + pub fn add_device( + &self, + id: usize, + path: &str, + is_cdrom: bool, + discs: Vec<String>, + overlay: bool, + overlay_path_override: Option<&str>, + ) -> std::io::Result<()> { use crate::cow_disk::CowDisk; use crate::scsi::DiskBackend; let (backend, size) = if overlay && !is_cdrom { - let overlay_path = format!("{}.overlay", path); + let overlay_path = overlay_path_override + .map(|s| s.to_string()) + .unwrap_or_else(|| format!("{}.overlay", path)); let cow = CowDisk::new(path, &overlay_path)?; let sz = cow.size(); (DiskBackend::Cow(cow), sz) @@ -317,6 +332,59 @@ impl Wd33c93a { .collect() } + /// Reset the COW overlay on every attached device that's using COW. + /// Direct-mode devices are left alone. Used by `Machine::ci_restore`. 
+ pub fn reset_all_overlays(&self) -> Vec<(usize, std::io::Result<()>)> { + let mut state = self.state.lock(); + let mut results = Vec::new(); + for id in 0..8 { + if let Some(dev) = &mut state.devices[id] { + if dev.is_cow() { + results.push((id, dev.cow_reset())); + } + } + } + results + } + + /// Copy every COW overlay into `dir` as `scsi<id>.overlay`. Returns a + /// list of `(id, dirty_sector_list)` entries so snapshot save can + /// persist the dirty set alongside the raw overlay bytes. + pub fn export_overlays(&self, dir: &std::path::Path) -> std::io::Result<Vec<(usize, Vec<u64>)>> { + let mut state = self.state.lock(); + let mut out = Vec::new(); + for id in 0..8 { + if let Some(dev) = &mut state.devices[id] { + if dev.is_cow() { + let dest = dir.join(format!("scsi{}.overlay", id)); + let dirty = dev.cow_export(&dest)?; + out.push((id, dirty)); + } + } + } + Ok(out) + } + + /// Replace each COW overlay with its saved counterpart in `dir` and + /// adopt the matching dirty sector set. Devices with no corresponding + /// entry in `dirty_sets` keep their current overlay untouched. 
+ pub fn import_overlays( + &self, + dir: &std::path::Path, + dirty_sets: &[(usize, Vec<u64>)], + ) -> std::io::Result<()> { + let mut state = self.state.lock(); + for (id, dirty) in dirty_sets { + if let Some(dev) = &mut state.devices[*id] { + if dev.is_cow() { + let src = dir.join(format!("scsi{}.overlay", id)); + dev.cow_import(&src, dirty.clone())?; + } + } + } + Ok(()) + } + pub fn read_fifo(&self) -> u8 { let mut state = self.state.lock(); state.fifo.pop_front().unwrap_or(0) @@ -1375,4 +1443,42 @@ impl Wd33c93aState { self.regs[regs::TRANSFER_COUNT_2ND as usize] = ((count >> 8) & 0xFF) as u8; self.regs[regs::TRANSFER_COUNT_LSB as usize] = (count & 0xFF) as u8; } +} + +#[cfg(test)] +mod tests { + use super::*; + + fn make_scsi() -> Wd33c93a { + Wd33c93a::new(None, None, Arc::new(AtomicU64::new(0))) + } + + /// Phase 1.7 round-trip: a fresh SCSI controller loaded from a captured + /// save_state must re-serialize byte-identically. Mutates regs and the + /// scalar shadow fields (ar, asr, target_id, pending_*). + #[test] + fn save_load_round_trip() { + let src = make_scsi(); + { + let mut s = src.state.lock(); + s.regs[regs::CONTROL as usize] = 0x60; + s.regs[regs::SCSI_STATUS as usize] = 0x10; + s.regs[regs::COMMAND_PHASE as usize] = 0x46; + s.regs[regs::OWN_ID as usize] = 0x07; + s.ar = 0x42; + s.asr = 0x10; + s.data_direction_in = true; + s.target_id = 4; + s.pending_status = 0x02; + s.pending_msg = 0x80; + s.advanced_mode = true; + } + let v1 = src.save_state(); + + let dst = make_scsi(); + dst.load_state(&v1).expect("load_state"); + let v2 = dst.save_state(); + + assert_eq!(v1, v2, "Wd33c93a save_state mismatch after load_state round-trip"); + } } \ No newline at end of file diff --git a/src/z85c30.rs b/src/z85c30.rs index d0f3cad..ee44833 100644 --- a/src/z85c30.rs +++ b/src/z85c30.rs @@ -320,6 +320,17 @@ pub trait SerialBackend: Send + Sync { fn recv_byte(&self) -> io::Result<u8>; } +/// Drops TX bytes and never yields RX. 
Used as a placeholder when a channel +/// isn't wired to a host I/O source (e.g. CI mode unused channel). +struct NullBackend; + +impl SerialBackend for NullBackend { + fn send_byte(&self, _byte: u8) {} + fn recv_byte(&self) -> io::Result<u8> { + Err(io::Error::new(io::ErrorKind::WouldBlock, "null")) + } +} + #[cfg(unix)] struct UnixSocketBackend { listener: UnixListener, @@ -456,28 +467,68 @@ impl SerialBackend for TcpSocketBackend { pub struct Z85c30 { pub channel_a: Arc<(Mutex<Channel>, Condvar)>, pub channel_b: Arc<(Mutex<Channel>, Condvar)>, - backend_a: Arc<dyn SerialBackend>, - backend_b: Arc<dyn SerialBackend>, + // Swappable so CI mode can replace the default TCP backend with a + // `CiSerialBackend` before `start()` is called. Wrapped in `Arc<Mutex<_>>` + // so `Z85c30` stays `Clone` and the swap is thread-safe. + backend_a: Arc<Mutex<Arc<dyn SerialBackend>>>, + backend_b: Arc<Mutex<Arc<dyn SerialBackend>>>, running: Arc<AtomicBool>, threads: Arc<Mutex<Vec<thread::JoinHandle<()>>>>, } impl Z85c30 { + /// Default constructor: binds TCP serial backends on 127.0.0.1:8880 + /// (channel A / tty2) and 127.0.0.1:8881 (channel B / tty1). pub fn new(callback: Option>) -> Self { + Self::new_inner(callback, true) + } + + /// CI-mode constructor: uses null backends instead of binding TCP. The + /// caller is expected to install real backends via `set_backend_a` / + /// `set_backend_b` before the first `start()`. Avoids port conflicts + /// when multiple `--ci` instances run in parallel. 
+ pub fn new_null(callback: Option>) -> Self { + Self::new_inner(callback, false) + } + + fn new_inner(callback: Option>, bind_tcp: bool) -> Self { let ip_a = Arc::new(AtomicU8::new(0)); let ip_b = Arc::new(AtomicU8::new(0)); + let (backend_a, backend_b): (Arc<dyn SerialBackend>, Arc<dyn SerialBackend>) = if bind_tcp { + ( + Arc::new(TcpSocketBackend::new("127.0.0.1:8880")), + Arc::new(TcpSocketBackend::new("127.0.0.1:8881")), + ) + } else { + (Arc::new(NullBackend), Arc::new(NullBackend)) + }; + Self { channel_a: Arc::new((Mutex::new(Channel::new("A", ip_a.clone(), ip_b.clone(), callback.clone())), Condvar::new())), // Note: Channel B gets ip_b as its 'num' and ip_a as 'other' channel_b: Arc::new((Mutex::new(Channel::new("B", ip_b, ip_a, callback)), Condvar::new())), - backend_a: Arc::new(TcpSocketBackend::new("127.0.0.1:8880")), - backend_b: Arc::new(TcpSocketBackend::new("127.0.0.1:8881")), + backend_a: Arc::new(Mutex::new(backend_a)), + backend_b: Arc::new(Mutex::new(backend_b)), running: Arc::new(AtomicBool::new(false)), threads: Arc::new(Mutex::new(Vec::new())), } } + /// Swap in an alternate backend for channel A (tty2 on Indy). + /// Must be called before `start()` — running RX/TX threads cache the + /// backend Arc at spawn time and will not observe the new one until + /// they are stopped and restarted. + pub fn set_backend_a(&self, backend: Arc<dyn SerialBackend>) { + *self.backend_a.lock() = backend; + } + + /// Swap in an alternate backend for channel B (tty1, the PROM/IRIX + /// serial console on Indy). Same constraint as `set_backend_a`. 
+    pub fn set_backend_b(&self, backend: Arc<dyn SerialBackend>) {
+        *self.backend_b.lock() = backend;
+    }
+
     pub fn read_a_control(&self) -> u8 {
         let mut a = self.channel_a.0.lock();
         if a.reg_ptr == 2 {
@@ -610,8 +661,8 @@ impl Device for Z85c30 {
         }
 
         let pairs = [
-            (self.channel_a.clone(), self.backend_a.clone()),
-            (self.channel_b.clone(), self.backend_b.clone()),
+            (self.channel_a.clone(), self.backend_a.lock().clone()),
+            (self.channel_b.clone(), self.backend_b.lock().clone()),
         ];
 
         let mut threads = self.threads.lock();
@@ -691,46 +742,75 @@ impl Device for Z85c30 {
             }
 
             threads.push(thread::Builder::new().name(format!("SCC-RX-{}", ch_name)).spawn(move || {
                 let mut last_rx_time = Instant::now();
+                // When the SCC's 8-byte rx_queue is full, hold the just-read
+                // byte here and retry on the next iteration instead of
+                // dropping it. This prevents loss when the host pushes a
+                // long line into CiSerialBackend faster than IRIX's tty
+                // driver clocks bytes off rx_queue. Without this hold, a
+                // ~30-char `dd if=/dev/rdsk/dks0d2s0 bs=512` arrives at
+                // the shell as `dd if=/d=512` (chars 9..24 dropped).
+                let mut pending: Option<u8> = None;
                 while running.load(Ordering::Relaxed) {
-                    if let Ok(mut byte) = rx_backend.recv_byte() {
-                        if byte == 0x05 {
-                            crate::dlog_dev!(LogModule::Scc, "SCC: Converting ^E to ^D (BREAK)");
-                            byte = 0x04;
-                        }
-                        let (lock, _cvar) = &*rx_channel;
-                        let mut channel = lock.lock();
-
-                        let wr3 = channel.regs[scc_regs::WR3 as usize];
-                        let rx_enabled = (wr3 & wr3::RX_ENABLE) != 0;
-
-                        // Get pre-calculated delay
-                        let delay_micros = channel.tx_delay;
-                        let char_duration = Duration::from_micros(delay_micros);
-
-                        if rx_enabled && channel.rx_queue.len() < 8 {
-                            crate::dlog_dev!(LogModule::Scc, "SCC: RX({}) '{}' ({:02x})", channel.name, if byte.is_ascii_graphic() { byte as char } else { '.' }, byte);
-                            channel.rx_queue.push_back(byte);
-                            channel.status |= rr0::RX_CHAR_AVAILABLE;
-                            channel.update_ip();
-                        }
+                    let mut byte = match pending.take() {
+                        Some(b) => b,
+                        None => match rx_backend.recv_byte() {
+                            Ok(b) => b,
+                            Err(_) => {
+                                thread::sleep(Duration::from_millis(10));
+                                continue;
+                            }
+                        },
+                    };
+                    if byte == 0x05 {
+                        crate::dlog_dev!(LogModule::Scc, "SCC: Converting ^E to ^D (BREAK)");
+                        byte = 0x04;
+                    }
+
+                    let (lock, _cvar) = &*rx_channel;
+                    let mut channel = lock.lock();
+
+                    let wr3 = channel.regs[scc_regs::WR3 as usize];
+                    let rx_enabled = (wr3 & wr3::RX_ENABLE) != 0;
+                    let delay_micros = channel.tx_delay;
+                    let char_duration = Duration::from_micros(delay_micros);
+
+                    if !rx_enabled {
+                        // RX disabled — drop the byte (matches real hw with
+                        // RX off). Don't hold it in `pending` or we'd block
+                        // forever waiting for re-enable.
+                        drop(channel);
+                        continue;
+                    }
 
-                        let now = Instant::now();
-                        if last_rx_time < now {
-                            if now.duration_since(last_rx_time) > Duration::from_millis(100) {
-                                last_rx_time = now;
-                            }
-                        }
-                        last_rx_time += char_duration;
-                        let wait = last_rx_time.saturating_duration_since(now);
-                        if !wait.is_zero() {
-                            thread::sleep(wait);
+                    if channel.rx_queue.len() >= 8 {
+                        // SCC FIFO full. Hold the byte and back off briefly
+                        // so the guest's tty driver gets a chance to drain
+                        // rx_queue. Don't drop — that's the bug this
+                        // section fixes.
+                        drop(channel);
+                        pending = Some(byte);
+                        thread::sleep(Duration::from_millis(1));
+                        continue;
+                    }
+
+                    crate::dlog_dev!(LogModule::Scc, "SCC: RX({}) '{}' ({:02x})", channel.name, if byte.is_ascii_graphic() { byte as char } else { '.' }, byte);
+                    channel.rx_queue.push_back(byte);
+                    channel.status |= rr0::RX_CHAR_AVAILABLE;
+                    channel.update_ip();
+                    drop(channel);
+
+                    // Pacing — simulate baud-rate inter-character spacing.
+                    let now = Instant::now();
+                    if last_rx_time < now {
+                        if now.duration_since(last_rx_time) > Duration::from_millis(100) {
+                            last_rx_time = now;
+                        }
-                    } else {
-                        // Avoid busy loop on error
-                        thread::sleep(Duration::from_millis(10));
+                    }
+                    last_rx_time += char_duration;
+                    let wait = last_rx_time.saturating_duration_since(now);
+                    if !wait.is_zero() {
+                        thread::sleep(wait);
+                    }
                 }
             }).unwrap());
@@ -834,3 +914,195 @@ impl Saveable for Z85c30 {
         Ok(())
     }
 }
+
+// ============================================================================
+// CiSerialBackend — in-process serial backend used by --ci mode.
+// ============================================================================
+
+/// Serial backend that the CI control socket reads from and writes to. The
+/// guest sees this as channel A (the IRIX console). Host pushes bytes into
+/// `host_to_guest` via `push_host`; the existing RX thread drains them into
+/// `channel_a.rx_queue`. Guest output reaches `send_byte`, which pushes into
+/// `guest_to_host` and wakes anyone waiting in `wait_for`.
+pub struct CiSerialBackend {
+    host_to_guest: Mutex<VecDeque<u8>>,
+    guest_to_host: Mutex<Vec<u8>>,
+    cv: Condvar,
+}
+
+impl CiSerialBackend {
+    pub fn new() -> Self {
+        Self {
+            host_to_guest: Mutex::new(VecDeque::new()),
+            guest_to_host: Mutex::new(Vec::new()),
+            cv: Condvar::new(),
+        }
+    }
+
+    /// Inject bytes from host to guest (the harness typing on the console).
+    pub fn push_host(&self, data: &[u8]) {
+        let mut q = self.host_to_guest.lock();
+        q.extend(data.iter().copied());
+    }
+
+    /// Drain everything the guest has produced since the last call. Empties
+    /// the buffer; the returned `Vec<u8>` is the guest output as raw bytes.
+    pub fn drain_guest(&self) -> Vec<u8> {
+        let mut q = self.guest_to_host.lock();
+        std::mem::take(&mut *q)
+    }
+
+    /// Block until `needle` is seen in guest output, or `timeout` expires.
+    /// On success returns the consumed bytes up to and including the match;
+    /// bytes that arrived after the match stay in the buffer for the next
+    /// `serial-read`. On timeout returns `None` without consuming anything.
+    pub fn wait_for(&self, needle: &[u8], timeout: Duration) -> Option<Vec<u8>> {
+        if needle.is_empty() {
+            return Some(Vec::new());
+        }
+        let deadline = Instant::now() + timeout;
+        let mut q = self.guest_to_host.lock();
+        loop {
+            if let Some(pos) = find_subseq(&q, needle) {
+                let end = pos + needle.len();
+                let consumed: Vec<u8> = q.drain(..end).collect();
+                return Some(consumed);
+            }
+            let now = Instant::now();
+            if now >= deadline {
+                return None;
+            }
+            if self.cv.wait_until(&mut q, deadline).timed_out() {
+                // One more scan in case bytes arrived between the last check
+                // and the timeout.
+                if let Some(pos) = find_subseq(&q, needle) {
+                    let end = pos + needle.len();
+                    let consumed: Vec<u8> = q.drain(..end).collect();
+                    return Some(consumed);
+                }
+                return None;
+            }
+        }
+    }
+
+    /// Clear both queues. Called on `restore`/`rollback` so stale serial
+    /// output from the previous run doesn't leak into the next test.
+    pub fn reset(&self) {
+        self.host_to_guest.lock().clear();
+        self.guest_to_host.lock().clear();
+    }
+}
+
+fn find_subseq(haystack: &[u8], needle: &[u8]) -> Option<usize> {
+    if needle.is_empty() || haystack.len() < needle.len() {
+        return None;
+    }
+    haystack.windows(needle.len()).position(|w| w == needle)
+}
+
+impl SerialBackend for CiSerialBackend {
+    fn send_byte(&self, byte: u8) {
+        self.guest_to_host.lock().push(byte);
+        self.cv.notify_all();
+    }
+
+    fn recv_byte(&self) -> io::Result<u8> {
+        let mut q = self.host_to_guest.lock();
+        match q.pop_front() {
+            Some(b) => Ok(b),
+            None => Err(io::Error::new(io::ErrorKind::WouldBlock, "empty")),
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    /// Phase 1.7 round-trip: a fresh SCC loaded from a captured save_state
+    /// must re-serialize byte-identically. Use new_null so the test doesn't
+    /// bind any TCP ports.
+    #[test]
+    fn save_load_round_trip() {
+        let src = Z85c30::new_null(None);
+        {
+            let mut ch = src.channel_a.0.lock();
+            ch.regs[0] = 0x44;
+            ch.regs[1] = 0x12;
+            ch.regs[3] = 0xc1;
+            ch.regs[5] = 0xea;
+            ch.reg_ptr = 7;
+            ch.status = 0x40;
+        }
+        {
+            let mut ch = src.channel_b.0.lock();
+            ch.regs[0] = 0x88;
+            ch.regs[2] = 0x10;
+            ch.regs[15] = 0x05;
+            ch.reg_ptr = 3;
+            ch.status = 0x80;
+        }
+        let v1 = src.save_state();
+
+        let dst = Z85c30::new_null(None);
+        dst.load_state(&v1).expect("load_state");
+        let v2 = dst.save_state();
+
+        assert_eq!(v1, v2, "Z85c30 save_state mismatch after load_state round-trip");
+    }
+
+    /// Phase 3.5: a long single-line `serial-send` from the host must arrive
+    /// at the guest's tty intact. Before the rx-thread fix, bytes 9..N of any
+    /// burst were silently dropped when SCC's 8-byte rx_queue filled — a
+    /// 53-char `dd if=/dev/rdsk/dks0d2s0 of=/tmp/r.bin bs=512 count=1\r`
+    /// arrived at IRIX as `dd if=/d=512 count=1`, causing CI scripts to
+    /// fabricate shell errors out of thin air. This test pushes that exact
+    /// line through the loopback CiSerialBackend, drains the SCC rx_queue at
+    /// the rate the IRIX kernel would (one byte at a time, polled), and
+    /// asserts every byte arrives.
+    #[test]
+    fn long_input_round_trips_without_loss() {
+        use std::sync::Arc;
+        use std::time::{Duration, Instant};
+
+        let scc = Z85c30::new_null(None);
+        let backend = Arc::new(CiSerialBackend::new());
+        scc.set_backend_a(backend.clone());
+
+        // Enable RX on channel A so the rx thread queues bytes. tx_delay is
+        // tx-direction baud-rate emulation; set a small value so the test
+        // doesn't pay 19.2 kbaud-per-char latency.
+        {
+            let mut ch = scc.channel_a.0.lock();
+            ch.regs[scc_regs::WR3 as usize] |= wr3::RX_ENABLE;
+            ch.tx_delay = 50; // 50 µs/byte
+        }
+
+        scc.start();
+
+        let line = b"dd if=/dev/rdsk/dks0d2s0 of=/tmp/r.bin bs=512 count=1\r";
+        backend.push_host(line);
+
+        // Drain rx_queue at ~20 kHz so the rx thread always has space to
+        // push pending bytes. Mirrors how IRIX's tty driver consumes
+        // RR0::RX_CHAR_AVAILABLE.
+        let mut received = Vec::with_capacity(line.len());
+        let deadline = Instant::now() + Duration::from_secs(5);
+        while received.len() < line.len() && Instant::now() < deadline {
+            let popped = {
+                let mut ch = scc.channel_a.0.lock();
+                ch.rx_queue.pop_front()
+            };
+            match popped {
+                Some(b) => received.push(b),
+                None => std::thread::sleep(Duration::from_micros(50)),
+            }
+        }
+
+        scc.stop();
+
+        assert_eq!(received.len(), line.len(),
+            "expected {} bytes, got {} (lossy rx_queue?)", line.len(), received.len());
+        assert_eq!(&received, line, "byte content mismatch — bytes dropped or reordered");
+    }
+}
diff --git a/tools/iris-test b/tools/iris-test
new file mode 100755
index 0000000..2700a7b
--- /dev/null
+++ b/tools/iris-test
@@ -0,0 +1,270 @@
+#!/usr/bin/env python3
+"""iris-test — drive the IRIS --ci control socket against a test spec.
+
+A test is a YAML file. Steps currently supported:
+
+    - type: serial
+      send: "rpm -ivh /tmp/grep.rpm\n"
+      expect: "complete"
+      timeout: 30
+
+    - type: sleep
+      seconds: 2
+
+See ci_mode_plan.md for the full schema. Output is a one-line summary per
+test (PASS/FAIL + elapsed), plus the captured serial output on failure.
+
+Usage:
+    tools/iris-test path/to/test.yaml
+
+Optional flags:
+    --socket PATH    Unix socket path (default: /tmp/iris.sock)
+    --verbose        Dump every RPC and raw serial output
+    --no-restore     Skip the restore step (useful when driving a
+                     pre-booted emulator manually)
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import socket
+import sys
+import time
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+
+# ----------------------------------------------------------------------------
+# YAML without PyYAML dependency — simple subset parser.
+# Supports the schema in ci_mode_plan.md. If the user has PyYAML installed we
+# prefer that; otherwise fall back to a minimal parser for our schema.
+# ----------------------------------------------------------------------------
+
+def load_yaml(path: Path) -> dict[str, Any]:
+    try:
+        import yaml  # type: ignore
+        with path.open() as f:
+            return yaml.safe_load(f)
+    except ImportError:
+        return _mini_yaml(path.read_text())
+
+
+def _mini_yaml(text: str) -> dict[str, Any]:
+    """A tiny YAML subset: top-level scalars, a 'steps' list of dicts with
+    scalar values. No anchors, flows, or nested structures."""
+    root: dict[str, Any] = {}
+    steps: list[dict[str, Any]] = []
+    in_steps = False
+    current: dict[str, Any] | None = None
+
+    def coerce(v: str) -> Any:
+        if (v.startswith('"') and v.endswith('"')) or (v.startswith("'") and v.endswith("'")):
+            return v[1:-1].encode().decode("unicode_escape")
+        try:
+            if "." in v:
+                return float(v)
+            return int(v)
+        except ValueError:
+            return v
+
+    for raw in text.splitlines():
+        line = raw.rstrip()
+        if not line or line.lstrip().startswith("#"):
+            continue
+        if not line.startswith(" "):
+            if line == "steps:":
+                in_steps = True
+                continue
+            k, _, v = line.partition(":")
+            root[k.strip()] = coerce(v.strip())
+            continue
+        stripped = line.lstrip()
+        if stripped.startswith("- "):
+            if current is not None:
+                steps.append(current)
+            current = {}
+            stripped = stripped[2:]
+        if current is None:
+            continue
+        k, _, v = stripped.partition(":")
+        current[k.strip()] = coerce(v.strip())
+    if current is not None:
+        steps.append(current)
+    if in_steps:
+        root["steps"] = steps
+    return root
+
+
+# ----------------------------------------------------------------------------
+# Control-socket client.
+# ----------------------------------------------------------------------------
+
+class CiClient:
+    def __init__(self, path: str, verbose: bool = False):
+        self.verbose = verbose
+        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
+        try:
+            self.sock.connect(path)
+        except (FileNotFoundError, ConnectionRefusedError) as e:
+            raise SystemExit(
+                f"iris-test: no iris --ci listener at {path} ({e}).\n"
+                f"  Start one first, e.g. in another terminal:\n"
+                f"    ./target/release/iris --ci --ci-socket {path}"
+            )
+        self.f = self.sock.makefile("rw", buffering=1, newline="\n")
+
+    def rpc(self, cmd: str, **args: Any) -> dict[str, Any]:
+        req = {"cmd": cmd, "args": args}
+        self.f.write(json.dumps(req) + "\n")
+        self.f.flush()
+        line = self.f.readline()
+        if not line:
+            raise RuntimeError(f"socket EOF while waiting for {cmd!r}")
+        resp = json.loads(line)
+        if self.verbose:
+            print(f"  rpc {cmd} {args} -> {resp}", file=sys.stderr)
+        return resp
+
+    def close(self) -> None:
+        try:
+            self.sock.close()
+        except OSError:
+            pass
+
+
+# ----------------------------------------------------------------------------
+# Step runners.
+# ----------------------------------------------------------------------------
+
+@dataclass
+class StepResult:
+    ok: bool
+    message: str
+    elapsed_s: float
+    captured: str = ""
+
+
+def run_serial(client: CiClient, step: dict[str, Any]) -> StepResult:
+    start = time.monotonic()
+    send = step.get("send", "")
+    expect = step.get("expect")
+    timeout = int(step.get("timeout", 30))
+
+    if send:
+        resp = client.rpc("serial-send", data=send)
+        if not resp.get("ok"):
+            return StepResult(False, f"serial-send failed: {resp.get('error')}", time.monotonic() - start)
+
+    captured = ""
+    if expect:
+        resp = client.rpc("wait-serial", pattern=expect, timeout_ms=timeout * 1000)
+        if not resp.get("ok"):
+            # Drain whatever is there so it shows up in failure output.
+            drained = client.rpc("serial-read").get("data", "")
+            return StepResult(False, resp.get("error", "wait-serial failed"),
+                              time.monotonic() - start, captured=drained)
+        captured = resp.get("data", "")
+
+    return StepResult(True, f"matched {expect!r}" if expect else "sent", time.monotonic() - start, captured=captured)
+
+
+def run_sleep(_client: CiClient, step: dict[str, Any]) -> StepResult:
+    start = time.monotonic()
+    secs = float(step.get("seconds", 1))
+    time.sleep(secs)
+    return StepResult(True, f"slept {secs}s", time.monotonic() - start)
+
+
+def run_screenshot(client: CiClient, step: dict[str, Any]) -> StepResult:
+    start = time.monotonic()
+    path = step.get("path", "/tmp/iris-ci-screenshot.png")
+    min_bytes = int(step.get("min_bytes", 1024))  # sanity lower bound
+
+    resp = client.rpc("screenshot", path=path)
+    if not resp.get("ok"):
+        return StepResult(False, f"screenshot: {resp.get('error')}", time.monotonic() - start)
+    try:
+        st = Path(path).stat()
+    except OSError as e:
+        return StepResult(False, f"screenshot file missing: {e}", time.monotonic() - start)
+    if st.st_size < min_bytes:
+        return StepResult(False, f"screenshot too small ({st.st_size} bytes, need >={min_bytes})",
+                          time.monotonic() - start)
+    dims = resp.get("data", {})
+    w, h = dims.get("width"), dims.get("height")
+    return StepResult(True, f"screenshot {w}x{h}, {st.st_size} bytes -> {path}",
+                      time.monotonic() - start)
+
+
+STEP_RUNNERS = {
+    "serial": run_serial,
+    "sleep": run_sleep,
+    "screenshot": run_screenshot,
+}
+
+
+# ----------------------------------------------------------------------------
+# Main.
+# ----------------------------------------------------------------------------
+
+def run_test(spec: dict[str, Any], socket_path: str, verbose: bool, no_restore: bool) -> int:
+    name = spec.get("name", "unnamed")
+    snapshot = spec.get("snapshot")
+    steps = spec.get("steps", [])
+
+    overall_start = time.monotonic()
+    client = CiClient(socket_path, verbose=verbose)
+
+    try:
+        if not no_restore:
+            if not snapshot:
+                print(f"FAIL {name} (spec missing 'snapshot')", file=sys.stderr)
+                return 2
+            resp = client.rpc("restore", name=snapshot)
+            if not resp.get("ok"):
+                print(f"FAIL {name} (restore: {resp.get('error')})", file=sys.stderr)
+                return 2
+
+        for i, step in enumerate(steps):
+            kind = step.get("type", "")
+            runner = STEP_RUNNERS.get(kind)
+            if runner is None:
+                print(f"FAIL {name} (step {i}: unknown type {kind!r})", file=sys.stderr)
+                return 2
+            result = runner(client, step)
+            if not result.ok:
+                print(f"FAIL {name} step {i} ({kind}): {result.message}")
+                if result.captured:
+                    print("--- serial output ---")
+                    print(result.captured, end="")
+                    if not result.captured.endswith("\n"):
+                        print()
+                    print("--- end ---")
+                return 1
+            if verbose:
+                print(f"  step {i} ({kind}, {result.elapsed_s:.1f}s): {result.message}",
+                      file=sys.stderr)
+
+        total = time.monotonic() - overall_start
+        print(f"PASS {name} ({total:.1f}s)")
+        return 0
+    finally:
+        client.close()
+
+
+def main() -> int:
+    ap = argparse.ArgumentParser(description="Run an IRIS --ci test spec")
+    ap.add_argument("spec", type=Path)
+    ap.add_argument("--socket", default="/tmp/iris.sock")
+    ap.add_argument("--verbose", action="store_true")
+    ap.add_argument("--no-restore", action="store_true")
+    args = ap.parse_args()
+
+    spec = load_yaml(args.spec)
+    return run_test(spec, args.socket, args.verbose, args.no_restore)
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/tools/tests/prom-smoke.yaml b/tools/tests/prom-smoke.yaml
new file mode 100644
index 0000000..a90f3dc
--- /dev/null
+++ b/tools/tests/prom-smoke.yaml
@@ -0,0 +1,11 @@
+name: prom-smoke
+# Minimal sanity: cold-boot into PROM, wait for any banner output.
+# No snapshot — start the CPU fresh. Use --no-restore with iris-test.
+
+steps:
+  - type: sleep
+    seconds: 1
+
+  - type: serial
+    expect: "SGI"
+    timeout: 60
diff --git a/tools/tests/restore-smoke.yaml b/tools/tests/restore-smoke.yaml
new file mode 100644
index 0000000..8f3e93c
--- /dev/null
+++ b/tools/tests/restore-smoke.yaml
@@ -0,0 +1,15 @@
+name: restore-smoke
+# Loads the snapshot you captured interactively and verifies the guest is
+# alive: send a CR and wait for an echo. Works regardless of whether the
+# snapshot is in PROM, the maintenance menu, or an IRIX shell — any of
+# them will echo a newline or re-display their prompt.
+snapshot: working2
+
+steps:
+  - type: sleep
+    seconds: 1
+
+  - type: serial
+    send: "\r"
+    expect: "\n"
+    timeout: 10
diff --git a/tools/tests/screenshot-smoke.yaml b/tools/tests/screenshot-smoke.yaml
new file mode 100644
index 0000000..142407d
--- /dev/null
+++ b/tools/tests/screenshot-smoke.yaml
@@ -0,0 +1,17 @@
+name: screenshot-smoke
+# Restores the IRIX snapshot and captures a screenshot. Verifies that:
+#   1. restore succeeded (snapshot machinery works end-to-end including overlay)
+#   2. REX3 is alive and rendering in --ci mode (no --headless)
+#   3. the framebuffer has plausible content (file size sanity check)
+#
+# This is the "GUI mode smoke" — doesn't rely on serial console output.
+snapshot: working3
+
+steps:
+  # Give REX3's refresh thread a beat to composite a frame after restore.
+  - type: sleep
+    seconds: 2
+
+  - type: screenshot
+    path: /tmp/iris-screenshot.png
+    min_bytes: 10240
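For reference, the YAML subset that `tools/iris-test` accepts can be checked without the emulator. The sketch below mirrors the `_mini_yaml` fallback parser from the patch (slightly simplified — it always emits a `steps` key and skips the PyYAML path) and feeds it the restore-smoke spec; the `mini_yaml` name is illustrative, not part of the patch.

```python
# Standalone sketch of the _mini_yaml subset parser in tools/iris-test,
# exercised against the restore-smoke spec shown above.
def mini_yaml(text):
    root, steps, current = {}, [], None

    def coerce(v):
        # Quoted scalars get unicode-escape decoding, like the real parser.
        if len(v) >= 2 and v[0] == v[-1] and v[0] in "\"'":
            return v[1:-1].encode().decode("unicode_escape")
        try:
            return float(v) if "." in v else int(v)
        except ValueError:
            return v

    for raw in text.splitlines():
        line = raw.rstrip()
        if not line or line.lstrip().startswith("#"):
            continue
        if not line.startswith(" "):            # top-level key
            if line == "steps:":
                continue
            k, _, v = line.partition(":")
            root[k.strip()] = coerce(v.strip())
            continue
        stripped = line.lstrip()
        if stripped.startswith("- "):           # a new step dict begins
            if current is not None:
                steps.append(current)
            current = {}
            stripped = stripped[2:]
        if current is None:
            continue
        k, _, v = stripped.partition(":")
        current[k.strip()] = coerce(v.strip())

    if current is not None:
        steps.append(current)
    root["steps"] = steps
    return root


spec = mini_yaml(
    "name: restore-smoke\n"
    "snapshot: working2\n"
    "steps:\n"
    "  - type: sleep\n"
    "    seconds: 1\n"
    "  - type: serial\n"
    "    send: \"\\r\"\n"
    "    expect: \"\\n\"\n"
    "    timeout: 10\n"
)
```

After parsing, `spec["steps"]` holds two dicts with coerced scalar values (`seconds: 1` and `timeout: 10` become ints, the quoted `"\r"` becomes a real carriage return), which is exactly the shape `run_test` hands to the step runners.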