fix: persist validator+miner state across container restarts #323
Merged
Two compounding bugs were causing the first post-restart scoring round
to route 100% of emissions to RECYCLE_UID (UID 53):
1) docker-compose.{vali,miner}.yml only mounted ~/.bittensor/wallets,
not ~/.allways. state.db (rate_events, swap_outcomes,
pending_confirms for the validator; sent_cache + rate_posted flags
for the miner) lived in the container's writable layer and got
destroyed every time watchtower pulled a new image. Mount
./data/allways:/root/.allways so state survives container recreate.
2) Even after fixing the mount, a brand-new validator (or one whose
state.db was wiped) had no rate event visible at window_start, so
reconstruct_window_start_state returned an empty rates dict on the
first scoring pass — every miner read as 'no rate posted', no
crown was awarded, full pool recycled. Add bootstrap_miner_rates()
which reads current on-chain commitments at init and seeds one
anchor rate event per (hotkey, direction) at cursor. Mirrors what
event_watcher.initialize already does for active flags.
Tested: 411 tests pass. Lena confirmed only the wallets dir was
bind-mounted, so the writable layer was being destroyed on every
watchtower update.
anderdc
approved these changes
May 12, 2026
Why
The validator running on lena tonight lost its entire `state.db` when
watchtower pulled the new v1.0.2 image — the writable layer of the old
container was destroyed by `cleanup=true`. Result: the first post-restart
scoring round routed 100% of emissions to UID 53 (RECYCLE_UID).
We rely on SQLite persistence for `rate_events`, `swap_outcomes`,
and `pending_confirms`. The code was written assuming `~/.allways/`
survives restarts; the deployment config was defeating that.
What
1. Volume mount (deploy fix)
`docker-compose.{vali,miner}.yml` were only binding `~/.bittensor/wallets`.
Add `./data/allways:/root/.allways` so state.db (validator) and
sent_cache / rate-posted flags (miner) survive watchtower-driven
container recreates.
2. Rate bootstrap from chain (code fix)
Even with the mount in place, a brand-new validator or one whose
state.db was wiped would still hit the same bug on its first scoring
round — `reconstruct_window_start_state` had no rate visible at
`window_start`, so every miner read as "no rate posted" and the
entire pool recycled.
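The failure mode above can be illustrated with a minimal sketch. It is not the project's `reconstruct_window_start_state`; the event tuple shape, the `rates_at_window_start` helper, the hotkey and direction names, and the "at or before `window_start`" visibility rule are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical event shape: (block, hotkey, direction, rate).
RateEvent = Tuple[int, str, str, float]


def rates_at_window_start(
    events: List[RateEvent], window_start: int
) -> Dict[Tuple[str, str], float]:
    """Return the latest rate per (hotkey, direction) posted at or before window_start."""
    latest: Dict[Tuple[str, str], Tuple[int, float]] = {}
    for block, hotkey, direction, rate in events:
        if block > window_start:
            continue  # posted after the window opened, so not visible at window_start
        key = (hotkey, direction)
        if key not in latest or block >= latest[key][0]:
            latest[key] = (block, rate)
    return {key: rate for key, (_, rate) in latest.items()}


window_start = 1_000

# Wiped state.db: the only event was observed after the restart,
# which is after window_start, so the rates dict comes back empty.
fresh_events = [(1_050, "hk_a", "a_to_b", 0.0021)]
empty_rates = rates_at_window_start(fresh_events, window_start)

# With one anchor event seeded at window_start, the miner is visible again.
seeded_events = [(1_000, "hk_a", "a_to_b", 0.0021)] + fresh_events
seeded_rates = rates_at_window_start(seeded_events, window_start)
```

In this sketch an empty dict corresponds to every miner reading as "no rate posted", which is the condition that recycles the whole pool.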
Add `Validator.bootstrap_miner_rates()` (~30 lines) called once after
`event_watcher.initialize`. Reads current on-chain commitments and
seeds one anchor rate event per (hotkey, direction) at
`cursor = current_block - SCORING_WINDOW_BLOCKS`. Only inserts if no
event already exists at that cursor (idempotent across restarts).
Mirrors the active-flag anchoring `event_watcher.initialize` already
does, via a normal contract storage read (no archive node needed).
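A rough sketch of the seeding logic, under stated assumptions: the real method lives on `Validator` and reads commitments from the chain, whereas here they are passed in as a plain dict. The `rate_events` column layout, the `SCORING_WINDOW_BLOCKS` value, and the sample hotkeys and directions are hypothetical.

```python
import sqlite3
from typing import Dict, Tuple

SCORING_WINDOW_BLOCKS = 300  # placeholder value, not taken from the codebase


def bootstrap_miner_rates(
    conn: sqlite3.Connection,
    commitments: Dict[Tuple[str, str], float],
    current_block: int,
) -> int:
    """Seed one anchor rate event per (hotkey, direction) at the window cursor.

    Returns the number of rows inserted. Skips any (hotkey, direction) that
    already has an event at the cursor block, so repeated calls are no-ops.
    """
    cursor_block = current_block - SCORING_WINDOW_BLOCKS
    inserted = 0
    for (hotkey, direction), rate in commitments.items():
        exists = conn.execute(
            "SELECT 1 FROM rate_events "
            "WHERE hotkey = ? AND direction = ? AND block = ?",
            (hotkey, direction, cursor_block),
        ).fetchone()
        if exists is None:
            conn.execute(
                "INSERT INTO rate_events (hotkey, direction, block, rate) "
                "VALUES (?, ?, ?, ?)",
                (hotkey, direction, cursor_block, rate),
            )
            inserted += 1
    conn.commit()
    return inserted


# Demo against an in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rate_events (hotkey TEXT, direction TEXT, block INTEGER, rate REAL)"
)
commitments = {("hk_a", "a_to_b"): 0.0021, ("hk_a", "b_to_a"): 480.0}

first = bootstrap_miner_rates(conn, commitments, current_block=10_000)
second = bootstrap_miner_rates(conn, commitments, current_block=10_000)
row_count = conn.execute("SELECT COUNT(*) FROM rate_events").fetchone()[0]
```

The second call inserts nothing because each (hotkey, direction) already has an event at the same cursor block, which is the idempotency property described above.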
Operator action after merge
For existing deployments, after pulling and recreating:
```bash
mkdir -p ./data/allways # next to docker-compose.yml
docker compose up -d # picks up the new mount
```
Subsequent watchtower updates preserve state automatically.
Test plan