Conversation
🤖 Kimi Code ReviewReview SummaryThis PR reorganizes the devnet-runner documentation by splitting large files into focused references and adding long-lived devnet capabilities. The changes are primarily documentation reorganization with one minor script improvement. Issues Found
Positive Changes
Suggestions
Automated review by Kimi (Moonshot AI) · custom prompt |
🤖 Codex Code Review
No Rust consensus-path code ( Automated review by OpenAI Codex · custom prompt |
🤖 Claude Code ReviewI have all the information I need from the diff. Here is the review: PR #226 Review:
|
Greptile SummaryThis PR reorganises the Key observations:
Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| .claude/skills/devnet-runner/SKILL.md | Primary skill file refactored to reduce verbosity by moving detailed content (validator schema, client tables, checkpoint sync procedure) into dedicated reference files; description updated to enumerate more trigger phrases for skill activation. |
| .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh | Added explicit container cleanup loop after killing spin-node.sh to handle cases where spin-node.sh doesn't remove containers on signal; minor inaccuracy in the "Stopping containers..." status message (docker rm -f stops and removes). |
| .claude/skills/devnet-runner/references/long-lived-devnet.md | New reference file for persistent detached devnets and rolling restarts; the docker run example for multi-node ethlambda deployments omits --api-port, which causes port 5052 conflicts when running multiple nodes on the same host. |
| .claude/skills/devnet-runner/references/validator-config.md | New reference file extracted from SKILL.md; contains full validator-config.yaml schema, field reference table, add/remove node procedures, port allocation guide, and local vs ansible deployment comparison — all accurate and consistent with the rest of the skill. |
| .claude/skills/devnet-runner/references/clients.md | Per-client configuration notes removed and replaced with a pointer to client-cmds scripts; retained Docker image table, port defaults, known compatibility issues, and environment variable reference — well-organised. |
| .claude/skills/devnet-runner/references/checkpoint-sync.md | Deleted; the spin-node.sh --restart-client procedure it documented is no longer covered anywhere in the skill — local devnet checkpoint-sync restarts are now undocumented. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A([User Request]) --> B{Devnet type?}
B -->|Short local test| C[Edit validator-config.yaml]
B -->|Long-lived / remote| D[Generate genesis via spin-node.sh]
C --> E[Update client image tags]
E --> F[run-devnet-with-timeout.sh <seconds>]
F --> G[spin-node.sh --generateGenesis runs in background]
G --> H[sleep <seconds>]
H --> I[Dump logs to *.log files]
I --> J[kill spin-node.sh]
J --> K[docker rm -f remaining containers]
D --> L[docker run -d --restart unless-stopped\nper node with explicit ports]
L --> M{Upgrade needed?}
M -->|No| N([Devnet running persistently])
M -->|Yes| O[docker pull new image]
O --> P[docker rm -f node]
P --> Q[sleep 60s — gossipsub backoff]
Q --> R[docker run -d new image\n+ --checkpoint-sync-url]
R --> S{Verify gossip\n+ finalization}
S -->|Pass| T{More nodes\nto restart?}
S -->|Fail| Q
T -->|Yes| O
T -->|No| N
Prompt To Fix All With AI
This is a comment left during a code review.
Path: .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh
Line: 37
Comment:
**Misleading status message**
The message says "Stopping containers..." but `docker rm -f` both stops and removes containers. A more accurate message avoids confusion when reading script output.
```suggestion
echo "Stopping and removing containers..."
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: .claude/skills/devnet-runner/references/long-lived-devnet.md
Line: 50-58
Comment:
**Missing `--api-port` causes port conflicts for multi-node runs**
The example `docker run` command omits `--api-port`. When multiple ethlambda nodes run on the same host (as shown with `ethlambda_0` through `ethlambda_3`), they all default to `--api-port 5052`. Only one node can bind to that port; the rest fail silently or refuse connections.
This directly affects the rolling restart procedure, where `CHECKPOINT_SOURCE_PORT=5052` is used as the checkpoint URL port — if only one node is actually on 5052, the choice of source node is constrained but undocumented.
Consider adding `--api-port` explicitly to the example, matching the per-node port assignment table in `validator-config.md` (e.g., ethlambda_0 → 5052, ethlambda_1 → 5053, etc.):
```bash
docker run -d --restart unless-stopped --name ethlambda_0 --network host \
-v $GENESIS:/config -v $DATA/ethlambda_0:/data \
$IMAGE \
--custom-network-config-dir /config \
--gossipsub-port 9001 --node-id ethlambda_0 \
--node-key /config/ethlambda_0.key \
--api-port 5052 \
--metrics-address 0.0.0.0 --metrics-port 8081
# ethlambda_1
docker run -d --restart unless-stopped --name ethlambda_1 --network host \
-v $GENESIS:/config -v $DATA/ethlambda_1:/data \
$IMAGE \
...
--api-port 5053 \
--metrics-port 8082
```
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 9eec519
| wait $PID 2>/dev/null | ||
|
|
||
| # Explicitly stop and remove containers (spin-node.sh may not clean up on kill) | ||
| echo "Stopping containers..." |
There was a problem hiding this comment.
Misleading status message
The message says "Stopping containers..." but docker rm -f both stops and removes containers. A more accurate message avoids confusion when reading script output.
| echo "Stopping containers..." | |
| echo "Stopping and removing containers..." |
Prompt To Fix With AI
This is a comment left during a code review.
Path: .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh
Line: 37
Comment:
**Misleading status message**
The message says "Stopping containers..." but `docker rm -f` both stops and removes containers. A more accurate message avoids confusion when reading script output.
```suggestion
echo "Stopping and removing containers..."
```
How can I resolve this? If you propose a fix, please make it concise.| # Start each node (adjust ports, node-id, aggregator flag per validator-config.yaml) | ||
| docker run -d --restart unless-stopped --name ethlambda_0 --network host \ | ||
| -v $GENESIS:/config -v $DATA/ethlambda_0:/data \ | ||
| $IMAGE \ | ||
| --custom-network-config-dir /config \ | ||
| --gossipsub-port 9001 --node-id ethlambda_0 \ | ||
| --node-key /config/ethlambda_0.key \ | ||
| --metrics-address 0.0.0.0 --metrics-port 8081 | ||
|
|
There was a problem hiding this comment.
Missing --api-port causes port conflicts for multi-node runs
The example docker run command omits --api-port. When multiple ethlambda nodes run on the same host (as shown with ethlambda_0 through ethlambda_3), they all default to --api-port 5052. Only one node can bind to that port; the rest fail silently or refuse connections.
This directly affects the rolling restart procedure, where CHECKPOINT_SOURCE_PORT=5052 is used as the checkpoint URL port — if only one node is actually on 5052, the choice of source node is constrained but undocumented.
Consider adding --api-port explicitly to the example, matching the per-node port assignment table in validator-config.md (e.g., ethlambda_0 → 5052, ethlambda_1 → 5053, etc.):
docker run -d --restart unless-stopped --name ethlambda_0 --network host \
-v $GENESIS:/config -v $DATA/ethlambda_0:/data \
$IMAGE \
--custom-network-config-dir /config \
--gossipsub-port 9001 --node-id ethlambda_0 \
--node-key /config/ethlambda_0.key \
--api-port 5052 \
--metrics-address 0.0.0.0 --metrics-port 8081
# ethlambda_1
docker run -d --restart unless-stopped --name ethlambda_1 --network host \
-v $GENESIS:/config -v $DATA/ethlambda_1:/data \
$IMAGE \
...
--api-port 5053 \
--metrics-port 8082Prompt To Fix With AI
This is a comment left during a code review.
Path: .claude/skills/devnet-runner/references/long-lived-devnet.md
Line: 50-58
Comment:
**Missing `--api-port` causes port conflicts for multi-node runs**
The example `docker run` command omits `--api-port`. When multiple ethlambda nodes run on the same host (as shown with `ethlambda_0` through `ethlambda_3`), they all default to `--api-port 5052`. Only one node can bind to that port; the rest fail silently or refuse connections.
This directly affects the rolling restart procedure, where `CHECKPOINT_SOURCE_PORT=5052` is used as the checkpoint URL port — if only one node is actually on 5052, the choice of source node is constrained but undocumented.
Consider adding `--api-port` explicitly to the example, matching the per-node port assignment table in `validator-config.md` (e.g., ethlambda_0 → 5052, ethlambda_1 → 5053, etc.):
```bash
docker run -d --restart unless-stopped --name ethlambda_0 --network host \
-v $GENESIS:/config -v $DATA/ethlambda_0:/data \
$IMAGE \
--custom-network-config-dir /config \
--gossipsub-port 9001 --node-id ethlambda_0 \
--node-key /config/ethlambda_0.key \
--api-port 5052 \
--metrics-address 0.0.0.0 --metrics-port 8081
# ethlambda_1
docker run -d --restart unless-stopped --name ethlambda_1 --network host \
-v $GENESIS:/config -v $DATA/ethlambda_1:/data \
$IMAGE \
...
--api-port 5053 \
--metrics-port 8082
```
How can I resolve this? If you propose a fix, please make it concise.
This PR cleans up and improves the devnet-runner skill