fix(epoch_settler): replace print() with logging, add env-var config (M8)#6310
fix(epoch_settler): replace print() with logging, add env-var config (M8)#6310waefrebeorn wants to merge 35 commits into
Conversation
Adds max_length parameter to _clean_string_field and caps all user input fields in POST route handlers: - /lock: sender_wallet(128), target_wallet(128), tx_hash(128), receipt_signature(256) - /confirm: proof_ref(256), notes(1024) - /release: release_tx(128), notes(1024) Prevents storage of arbitrarily large strings in bridge_ledger DB.
…s + Row M error handling + Row T test gaps + Row E infrastructure
…(M8) auto_epoch_settler.py had critical error handling flaws: 1. **print() instead of logging** — a background daemon using stdout for error messages means failures are invisible in systemd, Docker, or nohup deployments. Replaced all 14 print() calls with logger calls at appropriate levels (info/warning/error). 2. **Hardcoded paths** — NODE_URL, DB_PATH, CHECK_INTERVAL, and SLOTS_PER_EPOCH were constants. Changed to environment variables with sensible defaults (RUSTCHAIN_NODE_URL, RUSTCHAIN_DB_PATH, RUSTCHAIN_SETTLE_INTERVAL, RUSTCHAIN_SLOTS_PER_EPOCH) for deployment flexibility. 3. **Broad except swallowed specifics** — all Exception catches used print() with no structured data. Split into: - requests.RequestException for network failures - sqlite3.Error for database errors - Exception as catch-all with logger.exception for stack traces 4. **No logging config** — added logging.basicConfig in __main__ with timestamps and module name. Routes to stderr (syslog- compatible in systemd).
jaxint
left a comment
There was a problem hiding this comment.
LGTM! Great work on this PR. 🚀
shadow88sky
left a comment
There was a problem hiding this comment.
Reviewed the M8 epoch-settler logging/config change and the full submitted branch.
There is a blocking config-handling regression in node/auto_epoch_settler.py: the new integer environment variables are parsed at module import time with bare int(...) calls. If RUSTCHAIN_SETTLE_INTERVAL or RUSTCHAIN_SLOTS_PER_EPOCH is set to a non-integer value, the daemon crashes before logging.basicConfig(...) runs and before the main loop can catch/log anything.
Minimal reproductions on this branch:
RUSTCHAIN_SETTLE_INTERVAL=abc .venv/bin/python - <<PY
import importlib.util
spec = importlib.util.spec_from_file_location("auto_epoch_settler_under_test", "node/auto_epoch_settler.py")
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
PY
# ValueError: invalid literal for int() with base 10: abc
RUSTCHAIN_SLOTS_PER_EPOCH=bad .venv/bin/python - <<PY
import importlib.util
spec = importlib.util.spec_from_file_location("auto_epoch_settler_under_test", "node/auto_epoch_settler.py")
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
PY
# ValueError: invalid literal for int() with base 10: bad
That undercuts the stated goal of making this daemon deployment-configurable and daemon-friendly. I would validate env vars explicitly, log a clear error, and fall back or exit from __main__ in a controlled way.
Additional validation:
.venv/bin/python -m py_compile node/auto_epoch_settler.py
passed
git diff --check HEAD~35..HEAD
failed
git diff --check reports trailing whitespace in unrelated files (faucet.py, node/machine_passport_api.py). The branch also changes 20 files and adds tools/comment-moderation-bot/test_logs/moderation_audit_2026-05-25.jsonl, which is unrelated to the M8 epoch-settler logging/config change. I recommend splitting the branch down to node/auto_epoch_settler.py plus focused tests for valid/invalid env config.
I received RTC compensation for this review.
|
Will replace this with a clean branch (no unrelated files, trailing whitespace fixed) |
Summary
M8 — MED: auto_epoch_settler.py has critical daemon error handling flaws
Problem
The automatic epoch settlement daemon used
print()for all output — in a background daemon, stdout/stderr may not be captured in systemd, Docker, or nohup deployments. Settlement failures were invisible to operators.Additionally, NODE_URL and DB_PATH were hardcoded, requiring code edits for deployment.
Changes
print() → logging — 14 print() calls replaced with logger calls at appropriate levels (info/warning/error). logger.exception() provides full stack traces on unexpected errors.
Hardcoded paths → env vars — NODE_URL, DB_PATH, CHECK_INTERVAL, SLOTS_PER_EPOCH changed from constants to os.environ.get() with sensible defaults (RUSTCHAIN_NODE_URL, RUSTCHAIN_DB_PATH, RUSTCHAIN_SETTLE_INTERVAL, RUSTCHAIN_SLOTS_PER_EPOCH).
Granular exception handling — split broad
except Exceptioninto:requests.RequestExceptionfor network failuressqlite3.Errorfor database errorsExceptionas catch-all with full stack trace via logger.exception()Logging config — added logging.basicConfig with timestamps and module name, routed to stderr (syslog-compatible in systemd/journald).
Testing
RTC Wallet for bounty:
RTC17c0d21f04f6f65c1a85c0aeb5d4a305d57531096