Skip to content

fix(epoch_settler): replace print() with logging, add env-var config (M8)#6310

Open
waefrebeorn wants to merge 35 commits into
Scottcjn:mainfrom
waefrebeorn:fix-m8-epoch-settler-logging
Open

fix(epoch_settler): replace print() with logging, add env-var config (M8)#6310
waefrebeorn wants to merge 35 commits into
Scottcjn:mainfrom
waefrebeorn:fix-m8-epoch-settler-logging

Conversation

@waefrebeorn
Copy link
Copy Markdown

Summary

M8 — MED: auto_epoch_settler.py has critical daemon error handling flaws

Problem

The automatic epoch settlement daemon used print() for all output — in a background daemon, stdout/stderr may not be captured in systemd, Docker, or nohup deployments. Settlement failures were invisible to operators.

Additionally, NODE_URL and DB_PATH were hardcoded, requiring code edits for deployment.

Changes

  1. print() → logging — 14 print() calls replaced with logger calls at appropriate levels (info/warning/error). logger.exception() provides full stack traces on unexpected errors.

  2. Hardcoded paths → env vars — NODE_URL, DB_PATH, CHECK_INTERVAL, SLOTS_PER_EPOCH changed from constants to os.environ.get() with sensible defaults (RUSTCHAIN_NODE_URL, RUSTCHAIN_DB_PATH, RUSTCHAIN_SETTLE_INTERVAL, RUSTCHAIN_SLOTS_PER_EPOCH).

  3. Granular exception handling — split broad except Exception into:

    • requests.RequestException for network failures
    • sqlite3.Error for database errors
    • Exception as catch-all with full stack trace via logger.exception()
  4. Logging config — added logging.basicConfig with timestamps and module name, routed to stderr (syslog-compatible in systemd/journald).

Testing

  • Syntax verified (Python AST parse)
  • All existing behavior preserved — logging level matches print severity
  • Environment variables all have defaults, zero-config backward compatible

RTC Wallet for bounty: RTC17c0d21f04f6f65c1a85c0aeb5d4a305d57531096

Adds max_length parameter to _clean_string_field and caps all user input
fields in POST route handlers:

- /lock: sender_wallet(128), target_wallet(128), tx_hash(128), receipt_signature(256)
- /confirm: proof_ref(256), notes(1024)
- /release: release_tx(128), notes(1024)

Prevents storage of arbitrarily large strings in bridge_ledger DB.
…s + Row M error handling + Row T test gaps + Row E infrastructure
…(M8)

auto_epoch_settler.py had critical error handling flaws:

1. **print() instead of logging** — a background daemon using stdout
   for error messages means failures are invisible in systemd, Docker,
   or nohup deployments. Replaced all 14 print() calls with logger
   calls at appropriate levels (info/warning/error).

2. **Hardcoded paths** — NODE_URL, DB_PATH, CHECK_INTERVAL, and
   SLOTS_PER_EPOCH were constants. Changed to environment variables
   with sensible defaults (RUSTCHAIN_NODE_URL, RUSTCHAIN_DB_PATH,
   RUSTCHAIN_SETTLE_INTERVAL, RUSTCHAIN_SLOTS_PER_EPOCH) for
   deployment flexibility.

3. **Broad except swallowed specifics** — all Exception catches used
   print() with no structured data. Split into:
   - requests.RequestException for network failures
   - sqlite3.Error for database errors
   - Exception as catch-all with logger.exception for stack traces

4. **No logging config** — added logging.basicConfig in __main__
   with timestamps and module name. Routes to stderr (syslog-
   compatible in systemd).
@github-actions github-actions Bot added documentation Improvements or additions to documentation BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) node Node server related api API endpoint related size/L PR: 201-500 lines labels May 25, 2026
Copy link
Copy Markdown
Contributor

@jaxint jaxint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great work on this PR. 🚀

Copy link
Copy Markdown

@shadow88sky shadow88sky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the M8 epoch-settler logging/config change and the full submitted branch.

There is a blocking config-handling regression in node/auto_epoch_settler.py: the new integer environment variables are parsed at module import time with bare int(...) calls. If RUSTCHAIN_SETTLE_INTERVAL or RUSTCHAIN_SLOTS_PER_EPOCH is set to a non-integer value, the daemon crashes before logging.basicConfig(...) runs and before the main loop can catch/log anything.

Minimal reproductions on this branch:

RUSTCHAIN_SETTLE_INTERVAL=abc .venv/bin/python - <<PY
import importlib.util
spec = importlib.util.spec_from_file_location("auto_epoch_settler_under_test", "node/auto_epoch_settler.py")
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
PY
# ValueError: invalid literal for int() with base 10: abc

RUSTCHAIN_SLOTS_PER_EPOCH=bad .venv/bin/python - <<PY
import importlib.util
spec = importlib.util.spec_from_file_location("auto_epoch_settler_under_test", "node/auto_epoch_settler.py")
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
PY
# ValueError: invalid literal for int() with base 10: bad

That undercuts the stated goal of making this daemon deployment-configurable and daemon-friendly. I would validate env vars explicitly, log a clear error, and fall back or exit from __main__ in a controlled way.

Additional validation:

.venv/bin/python -m py_compile node/auto_epoch_settler.py
passed

git diff --check HEAD~35..HEAD
failed

git diff --check reports trailing whitespace in unrelated files (faucet.py, node/machine_passport_api.py). The branch also changes 20 files and adds tools/comment-moderation-bot/test_logs/moderation_audit_2026-05-25.jsonl, which is unrelated to the M8 epoch-settler logging/config change. I recommend splitting the branch down to node/auto_epoch_settler.py plus focused tests for valid/invalid env config.

I received RTC compensation for this review.

@waefrebeorn
Copy link
Copy Markdown
Author

Will replace this with a clean branch (no unrelated files, trailing whitespace fixed)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api API endpoint related BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) documentation Improvements or additions to documentation node Node server related size/L PR: 201-500 lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants