fix(validator): bound tree-sitter parse to prevent scoring-round DoS #1276

Open
anderdc wants to merge 1 commit into test from fix/tree-sitter-parser-timeout

anderdc (Collaborator) commented May 14, 2026

Summary

  • Sets parser.timeout_micros = 2_000_000 on every cached tree-sitter parser so adversarial PR file contents cannot hang the scoring round in C.
  • Wraps each PR in score_miner_prs with a try/except so one bad PR cannot abort the rest of a UID's scoring loop.

Why

gittensor/validator/utils/tree_sitter_scoring.py calls parser.parse(content.encode('utf-8')) with no timeout. Several bundled grammars in tree-sitter-language-pack==0.7.2 have error-recovery loops that, on adversarial input, spin forever in C while holding the GIL — including a known 16-byte TSX payload (tree-sitter-typescript#323). The try/except Exception already inside parse_code does not help here because a C-level hang never raises a Python exception.

The unguarded path runs synchronously inside the validator's scoring round:

get_rewards → evaluate_miners_pull_requests → score_miner_prs → score_pr
  → calculate_base_score_for_pr_files → calculate_token_score_from_file_changes
  → score_tree_diff → parse_code → parser.parse(...)   # hang point

A miner can open a PR containing a single pathological file in any repo listed in master_repositories.json; no merge is required, because OPEN PRs are scored. Every validator re-fetches that PR each round for roughly PR_LOOKBACK_DAYS=35 days and hangs, blocking weight setting.

What this PR does (and doesn't)

Setting parser.timeout_micros makes the parse call raise ValueError('Parsing failed') back into Python after 2s. The existing wrapper in parse_code already catches Exception and returns None, which score_tree_diff already handles (the file degrades to a tree-diff with empty signatures, score 0). The 2s limit is well above the millisecond-scale parse time of real files.
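The timeout-to-None path described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in tree_sitter_scoring.py: `FakeHangingParser` is a hypothetical stand-in that mimics a parser whose parse call fails once the timeout is exceeded, and `configure_parser` and the two-argument signature of `parse_code` are invented for the example.

```python
from typing import Any, Optional

PARSE_TIMEOUT_MICROS = 2_000_000  # 2 s, the value this PR sets


def configure_parser(parser: Any) -> Any:
    """Apply the parse timeout before the parser is cached (hypothetical helper)."""
    parser.timeout_micros = PARSE_TIMEOUT_MICROS
    return parser


def parse_code(parser: Any, content: str) -> Optional[Any]:
    """Return the parsed tree, or None if parsing raised (including on timeout)."""
    try:
        return parser.parse(content.encode("utf-8"))
    except Exception:
        # A timed-out parse surfaces as ValueError and lands here.
        return None


class FakeHangingParser:
    """Stand-in for a real parser: pretends every parse exceeds the timeout."""

    timeout_micros = 0

    def parse(self, source: bytes) -> Any:
        raise ValueError("Parsing failed")


parser = configure_parser(FakeHangingParser())
tree = parse_code(parser, "<a {{b:>c:d(e f)")  # None: the caller scores the file as 0
```

Because the failure arrives as an ordinary Python exception, no caller of parse_code needs to change; the None result follows the path score_tree_diff already handles.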

The per-PR try/except in score_miner_prs is defense-in-depth: any future exception in PR scoring degrades that PR to score 0 instead of aborting the rest of the loop.
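A minimal sketch of this per-PR isolation pattern, assuming a simplified loop: `score_prs_isolated`, `_demo_score`, and the string PR identifiers are hypothetical names for illustration and do not reflect the real signature of score_miner_prs.

```python
import logging
from typing import Callable, Dict, Iterable

logger = logging.getLogger(__name__)


def score_prs_isolated(
    pr_ids: Iterable[str], score_pr: Callable[[str], float]
) -> Dict[str, float]:
    """Score each PR independently; a failing PR gets 0 and the loop continues."""
    scores: Dict[str, float] = {}
    for pr_id in pr_ids:
        try:
            scores[pr_id] = score_pr(pr_id)
        except Exception:
            # Log with traceback, then degrade only this PR to score 0.
            logger.exception("Scoring failed for PR %s; assigning score 0", pr_id)
            scores[pr_id] = 0.0
    return scores


def _demo_score(pr_id: str) -> float:
    """Hypothetical scorer that fails for one specific PR."""
    if pr_id == "bad":
        raise RuntimeError("unexpected failure")
    return 1.0


result = score_prs_isolated(["a", "bad", "b"], _demo_score)
```

In this sketch the PRs after the failing one are still scored, which is the property the try/except is meant to guarantee.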

The scope is intentionally minimal and does not include subprocess isolation. One known timeout-immune class remains (ts_subtree_balance hangs, tree-sitter#4019, one known input on .scala); it requires an external wall-clock limit and can be addressed in a follow-up if it appears in the wild.
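For reference, one possible shape of such an external wall-clock limit is sketched below. It is not part of this PR: `run_with_wall_clock` is a hypothetical helper that runs a Python snippet in a child process using only the standard library, and a real follow-up would still need to decide how file contents and parse results cross the process boundary.

```python
import subprocess
import sys
from typing import Optional


def run_with_wall_clock(snippet: str, timeout_s: float) -> Optional[str]:
    """Run a Python snippet in a child process.

    Returns the child's stdout, or None if it timed out or exited non-zero.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # on expiry the child is killed and TimeoutExpired is raised
        )
    except subprocess.TimeoutExpired:
        return None
    if completed.returncode != 0:
        return None
    return completed.stdout.strip()


# An infinite loop stands in for a parse that never returns.
hung = run_with_wall_clock("while True: pass", timeout_s=0.5)
ok = run_with_wall_clock("print(1 + 1)", timeout_s=30)
```

The limit is enforced by the parent process, so it does not depend on the hung code ever returning control to Python. The trade-off is the cost of starting a process per call, which is one reason to defer it to a follow-up.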

Test plan

  • Existing test suite passes (uv run pytest tests/ → 753 passed).
  • Pre-commit (ruff, ruff-format, pyright) clean on touched files.
  • Smoke-tested the 16-byte TSX payload from tree-sitter-typescript#323:
    parse_code('<a {{b:>c:d(e f)', 'tsx') returns None in 2.00s (previously: infinite hang).
  • Verified parser.timeout_micros = 2000000 is set on cached parsers.

…ures

Sets parser.timeout_micros (2s) on every cached tree-sitter parser so
adversarial inputs cannot hang the scoring round in C, and wraps each
PR in score_miner_prs with a try/except so one bad PR cannot abort the
UID's scoring loop.

Without the timeout, parser.parse() can spin forever in tree-sitter's
error-recovery paths on inputs as small as 16 bytes, holding the GIL
and preventing the round from completing. The timeout makes the C code
raise ValueError, which the existing parse_code wrapper already catches
and converts to a None tree (handled as score=0 downstream).

xiao-xiao-mao (Bot) added the bug label May 14, 2026