Fix #335 fix #2: parse-asymmetry fallback to per-target hash diff #353

Merged
tinder-maxwellelliott merged 2 commits into master from claude/dazzling-rubin-3bbb87
May 18, 2026

@tinder-maxwellelliott
Collaborator

Summary

Implements the asymmetry-detection fallback described in #335 fix #2 and proposed in #352. Supersedes #352 — this branch contains the reproducer commit from #352 with the @Ignore annotations dropped, plus the fix that makes those tests pass.

The bug

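When the two module-graph JSON payloads passed to get-impacted-targets are not byte-equal but one of them fails to parse, CalculateImpactedTargetsInteractor.detectChangedModules computed findChangedModules(emptyMap, fullMap) and reported every module in the parseable graph as "added". The impacted set then exploded in one of two ways:

  • With a BazelQueryService bound, every "added" module fanned out into an rdeps query against its canonical repo(s). On the workspace in #335 that produced ~5,000 serial subprocesses and a multi-hour run.
  • With no BazelQueryService bound (or one that errored), the failure-tolerant fallback path returned allTargets.keys, so every hashed label was reported as impacted.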

Both outcomes are far worse than the per-target hash diff that would have run if no module graph had been supplied. #336 made the parser tolerant of stderr-polluted JSON, which covered the historical 18.0.x base graph case, but parse asymmetry remained a real failure mode for any genuinely unparseable input (corrupted base graphs pulled from object storage, truncation, a future bazel mod graph serialisation change).
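
To make the failure mode concrete, here is a minimal sketch of a module diff that behaves the way the description says findChangedModules does when one side is empty. The function name, the Map<String, String> graph model (module name to version), and the exact diff rules are assumptions for illustration, not the bazel-diff implementation.

```kotlin
// Hypothetical stand-in for findChangedModules. Each graph is modelled as a
// map from module name to version; the real representation may differ.
fun findChangedModulesSketch(
    from: Map<String, String>,
    to: Map<String, String>,
): Set<String> {
    // Modules present only on the "to" side count as added.
    val added = to.keys - from.keys
    // Modules present only on the "from" side count as removed.
    val removed = from.keys - to.keys
    // Modules on both sides count as changed when their versions differ.
    val versionChanged = from.keys.intersect(to.keys).filter { from[it] != to[it] }
    return added + removed + versionChanged
}
```

Under these assumptions, calling findChangedModulesSketch(emptyMap(), fullGraph) returns every key of fullGraph, because nothing on the empty side can match. That is the "every module added" result that triggers the fan-out.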

The fix

bazel mod graph --output=json always emits at least the root module, so a parsed module-graph map that is empty really does mean parse failure — not "no modules". In detectChangedModules, when the two JSON strings differ but exactly one parsed graph is empty, return an empty changed-modules set instead of feeding the asymmetric pair into findChangedModules. Callers (execute and executeWithDistances) then fall through to computeSimpleImpactedTargets / computeAllDistances — a per-target hash diff that is bounded by the size of the hash set.

A warning is logged so an operator can attribute a fall-through back to a specific bad payload.
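
The guard can be sketched roughly as follows. This is an illustration of the rule stated above, not the code in CalculateImpactedTargetsInteractor.kt: the parameter list, the injected parse and diff functions, the warn callback, and the warning text are all assumptions.

```kotlin
// Sketch only. The signature, the Map<String, String> graph model, and the
// warn callback are hypothetical; the real detectChangedModules may differ.
fun detectChangedModulesSketch(
    fromJson: String,
    toJson: String,
    parse: (String) -> Map<String, String>,
    diff: (Map<String, String>, Map<String, String>) -> Set<String>,
    warn: (String) -> Unit = { msg -> System.err.println(msg) },
): Set<String> {
    // Byte-equal payloads cannot describe different graphs.
    if (fromJson == toJson) return emptySet()

    val from = parse(fromJson)
    val to = parse(toJson)

    // A real graph always contains the root module, so "exactly one side is
    // empty" signals a parse failure rather than a genuine module change.
    if (from.isEmpty() != to.isEmpty()) {
        val badSide = if (from.isEmpty()) "from" else "to"
        warn("Module graph '$badSide' failed to parse; falling back to per-target hash diff")
        return emptySet()
    }
    return diff(from, to)
}
```

Returning an empty set, rather than throwing, lets the callers keep their existing branch on changedModules.isNotEmpty() and reach the hash-diff path without any new control flow.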

Tests

Two regression tests added (with @Ignore dropped relative to #352):

  • execute_parseAsymmetryFallsBackToSimpleHashDiff_regressionForIssue335Fix2 — exercises the default execute() path.
  • executeWithDistances_parseAsymmetryFallsBackToSimpleHashDiff_regressionForIssue335Fix2 — exercises the --depsFile path.

Both feed a fromModuleGraphJson with no { at all ("garbage-non-json-payload") so ModuleGraphParser.parseModuleGraph falls through to if (start < 0) return emptyMap(), paired with a toModuleGraphJson that parses cleanly to one or two modules and a hash pair where only //:changed actually changed. Both assert the impacted set is exactly {//:changed}.
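
The shape of the fixture and the expected result can be sketched as below. Both helpers are hypothetical stand-ins: the first mirrors only the start < 0 guard quoted above, and the second assumes a per-target hash diff that flags a label when its hash is new or differs, which may not match computeSimpleImpactedTargets exactly.

```kotlin
// Hypothetical stand-ins; names and types are illustrative, not the bazel-diff API.

// Mirrors the quoted guard: with no '{' anywhere there is nothing to parse,
// so the parser would return an empty map.
fun hasNoJsonObjectStart(raw: String): Boolean = raw.indexOf('{') < 0

// Per-target hash diff: a label is impacted when it is new on the "to" side
// or its hash differs from the "from" side. Work is bounded by the hash set.
fun simpleHashDiffSketch(from: Map<String, String>, to: Map<String, String>): Set<String> =
    to.filter { (label, hash) -> from[label] != hash }.keys
```

With from hashes of //:unchanged to h1 and //:changed to h2, and to hashes of //:unchanged to h1 and //:changed to h3, simpleHashDiffSketch yields exactly {//:changed}, which is the assertion both regression tests make.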

Verification

  • bazel test //cli:CalculateImpactedTargetsInteractorTest → PASSED (21/21: 19 pre-existing + 2 new regression tests).
  • Confirmed the two new tests catch the bug by temporarily reverting just CalculateImpactedTargetsInteractor.kt — both failed with the expected "every target reported" assertion error.

Test plan

🤖 Generated with Claude Code

tinder-maxwellelliott and others added 2 commits May 18, 2026 10:54
When the two module-graph JSON payloads passed to get-impacted-targets are
not byte-equal but one of them fails to parse, findChangedModules(emptyMap,
fullMap) reports every module in the populated graph as "added". The
impacted set then explodes -- either fanning out into one rdeps subprocess
per matched repo (the multi-hour fan-out in #335) or, when no
BazelQueryService is bound, falling through to allTargets.keys so every
hashed label is reported as impacted.

Both outcomes are far worse than a per-target hash diff. PR #336 made the
parser tolerant of stderr-polluted JSON, which covers the historical
18.0.x base graph case, but the asymmetry remains a real failure mode for
genuinely unparseable input (corrupted base graphs, truncation, future
serialization format changes).

The reproducer adds two @Ignore'd tests in CalculateImpactedTargetsInteractorTest
covering both code paths that branch on changedModules.isNotEmpty():
execute() and executeWithDistances(). Both feed a non-JSON fromGraph and a
parseable toGraph and assert that only the actually-hash-changed target
appears in the impacted set. Today both tests fail (every target is
reported); the @Ignore'd state keeps CI green until fix #2 lands.

Generated with Claude Code (https://claude.com/claude-code)

`bazel mod graph --output=json` always emits at least the root module, so a
parsed module-graph map that is empty really does mean parse failure (a
truncated/corrupted base graph from object storage, a future serialisation
change, an old stderr-polluted 18.0.x capture from before #336). When the
two JSON payloads are not byte-equal but exactly one side parses to empty,
the historical behaviour was for `findChangedModules(emptyMap, fullMap)` to
report every module on the parseable side as "added" and the impacted set
exploded:

  * With a `BazelQueryService` bound, every "added" module fanned out into
    an `rdeps` query against its canonical repo(s). On the workspace in
    #335 that produced ~5,000 serial subprocesses and the run took
    multiple hours.
  * With no `BazelQueryService` bound (or one that errored), the
    failure-tolerant fallback path returned `allTargets.keys` -- every
    hashed label was reported as impacted, which on a large workspace
    defeats the point of running bazel-diff.

The fix detects this asymmetry in `detectChangedModules` and returns an
empty changed-modules set, so callers fall through to
`computeSimpleImpactedTargets` (the per-target hash diff that would have
run if no module graph had been supplied). A per-target hash diff is
bounded by the size of the hash set; the module-fan-out path is not.

The two regression tests added in the parent reproducer commit have their
`@Ignore` annotations dropped and now pass: 21/21 tests in
`CalculateImpactedTargetsInteractorTest` (19 pre-existing + 2 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tinder-maxwellelliott tinder-maxwellelliott merged commit bafec20 into master May 18, 2026
15 checks passed
@tinder-maxwellelliott tinder-maxwellelliott deleted the claude/dazzling-rubin-3bbb87 branch May 18, 2026 16:46