Fix #335 fix #2: parse-asymmetry fallback to per-target hash diff by tinder-maxwellelliott · Pull Request #353 · Tinder/bazel-diff

tinder-maxwellelliott · 2026-05-18T15:03:26Z

Summary

Implements the asymmetry-detection fallback described in #335 fix #2 and proposed in #352. Supersedes #352 — this branch contains the reproducer commit from #352 with the @Ignore annotations dropped, plus the fix that makes those tests pass.

The bug

When the two module-graph JSON payloads passed to get-impacted-targets are not byte-equal but one of them fails to parse, CalculateImpactedTargetsInteractor.detectChangedModules computed findChangedModules(emptyMap, fullMap) and reported every module in the parseable graph as "added". The impacted set then exploded in one of two ways:

With a BazelQueryService bound, every "added" module fanned out into an rdeps query against its canonical repo(s). On the workspace in #335 ("Upgrading 18.0.x → 18.1.0 with a reused base graph triggers multi-hour queryTargetsDependingOnModules fan-out") this produced ~5,000 serial subprocesses and the run took multiple hours.
With no BazelQueryService bound (or one that errored), the failure-tolerant fallback returned allTargets.keys — every hashed label was reported as impacted, defeating the point of running bazel-diff.
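To make the failure mode concrete, here is a minimal sketch of a module diff, assuming a parsed graph can be modelled as a map from module key to version. The function name `findChangedModulesSketch` and the map shape are illustrative assumptions, not the real signature of `findChangedModules` in bazel-diff.

```kotlin
// Hypothetical, simplified stand-in for a module-graph diff.
// Assumption: a parsed graph is modelled as module key -> version.
fun findChangedModulesSketch(
    from: Map<String, String>,
    to: Map<String, String>
): Set<String> {
    // A module is "added or modified" when `from` has no entry for it
    // (from[key] is null) or records a different version.
    val addedOrModified = to.filter { (key, version) -> from[key] != version }.keys
    // A module is "removed" when it only exists in `from`.
    val removed = from.keys - to.keys
    return addedOrModified + removed
}
```

With an empty `from` map (the result of a failed parse) every key in `to` satisfies `from[key] != version`, so the whole parseable graph comes back as changed. That is the input that drove the rdeps fan-out or the `allTargets.keys` fallback described above.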

Both outcomes are far worse than the per-target hash diff that would have run if no module graph had been supplied. #336 made the parser tolerant of stderr-polluted JSON, which covered the historical 18.0.x base graph case, but parse asymmetry remained a real failure mode for any genuinely unparseable input (corrupted base graphs pulled from object storage, truncation, a future bazel mod graph serialisation change).

The fix

bazel mod graph --output=json always emits at least the root module, so a parsed module-graph map that is empty really does mean parse failure — not "no modules". In detectChangedModules, when the two JSON strings differ but exactly one parsed graph is empty, return an empty changed-modules set instead of feeding the asymmetric pair into findChangedModules. Callers (execute and executeWithDistances) then fall through to computeSimpleImpactedTargets / computeAllDistances — a per-target hash diff that is bounded by the size of the hash set.

A warning is logged so an operator can attribute a fall-through back to a specific bad payload.
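The guard can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in `CalculateImpactedTargetsInteractor.kt`: the name `detectChangedModulesSketch`, the injected `parse`, `diff`, and `warn` parameters, and the warning text are all hypothetical.

```kotlin
// Hypothetical sketch of the asymmetry guard. `parse`, `diff`, and `warn`
// are injected so the example stays self-contained; the real interactor
// calls its own parser, diff, and logger.
fun detectChangedModulesSketch(
    fromJson: String,
    toJson: String,
    parse: (String) -> Map<String, String>,
    diff: (Map<String, String>, Map<String, String>) -> Set<String>,
    warn: (String) -> Unit = { System.err.println(it) }
): Set<String> {
    // Byte-equal payloads cannot describe different module graphs.
    if (fromJson == toJson) return emptySet()

    val from = parse(fromJson)
    val to = parse(toJson)

    // Exactly one side parsed to empty: treat it as a parse failure, not as
    // "every module was added", and let the caller use the hash diff.
    if (from.isEmpty() != to.isEmpty()) {
        warn("Module graph parse asymmetry; falling back to per-target hash diff")
        return emptySet()
    }
    return diff(from, to)
}
```

Returning an empty set here is what lets the callers skip the module-driven path: they branch on whether the changed-modules set is non-empty, so an empty result routes them to the per-target hash diff.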

Tests

Two regression tests added (with @Ignore dropped relative to #352):

execute_parseAsymmetryFallsBackToSimpleHashDiff_regressionForIssue335Fix2 — exercises the default execute() path.
executeWithDistances_parseAsymmetryFallsBackToSimpleHashDiff_regressionForIssue335Fix2 — exercises the --depsFile path.

Both feed a fromModuleGraphJson with no { at all ("garbage-non-json-payload") so ModuleGraphParser.parseModuleGraph falls through to if (start < 0) return emptyMap(), paired with a toModuleGraphJson that parses cleanly to one or two modules and a hash pair where only //:changed actually changed. Both assert the impacted set is exactly {//:changed}.
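The shape of those fixtures can be sketched in plain Kotlin. Both helpers below are hypothetical: `hasJsonObjectStart` only mirrors the `start < 0` guard quoted above, and `simpleImpactedTargetsSketch` is a simplified per-target hash diff that assumes hashes are modelled as label to hash string. Neither is the project's actual parser or `computeSimpleImpactedTargets`.

```kotlin
// Mirrors the quoted guard: with no '{' in the payload there is no JSON
// object to extract, so a parser with that guard returns an empty map.
fun hasJsonObjectStart(payload: String): Boolean = payload.indexOf('{') >= 0

// Simplified per-target hash diff (assumption: label -> hash string).
// A label is impacted when it is new in `to` or its hash differs.
fun simpleImpactedTargetsSketch(
    from: Map<String, String>,
    to: Map<String, String>
): Set<String> = to.filter { (label, hash) -> from[label] != hash }.keys
```

For a hash pair where only `//:changed` has a different hash, this diff yields exactly that one label, which is the result both regression tests expect once the asymmetric module graphs are ignored. Its cost grows with the number of hashed labels, not with the number of modules in either graph.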

Verification

bazel test //cli:CalculateImpactedTargetsInteractorTest — PASSED (21/21: 19 pre-existing + 2 new regression tests).
Confirmed the two new tests catch the bug by temporarily reverting just CalculateImpactedTargetsInteractor.kt — both failed with the expected "every target reported" assertion error.

Test plan

bazel test //cli:CalculateImpactedTargetsInteractorTest passes with the fix landed and @Ignore removed.
Reverted the production change locally — both new tests fail with the documented assertion, confirming the regression tests guard the fix.
Close #352 ("Reproducer for #335 fix #2: parse-asymmetry should fall back to simple hash diff") in favour of this PR once merged.

🤖 Generated with Claude Code

When the two module-graph JSON payloads passed to get-impacted-targets are not byte-equal but one of them fails to parse, findChangedModules(emptyMap, fullMap) reports every module in the populated graph as "added". The impacted set then explodes -- either fanning out into one rdeps subprocess per matched repo (the multi-hour fan-out in #335) or, when no BazelQueryService is bound, falling through to allTargets.keys so every hashed label is reported as impacted. Both outcomes are far worse than a per-target hash diff.

PR #336 made the parser tolerant of stderr-polluted JSON, which covers the historical 18.0.x base graph case, but the asymmetry remains a real failure mode for genuinely unparseable input (corrupted base graphs, truncation, future serialization format changes).

The reproducer adds two @Ignore'd tests in CalculateImpactedTargetsInteractorTest covering both code paths that branch on changedModules.isNotEmpty(): execute() and executeWithDistances(). Both feed a non-JSON fromGraph and a parseable toGraph and assert that only the actually-hash-changed target appears in the impacted set. Today both tests fail (every target is reported); the @Ignore'd state keeps CI green until fix #2 lands.

Generated with Claude Code (https://claude.com/claude-code)

`bazel mod graph --output=json` always emits at least the root module, so a parsed module-graph map that is empty really does mean parse failure (a truncated/corrupted base graph from object storage, a future serialisation change, an old stderr-polluted 18.0.x capture from before #336).

When the two JSON payloads are not byte-equal but exactly one side parses to empty, the historical behaviour was for `findChangedModules(emptyMap, fullMap)` to report every module on the parseable side as "added" and the impacted set exploded:

* With a `BazelQueryService` bound, every "added" module fanned out into an `rdeps` query against its canonical repo(s). On the workspace in #335 that produced ~5,000 serial subprocesses and the run took multiple hours.
* With no `BazelQueryService` bound (or one that errored), the failure-tolerant fallback path returned `allTargets.keys` -- every hashed label was reported as impacted, which on a large workspace defeats the point of running bazel-diff.

The fix detects this asymmetry in `detectChangedModules` and returns an empty changed-modules set, so callers fall through to `computeSimpleImpactedTargets` (the per-target hash diff that would have run if no module graph had been supplied). A per-target hash diff is bounded by the size of the hash set; the module-fan-out path is not.

The two regression tests added in the parent reproducer commit have their `@Ignore` annotations dropped and now pass: 21/21 tests in `CalculateImpactedTargetsInteractorTest` (19 pre-existing + 2 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

tinder-maxwellelliott and others added 2 commits May 18, 2026 10:54

tinder-maxwellelliott merged commit bafec20 into master May 18, 2026
15 checks passed

tinder-maxwellelliott deleted the claude/dazzling-rubin-3bbb87 branch May 18, 2026 16:46

BrewTestBot mentioned this pull request May 18, 2026

bazel-diff 23.0.0 Homebrew/homebrew-core#283416

Merged

tinder-maxwellelliott merged 2 commits into master from claude/dazzling-rubin-3bbb87