fix: tolerate stderr-polluted moduleGraphJson in ModuleGraphParser by rdark · Pull Request #336 · Tinder/bazel-diff · GitHub
Skip to content

fix: tolerate stderr-polluted moduleGraphJson in ModuleGraphParser#336

Merged
tinder-maxwellelliott merged 1 commit into
Tinder:masterfrom
rdark:fix/stderr-pollution-tolerant-parse
May 13, 2026
Merged

fix: tolerate stderr-polluted moduleGraphJson in ModuleGraphParser#336
tinder-maxwellelliott merged 1 commit into
Tinder:masterfrom
rdark:fix/stderr-pollution-tolerant-parse

Conversation

@rdark

@rdark rdark commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

Partial fix for #335

bazel-diff 17.0.1..18.0.5 configured BazelModService.getModuleGraphJson() with stdout=CAPTURE, stderr=CAPTURE. In Process.kt, captureAll triggers ProcessBuilder.redirectErrorStream(true), physically merging stderr into stdout. The captured moduleGraphJson therefore contained bazel's INFO lines ("INFO: Invocation ID: ...", "Loading: 0 packages loaded", etc.) prepended to the actual JSON output.

PR #330 (shipped in v18.1.0) correctly switched stderr to SILENT but broke format compatibility: any CI pipeline re-using a base graph from 17.0.1..18.0.5 and a head graph from 18.1.0+ hits parseModuleGraph's try/catch on the polluted side, gets emptyMap(), and cascades into findChangedModules reporting every head-side module as "added". The downstream queryTargetsDependingOnModules then spawns thousands of bazel query rdeps(...) subprocesses.

Make parseModuleGraph tolerant: attempt the whole-string parse first, and on failure retry from the first '{' to end-of-string. On second failure fall through to the existing emptyMap() behaviour. Clean input continues to parse in one attempt with no extra allocation.

Tests:

  • ModuleGraphParserTest: positive case for stderr-prefixed input.
  • StderrPollutionRegressionTest: end-to-end regression guard covering parse, findChangedModules, and the CalculateImpactedTargetsInteractor dispatch path. Confirms the cross-version comparison now short-circuits to computeSimpleImpactedTargets instead of the rdeps fan-out.

bazel-diff 17.0.1..18.0.5 configured BazelModService.getModuleGraphJson()
with stdout=CAPTURE, stderr=CAPTURE. In Process.kt, captureAll triggers
ProcessBuilder.redirectErrorStream(true), physically merging stderr into
stdout. The captured moduleGraphJson therefore contained bazel's INFO
lines ("INFO: Invocation ID: ...", "Loading: 0 packages loaded", etc.)
prepended to the actual JSON output.

PR Tinder#330 (shipped in v18.1.0) correctly switched stderr to SILENT but
broke format compatibility: any CI pipeline re-using a base graph from
17.0.1..18.0.5 and a head graph from 18.1.0+ hits parseModuleGraph's
try/catch on the polluted side, gets emptyMap(), and cascades into
findChangedModules reporting every head-side module as "added". The
downstream queryTargetsDependingOnModules then spawns thousands of
bazel query rdeps(...) subprocesses.

Make parseModuleGraph tolerant: attempt the whole-string parse first,
and on failure retry from the first '{' to end-of-string. On second
failure fall through to the existing emptyMap() behaviour. Clean input
continues to parse in one attempt with no extra allocation.

Tests:
- ModuleGraphParserTest: positive case for stderr-prefixed input.
- StderrPollutionRegressionTest: end-to-end regression guard covering
  parse, findChangedModules, and the CalculateImpactedTargetsInteractor
  dispatch path. Confirms the cross-version comparison now short-circuits
  to computeSimpleImpactedTargets instead of the rdeps fan-out.
@rdark rdark force-pushed the fix/stderr-pollution-tolerant-parse branch from 65f988e to a3cb64f Compare April 27, 2026 23:33

@tinder-maxwellelliott tinder-maxwellelliott left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@tinder-maxwellelliott tinder-maxwellelliott merged commit dc81936 into Tinder:master May 13, 2026
15 checks passed
tinder-maxwellelliott added a commit that referenced this pull request May 18, 2026
* Reproducer for #335 fix #2: parse-asymmetry fallback to simple hash diff

When the two module-graph JSON payloads passed to get-impacted-targets are
not byte-equal but one of them fails to parse, findChangedModules(emptyMap,
fullMap) reports every module in the populated graph as "added". The
impacted set then explodes -- either fanning out into one rdeps subprocess
per matched repo (the multi-hour fan-out in #335) or, when no
BazelQueryService is bound, falling through to allTargets.keys so every
hashed label is reported as impacted.

Both outcomes are far worse than a per-target hash diff. PR #336 made the
parser tolerant of stderr-polluted JSON, which covers the historical
18.0.x base graph case, but the asymmetry remains a real failure mode for
genuinely unparseable input (corrupted base graphs, truncation, future
serialization format changes).

The reproducer adds two @ignore'd tests in CalculateImpactedTargetsInteractorTest
covering both code paths that branch on changedModules.isNotEmpty():
execute() and executeWithDistances(). Both feed a non-JSON fromGraph and a
parseable toGraph and assert that only the actually-hash-changed target
appears in the impacted set. Today both tests fail (every target is
reported); the @ignore'd state keeps CI green until fix #2 lands.

Generated with Claude Code (https://claude.com/claude-code)

* Fix #335 fix #2: fall back to per-target hash diff on parse asymmetry

`bazel mod graph --output=json` always emits at least the root module, so a
parsed module-graph map that is empty really does mean parse failure (a
truncated/corrupted base graph from object storage, a future serialisation
change, an old stderr-polluted 18.0.x capture from before #336). When the
two JSON payloads are not byte-equal but exactly one side parses to empty,
the historical behaviour was for `findChangedModules(emptyMap, fullMap)` to
report every module on the parseable side as "added" and the impacted set
exploded:

  * With a `BazelQueryService` bound, every "added" module fanned out into
    an `rdeps` query against its canonical repo(s). On the workspace in
    #335 that produced ~5,000 serial subprocesses and the run took
    multiple hours.
  * With no `BazelQueryService` bound (or one that errored), the
    failure-tolerant fallback path returned `allTargets.keys` -- every
    hashed label was reported as impacted, which on a large workspace
    defeats the point of running bazel-diff.

The fix detects this asymmetry in `detectChangedModules` and returns an
empty changed-modules set, so callers fall through to
`computeSimpleImpactedTargets` (the per-target hash diff that would have
run if no module graph had been supplied). A per-target hash diff is
bounded by the size of the hash set; the module-fan-out path is not.

The two regression tests added in the parent reproducer commit have their
`@Ignore` annotations dropped and now pass: 21/21 tests in
`CalculateImpactedTargetsInteractorTest` (19 pre-existing + 2 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants