iframe-proxy

rdark · 2026-04-27T23:29:36Z

Partial fix for #335

bazel-diff 17.0.1..18.0.5 configured BazelModService.getModuleGraphJson() with stdout=CAPTURE, stderr=CAPTURE. In Process.kt, captureAll triggers ProcessBuilder.redirectErrorStream(true), physically merging stderr into stdout. The captured moduleGraphJson therefore contained bazel's INFO lines ("INFO: Invocation ID: ...", "Loading: 0 packages loaded", etc.) prepended to the actual JSON output.

PR #330 (shipped in v18.1.0) correctly switched stderr to SILENT but broke format compatibility: any CI pipeline re-using a base graph from 17.0.1..18.0.5 and a head graph from 18.1.0+ hits parseModuleGraph's try/catch on the polluted side, gets emptyMap(), and cascades into findChangedModules reporting every head-side module as "added". The downstream queryTargetsDependingOnModules then spawns thousands of bazel query rdeps(...) subprocesses.

Make parseModuleGraph tolerant: attempt the whole-string parse first, and on failure retry from the first '{' to end-of-string. On second failure fall through to the existing emptyMap() behaviour. Clean input continues to parse in one attempt with no extra allocation.

Tests:

ModuleGraphParserTest: positive case for stderr-prefixed input.
StderrPollutionRegressionTest: end-to-end regression guard covering parse, findChangedModules, and the CalculateImpactedTargetsInteractor dispatch path. Confirms the cross-version comparison now short-circuits to computeSimpleImpactedTargets instead of the rdeps fan-out.

bazel-diff 17.0.1..18.0.5 configured BazelModService.getModuleGraphJson() with stdout=CAPTURE, stderr=CAPTURE. In Process.kt, captureAll triggers ProcessBuilder.redirectErrorStream(true), physically merging stderr into stdout. The captured moduleGraphJson therefore contained bazel's INFO lines ("INFO: Invocation ID: ...", "Loading: 0 packages loaded", etc.) prepended to the actual JSON output. PR Tinder#330 (shipped in v18.1.0) correctly switched stderr to SILENT but broke format compatibility: any CI pipeline re-using a base graph from 17.0.1..18.0.5 and a head graph from 18.1.0+ hits parseModuleGraph's try/catch on the polluted side, gets emptyMap(), and cascades into findChangedModules reporting every head-side module as "added". The downstream queryTargetsDependingOnModules then spawns thousands of bazel query rdeps(...) subprocesses. Make parseModuleGraph tolerant: attempt the whole-string parse first, and on failure retry from the first '{' to end-of-string. On second failure fall through to the existing emptyMap() behaviour. Clean input continues to parse in one attempt with no extra allocation. Tests: - ModuleGraphParserTest: positive case for stderr-prefixed input. - StderrPollutionRegressionTest: end-to-end regression guard covering parse, findChangedModules, and the CalculateImpactedTargetsInteractor dispatch path. Confirms the cross-version comparison now short-circuits to computeSimpleImpactedTargets instead of the rdeps fan-out.

tinder-maxwellelliott

Thanks!

@ignore

* Reproducer for #335 fix #2: parse-asymmetry fallback to simple hash diff When the two module-graph JSON payloads passed to get-impacted-targets are not byte-equal but one of them fails to parse, findChangedModules(emptyMap, fullMap) reports every module in the populated graph as "added". The impacted set then explodes -- either fanning out into one rdeps subprocess per matched repo (the multi-hour fan-out in #335) or, when no BazelQueryService is bound, falling through to allTargets.keys so every hashed label is reported as impacted. Both outcomes are far worse than a per-target hash diff. PR #336 made the parser tolerant of stderr-polluted JSON, which covers the historical 18.0.x base graph case, but the asymmetry remains a real failure mode for genuinely unparseable input (corrupted base graphs, truncation, future serialization format changes). The reproducer adds two @ignore'd tests in CalculateImpactedTargetsInteractorTest covering both code paths that branch on changedModules.isNotEmpty(): execute() and executeWithDistances(). Both feed a non-JSON fromGraph and a parseable toGraph and assert that only the actually-hash-changed target appears in the impacted set. Today both tests fail (every target is reported); the @ignore'd state keeps CI green until fix #2 lands. Generated with Claude Code (https://claude.com/claude-code) * Fix #335 fix #2: fall back to per-target hash diff on parse asymmetry `bazel mod graph --output=json` always emits at least the root module, so a parsed module-graph map that is empty really does mean parse failure (a truncated/corrupted base graph from object storage, a future serialisation change, an old stderr-polluted 18.0.x capture from before #336). When the two JSON payloads are not byte-equal but exactly one side parses to empty, the historical behaviour was for `findChangedModules(emptyMap, fullMap)` to report every module on the parseable side as "added" and the impacted set exploded: * With a `BazelQueryService` bound, every "added" module fanned out into an `rdeps` query against its canonical repo(s). On the workspace in #335 that produced ~5,000 serial subprocesses and the run took multiple hours. * With no `BazelQueryService` bound (or one that errored), the failure-tolerant fallback path returned `allTargets.keys` -- every hashed label was reported as impacted, which on a large workspace defeats the point of running bazel-diff. The fix detects this asymmetry in `detectChangedModules` and returns an empty changed-modules set, so callers fall through to `computeSimpleImpactedTargets` (the per-target hash diff that would have run if no module graph had been supplied). A per-target hash diff is bounded by the size of the hash set; the module-fan-out path is not. The two regression tests added in the parent reproducer commit have their `@Ignore` annotations dropped and now pass: 21/21 tests in `CalculateImpactedTargetsInteractorTest` (19 pre-existing + 2 new). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdark force-pushed the fix/stderr-pollution-tolerant-parse branch from 65f988e to a3cb64f Compare April 27, 2026 23:33

tinder-maxwellelliott approved these changes May 13, 2026

View reviewed changes

tinder-maxwellelliott merged commit dc81936 into Tinder:master May 13, 2026
15 checks passed

This was referenced May 16, 2026

Reproducer for #335 fix #2: parse-asymmetry should fall back to simple hash diff #352

Closed

Fix #335 fix #2: parse-asymmetry fallback to per-target hash diff #353

Merged

Sunbelt Computer Software

PL/B Language Development and Support

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: tolerate stderr-polluted moduleGraphJson in ModuleGraphParser#336

fix: tolerate stderr-polluted moduleGraphJson in ModuleGraphParser#336
tinder-maxwellelliott merged 1 commit into
Tinder:masterfrom
rdark:fix/stderr-pollution-tolerant-parse

rdark commented Apr 27, 2026

Uh oh!

tinder-maxwellelliott left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Sunbelt Computer Software

PL/B Language Development and Support

Uh oh!

Conversation

rdark commented Apr 27, 2026

Uh oh!

tinder-maxwellelliott left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants