Fix join reorder dropping outer-join rows in OUTER JOIN + comma queries#101684
Conversation
|
cc @vdimir @novikd — could you review this? This fixes a crash in @vdimir this is particularly relevant since the regression was introduced by PR #99096 (recursive join optimization in |
|
Could someone add the |
|
@alexey-milovidov — thanks for labeling #101683! Could you also add This one fixes the recurring "Cannot fold actions for projection" crash (STID 3673-56a1) — a regression from #99096. Hit rate accelerated after #100643 (join order randomize) merged: 7 hits in the last 7 days (4 on Apr 5 alone), all stress tests across unrelated PRs. Root cause: #101690 fixes the "ColumnUnique can't contain null values" crash (STID 1503-2bec) — a 1-line guard in |
|
Workflow [PR], commit [65627b3] Summary: ✅
AI ReviewSummaryThis PR fixes a real Final VerdictStatus: ✅ Approve |
Fix for Fast test failuresThe previous commit removed the Updated fix (596beca): The Cross entry is now set conditionally — only when Pre-PR validation (session cron:clickhouse-ci-task-worker:20260407-134500)a) Deterministic repro: The original crash requires stress test + randomized join orders. Crash mechanism confirmed via static analysis: b) Root cause: When c) Fix matches root cause: The conditional check prevents the overwrite when a nested OUTER constraint already exists at index 0, while preserving Cross semantics for all other cases. d) Test intent preserved: All previously failing Fast tests pass:
e) Demonstrated in both directions: Cross join tests fail without the conditional (old complete removal), pass with conditional. OUTER+COMMA regression test passes with conditional (and would crash on trunk without any fix). f) Fix scope: Minimal 3-line change — single conditional guard. No sibling code paths affected (the |
| query_graph.join_kinds[0] = QueryGraph::OuterJoinRestriction{BitSet{}, BitSet{}, JoinKind::Cross}; | ||
| { | ||
| if (query_graph.join_kinds.find(0) == query_graph.join_kinds.end()) | ||
| query_graph.join_kinds[0] = QueryGraph::OuterJoinRestriction{BitSet{}, BitSet{}, JoinKind::Cross}; |
There was a problem hiding this comment.
Shouldn't we use different index than 0 here then?
There was a problem hiding this comment.
Yes. Keeping the marker on 0 only works when relation 0 has no other restriction, but the failing shape is exactly the case where relation 0 already carries the RIGHT/LEFT restriction. For t1 RIGHT JOIN t2 ON ... , t3, this branch skips the Cross marker, the optimizer first builds {0,1} as the required outer join, and the remaining unconnected join {0,1} x {2} is then seen by isValidJoinOrder as Inner because the left mask is no longer a single relation. buildPhysicalJoinImpl turns that into a constant-key inner join instead of a Cross join, so the Comma/Cross semantics that this patch is trying to preserve are still lost for the nested-outer case.
The Cross marker needs to be stored on a relation that is not already carrying an outer restriction, or represented separately from join_kinds, so preserving the outer constraint does not drop the surrounding Cross join kind.
A query that comma-joins (cross) an outer join with another table, e.g. `t1 RIGHT JOIN t2 ON t1.c = t2.c, t3`, could silently drop the unmatched rows of the RIGHT/LEFT join when join reordering was enabled (the default greedy algorithm). It returned the inner-join result instead of the outer-join one. Root cause: buildQueryGraph stored a Cross marker in join_kinds[0] for every Cross/Comma join. join_kinds is otherwise reserved for the OuterJoinRestriction of a null-supplying relation. When the left side of the cross was itself an outer join, this marker overwrote the relation-0 RIGHT/LEFT restriction set a few lines later. With the restriction gone, isValidJoinOrder returned Cross for the pair, which solveGreedy then downgraded to Inner because the pair was connected, turning the RIGHT/LEFT join into an INNER join and dropping the null-supplied rows. This is the same code path that previously produced the "Cannot fold actions for projection" logical error; that crash was fixed by ClickHouse#102516 (typeChangingSides guard), but the silent wrong-result for the non-type-changing case remained. Fix: stop multiplexing the cartesian signal into join_kinds. The Cross vs Inner distinction is already derivable from edge connectivity: a pair with no connecting predicate is a cross join. solveGreedy now sets Cross for an unconnected Inner pair instead of relying on the stored marker, so the outer restriction on relation 0 is preserved and Cross/Comma semantics are kept. Regression cases added to 04277_join_reorder_constraints. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1cedaa1 to
1564d50
Compare
|
Reworked per the review. The previous conditional approach was a half-fix: when relation 0 already carried a This version stops storing the The crash this PR originally targeted is already handled by #102516 ( Pre-PR validation gate
Session id: cron:clickhouse-worker-slot-0:20260610-072600 |
|
@vdimir ready for another look — the marker is no longer stored on any relation, so it can't collide with the outer-join restriction. Thanks! |
The cartesian (implicit comma) join in Case 13a is now labeled Type: CROSS in EXPLAIN actions output instead of Type: INNER with a synthetic Clauses: [(__lhs_const) = (__rhs_const)] line. This follows from deriving cartesian-ness from edge connectivity in solveGreedy rather than storing a Cross marker in join_kinds. Data results are unchanged; only the plan label differs. Drop the obsolete synthetic-clause line from the reference. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Pushed Root cause: the test was not modified by this PR, but its Verified locally (debug,
|
|
CI finished on
The only PR-caused failure was the earlier Session: cron:our-pr-ci-monitor:20260610-143000 |
|
CI fully finished on HEAD 6c0a93a (the 2026-06-10 master-merge). Finish Workflow passed; all checks completed. Two failures, both pre-existing and unrelated to this PR's join-reorder diff (
|
we cannot merge with red ci |
…r-comma-constraint
Thanks @vdimir. Both red checks on the previous head were pre-existing flaky failures unrelated to this diff (which only touches
I merged current master (65627b3) to re-trigger CI. Will confirm once it is green. |
LLVM Coverage Report
Changed lines: Changed C/C++ lines covered by tests: 10/10 (100.00%) | Lost baseline coverage: none · Uncovered code |

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix join reordering silently dropping unmatched rows of a
RIGHT/LEFT JOINwhen it is comma-joined (cross) with another table, e.g.t1 RIGHT JOIN t2 ON t1.c = t2.c, t3. The query previously returned the inner-join result instead of the outer-join one.Motivation
buildQueryGraphstored aCrossmarker injoin_kinds[0]for every Cross/Comma join.join_kindsis otherwise reserved for theOuterJoinRestrictionof a null-supplying relation. When the left side of the cross was itself an outer join, this marker overwrote the relation-0RIGHT/LEFTrestriction. With the restriction gone,isValidJoinOrderreturnedCrossfor the pair, whichsolveGreedydowngraded toInner(the pair is connected), turning the outer join into an inner join and dropping the null-supplied rows.This is the same code path that produced the
Cannot fold actions for projectionlogical error (STID 3673-56a1). The crash was fixed by #102516 (thetypeChangingSidesguard), but the silent wrong-result for the non-type-changing case remained.Approach
Stop multiplexing the cartesian signal into
join_kinds. Cross vs Inner is already derivable from edge connectivity: a pair with no connecting predicate is a cross join.solveGreedynow assignsCrossto an unconnectedInnerpair instead of relying on the stored marker, so the outer restriction on relation 0 survives and Cross/Comma semantics are preserved.Addresses @vdimir's review: the
Crosskind is no longer stored on relation 0 (or any relation), so it never collides with an outer-join restriction.Closes: #103045
Test plan
04277_join_reorder_constraints(RIGHT/LEFT JOIN + comma). They fail on master HEAD and pass with this change.03604,03760,03822,04075,04095,04277,04303).00202_cross_join,01109_inflating_cross_join,03595,03302,03701,01236_graphite_mt.Documentation entry for user-facing changes
Version info
26.6.1.682