Merge expressions into join during join reordering#98533
Conversation
1ccd131 to
7cba228
Compare
7cba228 to
19e7334
Compare
19e7334 to
16d7279
Compare
|
|
||
| SET enable_parallel_replicas = 0; | ||
| SET query_plan_join_swap_table = 'auto'; | ||
| SET query_plan_optimize_join_order_limit = 64; |
There was a problem hiding this comment.
Nice coverage for the enabled path. To reduce regression risk for the new switch, please add one companion case with query_plan_merge_expression_into_join = 0 on the same query shape (plan + result), so both enabled and disabled semantics are locked by tests.
There was a problem hiding this comment.
I re-checked the current 04001 test and still don't see a companion case with query_plan_merge_expression_into_join = 0 for the same query shape.
Since this PR introduces a switchable behavior, we should lock both branches (1 and 0) with plan+result checks to prevent regressions where the disabled path drifts silently.
c3d87b4 to
6e16799
Compare
f6bfae3 to
73d2ea8
Compare
286d053 to
430261c
Compare
| { | ||
| bool merge_expression_into_join = query_graph.context->optimization_settings.merge_expression_into_join; | ||
| auto * expr = typeid_cast<ExpressionStep *>(check->step.get()); | ||
| if (expr && check->children.size() == 1 && !expr->getExpression().hasArrayJoin() |
There was a problem hiding this comment.
I think checking that it has a single child is unnecessary and confusing, as ExpressionStep can only have one child, if I am not mistaken. Or do I miss something here?
| if (expr && check->children.size() == 1 && !expr->getExpression().hasArrayJoin() | |
| if (expr && !expr->getExpression().hasArrayJoin() |
There was a problem hiding this comment.
ExpressionStep is logically unary, but in this optimizer path we dereference node->children[0] and may run before all structural assumptions are guaranteed. Keeping node->children.size() == 1 here is a defensive guard against malformed/intermediate plan nodes and avoids an avoidable exception in this hot path.
| /// have the correct row count. When all post-join outputs are constants | ||
| /// (e.g. `__join_result_dummy`), residual_dag has no required inputs, | ||
| /// and the join would produce blocks with 0 columns | ||
| if (!left_dag.getOutputs().empty()) |
There was a problem hiding this comment.
If actions_after_join contains only COLUMN constants (e.g. SELECT 1 FROM t1 CROSS JOIN t2 after reorder), residual_dag.getRequiredColumnsNames() is empty and this fallback takes left_dag.getOutputs().front()->result_name. For that shape, the first output can be __join_result_dummy (a fromNone constant), which is not present in left/right input columns. Then TableJoin::setUsedColumns throws NOT_FOUND_COLUMN_IN_BLOCK instead of executing the query.
Please pick a guaranteed real input column (from left_dag.getInputs()/right_dag.getInputs() or child headers) rather than an arbitrary output node.
There was a problem hiding this comment.
This still looks live in the current code.
chooseJoinOrder now explicitly appends COLUMN constants (for example __join_result_dummy) to actions_after_join at optimizeJoin.cpp:1263-1267. In buildPhysicalJoinImpl, when residual_dag.getRequiredColumnsNames() is empty, the fallback still picks left_dag.getOutputs().front()->result_name (JoinStepLogical.cpp:1295-1297). Because used_expressions is seeded from actions_after_join first (JoinStepLogical.cpp:1207-1221), that front output can be the fromNone constant instead of a real input column.
Then TableJoin::setUsedColumns receives a name that is not present in either side and throws NOT_FOUND_COLUMN_IN_BLOCK.
Could we change this fallback to take a guaranteed real input from left_dag.getInputs() / right_dag.getInputs() (or child headers), and skip COLUMN nodes here?
|
The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix. |
|
The MSan stress test failure (MemorySanitizer: use-of-uninitialized-value, STID 4179-5154 or 4148-3044) is a known pre-existing issue unrelated to this PR. Fix: #102158 |
3658658 to
78fbe40
Compare
| if (expr && !expr->getExpression().hasArrayJoin() | ||
| && (isPassthroughActions(expr->getExpression()) || merge_expression_into_join)) | ||
| { | ||
| check = check->children[0]; |
There was a problem hiding this comment.
This branch unwraps ExpressionStep with check = check->children[0], but unlike addChildQueryGraph above it does not guard check->children.size() == 1 first.
In this optimizer path we can still see intermediate/malformed nodes, and operator[] here is unchecked. A zero-child ExpressionStep would trigger undefined behavior before we fail with a proper exception.
Please add the same defensive arity check used in addChildQueryGraph before dereferencing children[0].
|
📊 Cloud Performance Report ✅ AI verdict: This PR adds a new join-reordering optimization (query_plan_merge_expression_into_join) that merges Expression steps into JOIN steps and reworks how outer-join ON-clause predicates are pinned during reorder. Both flagged queries are ClickBench Q4 and Q15, which run against a single table with no joins, so the changed code path is never exercised by them. The 18% and 17% improvements are large and consistent in the measurements but cannot plausibly be caused by a join-only optimizer change, so they are downgraded as run-to-run variance rather than real PR effects. clickbenchFlagged queries (2 of 43)
q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on. tpch_adapted_1_official🟢 No significant changes Debug info
|
| /// Include COLUMN Const nodes (like `__join_result_dummy`) in actions_after_join. | ||
| /// These are `fromNone` and will be placed in left_dag, ensuring the left pre-join | ||
| /// always produces at least one output column (so blocks have correct row count). | ||
| if (mapped_it->second->type == ActionsDAG::ActionType::COLUMN) |
There was a problem hiding this comment.
This makes final COLUMN constants part of actions_after_join, so buildPhysicalJoinImpl treats them as left-side inputs (fromNone goes into left_dag). For RIGHT/FULL joins, unmatched rows are produced by default/null-extending the left side, so a query like SELECT 1 FROM l RIGHT JOIN r ... can return the default/null-extended value for the constant instead of evaluating 1 after the join.
Please keep final constants in the post-join projection/residual DAG and use a guaranteed real input column only for the row-count keeper column.
There was a problem hiding this comment.
If we do not put predicate into query_graph.outer_join_conditions it automatically goes into residual filters then
|
@groeneai check out fuzzer and stress test issues and create repro if it's new ones
|
|
@vdimir Triaged all three. All are pre-existing trunk failures, none caused by the expression-merge / join-reordering change, so no new repro is needed. Evidence is 30d CIDB (cross-PR + master) + the exact crash sites. 1. STID 1941-1bfa 2. STID 2508-2ef2 3. STID 0993-27f0 |
LLVM Coverage ReportChanged lines: Changed C/C++ lines covered: 269/279 (96.42%) · Uncovered code |

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
query_plan_merge_expression_into_joinsetting to allow merging Expression steps into JOIN steps during join reordering optimization. This enables join reordering across subqueries that wrap joins (e.g., when a JOIN is inside a subquery with computed columns), leading to better optimization of complex join trees.