{{ message }}
Cherry pick #105041 to 26.3: Fix LOGICAL_ERROR for reused materialized CTE filtered by IN over another materialized CTE#106637
Closed
robot-ch-test-poll4 wants to merge 35 commits into
Closed
Conversation
The optimization step is moved one level up, into a dedicated first traversal of resolveMaterializingCTEs introduced in the next commit. This lets the recursive buildSetInplace / buildOrderedSetInplace path win the inner safety-net DelayedMaterializingCTEsStep's claim before the outer step attempts its own claim. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first traversal optimizes every materialized CTE's pre-built plan and reports whether any DelayedMaterializingCTEsStep exists in the tree. The second traversal -- skipped entirely when the first saw none -- replaces DelayedMaterializingCTEsStep nodes with MaterializingCTEsStep, attaching plans for any CTEs not already claimed inplace. Why this fixes the LOGICAL_ERROR -------------------------------- Today makePlansForCTEs did three things atomically: it exchange'd is_materialization_planned, it called materialized_cte->plan->optimize, and it std::moved the plan into the synthesized MaterializingCTEsStep. The outer DelayedMaterializingCTEsStep[ct] in the main plan won the exchange before descending into [rs]; by the time rs.plan->optimize fired buildSetInplace via KeyCondition or VirtualColumnUtils, the inner safety-net DelayedMaterializingCTEsStep inside the IN-subquery's plan found ct.is_materialization_planned already true and degraded to a pass-through. The synchronously executed pipeline read ct.storage with is_built == false and threw. With two traversals, the optimize calls happen first across the entire tree, so the recursive buildSetInplace path gets to run makePlansForCTEs on the inner safety-net step before the outer step's claim is attempted. The outer step then sees the CTE already claimed and degrades to a harmless pass-through. For queries with no materialized CTEs at all (the common case), the first traversal walks the tree without doing any work, reports no DelayedMaterializingCTEsStep, and returns -- keeping the total work at one walk just like before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…er CTE Covers the LOGICAL_ERROR path that fired when the outer DelayedMaterializingCTEsStep claimed the inner CTE before buildSetInplace had a chance to materialize it synchronously for primary-key / virtual-column analysis. Three query shapes: ANY LEFT JOIN on PK, cross-join on PK, ANY LEFT JOIN with the IN filter on a non-PK column (exercising the VirtualColumnUtils path). EXPLAIN is also checked because the bug fired during plan optimization -- the test only asserts that EXPLAIN produced at least one row so the reference file is not pinned to a specific plan shape. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… Expressions The two-traversal refactor in resolveMaterializingCTEs shifts the timing of `materialized_cte->plan->optimize` such that mergeExpressions now successfully collapses two adjacent ExpressionSteps in the materialized CTE's local Union branch (Project names + Change column names | Projection + Change column names to column identifiers). Functional results are unchanged (count() = 3 twice). The merged ExpressionStep performs identical work in a single transform. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the two-traversal split, makePlansForCTEs no longer calls plan->optimize -- that happens in DelayedMaterializingCTEsStep::optimizePlans during the first traversal. Drop the now-unused QueryPlanOptimizationSettings argument from the public method and from the local addPlansForMaterializingCTEs helper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A small utility to splice out a top-level `DelayedMaterializingCTEsStep` from a plan, replacing it with its single original child. Used by the upcoming change to `DelayedCreatingSetsStep::makePlansForSets` to strip the safety-net materialization step from set plans being attached for runtime execution. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a set's plan is being attached for runtime construction (via DelayedCreatingSetsStep → CreatingSetsStep), the safety-net DelayedMaterializingCTEsStep planted by forceMaterializeCTE is no longer needed: at runtime the outer MaterializingCTEsStep in the main plan materializes the CTE before the CreatingSetsStep's pipeline runs. The strip is non-recursive on purpose. Nested set subquery plans (reached via DelayedCreatingSetsStep inside this plan) may still be consumed by buildSetInplace / buildOrderedSetInplace during the subsequent plan->optimize() and need their safety-nets to claim. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous order claimed materialized-CTE plans at the outer level during resolveMaterializingCTEs before addStepsToBuildSets had a chance to trigger synchronous buildSetInplace / buildOrderedSetInplace via the recursive plan->optimize. When that synchronous inplace path then ran on a deeply nested IN-subquery filtered by a MergeTree primary key, the safety-net DelayedMaterializingCTEsStep inside the inplace pipeline saw the CTE's plan already moved out and degraded to a pass-through, producing LOGICAL_ERROR while reading the still-empty StorageMemory. Swapping the order lets the inplace path claim the CTE via the safety-net first; resolveMaterializingCTEs then materializes only the CTEs that the inplace path did not already handle, with main's outer step degrading to a pass-through in those cases. Closes #102320. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sted IN Issue #102320: the CTE `a` is referenced once at top level and once inside a WHERE id IN (... WHERE id IN (...)) chain that hits a MergeTree primary key. The PK pruning on the outer IN-subquery triggers buildOrderedSetInplace for the inner IN-subquery, which used to read an empty StorageMemory because main's resolveMaterializingCTEs had already claimed the CTE before addStepsToBuildSets fired the inplace path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… CreatingSetStep wrapper Two fixes for cases the initial single-step strip missed: 1. `addBuildSubqueriesForMaterializedCTEsIfNeeded` stacks one `DelayedMaterializingCTEsStep` per CTE dependency level on top of a plan. An IN-subquery that transitively references a chain of CTEs gets multiple safety-net steps stacked, so stripping just the topmost one leaves the next one in place. The remaining safety-net claims its CTE in the wrong scope when the recursive `plan->optimize` below runs `resolveMaterializingCTEs`, attaching that CTE's plan inside the set subquery's `CreatingSetsStep` instead of at the outer `MaterializingCTEsStep` in the main plan. The fix loops while the top of the chain is `DelayedMaterializingCTEsStep`. 2. `DelayedCreatingSetsStep::makePlansForSets` calls `FutureSetFromSubquery::build` first, which wraps the source plan with `CreatingSetStep`. That makes the plan root a `CreatingSetStep` with the safety-net chain as its child, not at the root itself. The helper now peeks past one non-Delayed wrapper to reach the chain. Without (1), the third query in `03821_materialized_cte_with_sets` (`WITH b AS MATERIALIZED (SELECT uid FROM a, a), a AS MATERIALIZED ... SELECT count() FROM (SELECT * FROM b WHERE uid in b) ... LEFT SEMI JOIN a ...`) regressed because the main IN-subquery's safety-net stack was `Delayed[a] → Delayed[b] → body`: stripping only the top left `Delayed[b]` in place, and the recursive `plan->optimize` then claimed `b` inside the IN-subquery's `CreatingSetsStep`, leaving the join's right side reading `a.storage` before the outer step had a chance to materialize it. Without (2), the helper was a no-op for any plan reached via `makePlansForSets` (since `build` always wraps with `CreatingSetStep`), which is the only call site for it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lets DelayedMaterializingCTEsStep::optimizePlans dedup the plan->optimize(...) call across multiple DelayedMaterializingCTEsStep instances that share the same MaterializedCTE pointer (e.g. UNION branches plus the UNION-level step). Logically parallel to the existing is_materialization_planned flag, but for the optimize stage rather than the claim/move stage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this commit, `DelayedMaterializingCTEsStep::optimizePlans` (pass 1 of `resolveMaterializingCTEs`) called `cte->plan->optimize(...)` once per Delayed step that referenced the CTE. For shapes where multiple Delayed steps share one `MaterializedCTE` pointer -- e.g. UNION/INTERSECT/EXCEPT branches plus the UNION-level step, all planted by `addBuildSubqueriesForMaterializedCTEsIfNeeded` -- the same plan was optimized N times. Optimize is idempotent, so this was a CPU regression rather than a correctness bug, but it scales with the number of UNION branches and the optimization itself walks the CTE plan tree (and recursively triggers no-op `buildSetInplace` calls on already-built sets). Claim `is_plan_optimized` atomically before invoking optimize, mirroring the existing `is_materialization_planned` claim in `makePlansForCTEs`. Addresses review comment on PR #105041: #105041 (comment) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…al shape The dedup commit (3a0728b + 6c08747) reduces redundant plan->optimize(...) invocations across DelayedMaterializingCTEsStep instances that share a MaterializedCTE pointer. With the dedup in place, mergeExpressions no longer fires the second time during pass 1 of resolveMaterializingCTEs and the two adjacent ExpressionSteps in the local Union branch stay unmerged -- matching the pre-PR shape this test originally expected. Reverts the cosmetic update introduced in 5b2a827 (which adjusted the reference for the merged shape that pass-1's redundant optimize calls used to produce). Functional results (count() = 3 twice) are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per the project rule "When mentioning logical errors, say 'exception' instead of 'crash', because they don't crash the server in the release build." Pure comment edit, no behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. Start subquery execution only after the first call to MaterializingCTETransform::work 2. Use std::promise for reader and writer synchronisation
Enumerates every path that creates a MemorySource reading a materialized CTE's StorageMemory and shows each is downstream of a DelayedPortsProcessor gated on the corresponding MaterializingCTETransform. Pre-requisite for removing the build_promise/build_future runtime synchronisation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Set with memory_order_release in MaterializingCTETransform::generate after executor->finish() returns. Not yet read by any consumer - this is the first of three additive commits that introduce a fail-fast assertion against planner wiring bugs before removing the build_promise/build_future synchronisation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `DelayedPortsProcessor` inserted by `MaterializingCTEsStep::updatePipeline` already gates every reader's `MemorySource` on the corresponding `MaterializingCTETransform` finishing, so by the time `MemorySource::generate` runs the writer must already have set `is_built` to `true`. If the assertion ever fires, the planner failed to wire the gate - surface a precise `LOGICAL_ERROR` rather than blocking on `build_future.get` or reading from a half-populated `StorageMemory`. The `build_future.get` wait is kept for one more commit as defence in depth, to be removed once the full test suite has run cleanly against the new assertion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `DelayedPortsProcessor` gate inserted by `MaterializingCTEsStep` already ensures the writer has finished before any reader's `MemorySource` is scheduled, so the runtime wait is redundant. The `is_built` assertion added in the previous commit holds across the full materialized-CTE test bucket, confirming every reader path is gated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reader/writer ordering for materialized CTEs is already enforced by the `DelayedPortsProcessor` inserted by `MaterializingCTEsStep::updatePipeline` (via `addPipelineBefore`) at the scheduler level: back-pressure from its un-`setNeeded()` delayed inputs cascades up to every reader's `MemorySource`, so `generate` is not invoked before the writer's `MaterializingCTETransform` has finished. The runtime `build_future.get` inside `ReadFromMemoryStorageStep` (already removed earlier in the branch) duplicated this guarantee with a thread-blocking wait; the matching `build_promise.set_*` sites raised `Exception(LOGICAL_ERROR, ...)` on legitimate non-execution paths like `EXPLAIN PIPELINE` / `EXPLAIN ESTIMATE`, which `abortOnFailedAssertion` in the `Exception` ctor turned into a process abort under debug and sanitizer builds (PR #105041 AST fuzzer). Replaces the `std::promise<void>` / `std::shared_future<void>` pair in `MaterializedCTE` with a single `std::atomic_bool is_built`, set with `memory_order_release` from `MaterializingCTETransform::generate` after `executor->finish()` and read with `memory_order_acquire` by the fail-fast assertion in `ReadFromMemoryStorageStep` (added earlier in the branch). Reimplements `MaterializedCTE::isBuilt` to load the atomic so external callers in `Planner.cpp` and `RemoteQueryExecutor.cpp` need no migration. Removes `MaterializingCTETransform::onCancel`, the `work` try/catch wrapper, the destructor fallback, and the `build_promise.set_value` call - every site whose sole purpose was to fulfil the promise. Cancellation propagates through `PipelineExecutor`'s normal cancel path; writer exceptions propagate to the executor and surface as the query's terminal error. Preserves the lazy `init` split introduced in `d60f24195e3` (the constructor no longer eagerly opens the temp-table write pipeline; the first `work` call invokes `init`) and the `executor->cancel` cleanup in the destructor. Adds a regression test (`04260_explain_pipeline_materialized_cte.sql`) covering `EXPLAIN PIPELINE` / `EXPLAIN PIPELINE graph=1` / `EXPLAIN ESTIMATE` over a `SELECT` with `MATERIALIZED` CTEs, plus the `IN (subquery)` shape that exercises the `buildOrderedSetInplace` safety-net path. Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=105041&sha=latest&name_0=PR Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a one-line comment explaining why `executor->cancel()` is wrapped in a
log-and-swallow `try/catch` in the destructor. Pre-refactor this was implicit
("the destructor still has work to do below"); post-refactor (after the
promise/future removal) the destructor's body is solely the cancel, so the
swallowing pattern is the entire reason for the `try/catch`. Spelling it
out prevents a future reader from suspecting a missing fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes a regression introduced by removing the `build_future.get()` wait in `MemorySource::generate` (commit c9e0a9f): a materialized CTE referenced from a UNION inside an IN-subquery against a non-PK column tripped the `is_built` fail-fast at `ReadFromMemoryStorageStep.cpp:86` because `removeTopLevelDelayedMaterializingCTEsStep` stripped only the union-level safety-net `DelayedMaterializingCTEsStep`, leaving the per-branch ones in the sub-plan tree. The recursive `plan->optimize` inside `DelayedCreatingSetsStep::makePlansForSets` then let branch 1's per-branch step win `is_materialization_planned.exchange(true)`, parking the writer inside that branch and leaving branch 2's reader and the outer plan's `MaterializingCTEsStep` ungated. Renames the helper to `removeAllDelayedMaterializingCTEsStep` and rewrites it as a full tree walk that replaces every `DelayedMaterializingCTEsStep` node anywhere in the sub-plan with its single child. Nested `DelayedCreatingSetsStep` source plans (held in `subqueries`, not in the immediate tree) keep their safety-nets for their own `buildSetInplace` consumers. Adds tests/queries/0_stateless/04295_04278_materialized_cte_in_union_runtime_set.{sql,reference} with two pinned shapes: a minimal `WHERE b IN (SELECT FROM ct UNION ALL SELECT FROM ct)` against a `MergeTree ORDER BY a` table where `b` is non-PK (so the inplace path does not fire), plus the verbatim AST-fuzzer query that originally surfaced the failure. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=105041&sha=latest&name_0=PR PR: #105041 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses review feedback on PR #105959 (comment r3319962590): the `is_built.store(true, release)` introduced in 5b78bc7 dropped the runtime backstop for the `MaterializedCTE` single-writer invariant. The plan-time guard `is_materialization_planned.exchange(true)` in `DelayedMaterializingCTEsStep::makePlansForCTEs` is the primary enforcement, but a future planner regression that wires two `MaterializingCTETransform`s for the same `MaterializedCTE` would now silently double-write into the shared `StorageMemory` and return duplicated CTE rows instead of throwing - exactly the class of planner topology bug the prior commit on this branch (9335a9c, "Drop all DelayedMaterializingCTEStep's in the Set plan") had to fix once. Replaces the plain release-store with `exchange(true, std::memory_order_release)` + conditional throw, so the publish ordering is preserved (still pairs with the acquire-load in `ReadFromMemoryStorageStep.cpp:86`) and the double-write is detected at runtime. Cost is one atomic CAS instead of one atomic store on a path that runs once per CTE materialization. Review comment: #105959 (comment) PR: #105959 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A query with materialized CTEs whose runtime sets read another materialized CTE could throw `Reading from materialized CTE '...' before its materialization completed - DelayedPortsProcessor gate is missing in the query plan`. Root cause: when building a `CreatingSetsTransform` / `MaterializingCTETransform`, the source pipeline's totals/extremes were dropped via a childless `NullSink`. `ExecutingGraph::initializeExecution` seeds every childless node, so seeding that `NullSink` cascaded `setNeeded` up to the set-builder that reads the not-yet-built CTE, activating it before the writer ran and tripping the fail-fast check in `MemorySource::generate`. Replace the childless `NullSink` drop with a new `DroppingTransform` that consumes all of the source pipeline's output ports (data plus totals and extremes), forwards the data streams and discards totals and extremes. Because it keeps its data outputs connected, it is not a childless node, so `initializeExecution` does not seed it and the set-builder is only activated through the normal gated data path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fix LOGICAL_ERROR for reused materialized CTE filtered by IN over another materialized CTE
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Original pull-request #105041
Do not merge this PR manually
This pull-request is a first step of an automated backporting.
It contains changes similar to calling
git cherry-picklocally.If you intend to continue backporting the changes, then resolve all conflicts if any.
Otherwise, if you do not want to backport them, then just close this pull-request.
The check results does not matter at this step - you can safely ignore them.
Troubleshooting
If the conflicts were resolved in a wrong way
If this cherry-pick PR is completely screwed by a wrong conflicts resolution, and you want to recreate it:
pr-cherrypicklabel from the PRYou also need to check the Original pull-request for
pr-backports-createdlabel, and delete if it's presented thereThe PR source
The PR is created in the CI job