Fix LOGICAL_ERROR for reused materialized CTE filtered by IN over another materialized CTE by novikd · Pull Request #105041 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix LOGICAL_ERROR for reused materialized CTE filtered by IN over another materialized CTE#105041

Merged
alexey-milovidov merged 34 commits into
masterfrom
fix-mat-cte
Jun 6, 2026
Merged

Fix LOGICAL_ERROR for reused materialized CTE filtered by IN over another materialized CTE#105041
alexey-milovidov merged 34 commits into
masterfrom
fix-mat-cte

Conversation

@novikd

@novikd novikd commented May 15, 2026

Copy link
Copy Markdown
Member

Motivation

Two related shapes of LOGICAL_ERROR: Reading from materialized CTE 'X' before it has been materialized triggered by enable_materialized_cte.

Issue #101940:

WITH
    ct AS MATERIALIZED (SELECT c FROM t LIMIT 10),
    rs AS MATERIALIZED (SELECT * FROM t WHERE c IN (SELECT c FROM ct))
SELECT count()
FROM rs AS a ANY LEFT JOIN rs AS b USING c
SETTINGS enable_materialized_cte = 1;

Issue #102320:

WITH a AS MATERIALIZED (SELECT number AS id FROM numbers(100))
SELECT id FROM a
WHERE id IN (
    SELECT id FROM test
    WHERE id IN (SELECT id FROM a GROUP BY id)
    GROUP BY id
)
GROUP BY id;

EXPLAIN on either query also threw — the bug fires during plan
optimization, not just at execution time.

Root cause

The Planner installs two layers of DelayedMaterializingCTEsStep:

  1. Outer steps at the top of the main plan, one per CTE dependency level,
    added by addBuildSubqueriesForMaterializedCTEsIfNeeded so the materialized
    CTEs run before the main pipeline at runtime.
  2. Safety-net steps inside every IN-subquery's plan, planted by
    forceMaterializeCTE so the synchronous inplace path
    (FutureSetFromSubquery::buildSetInplace / buildOrderedSetInplace,
    reached via KeyCondition::tryPrepareSetIndexForIn or
    VirtualColumnUtils::buildSetsForDAG) can materialize a referenced CTE
    during plan optimization.

makePlansForCTEs did the atomic claim, the optimize call, and the
std::move of materialized_cte->plan in one step, and
resolveMaterializingCTEs ran before addStepsToBuildSets. So in both bug
shapes the outer step claimed the CTE and moved its plan out before the
inner safety-net inside the inplace pipeline could win the race —
the safety-net pass-throughs read an empty StorageMemory.

Fix

Three coordinated changes:

  1. Split resolveMaterializingCTEs into two traversals. Pass 1
    (DelayedMaterializingCTEsStep::optimizePlans) just optimizes each CTE's
    pre-built plan; pass 2 (the existing addPlansForMaterializingCTEs walk)
    claims and attaches. makePlansForCTEs no longer optimizes, only claims
    and std::moves. With this alone, the failing query from SEGV / LOGICAL_ERROR when materialized CTE is used in IN subquery of another materialized CTE referenced in a join #101940 is
    fixed because the safety-net inside rs.plan->optimize claims ct
    during pass 1, before main's pass 2 attempts the outer claim.

  2. Run addStepsToBuildSets before resolveMaterializingCTEs in
    QueryPlan::optimize.
    When the inplace path fires from
    addStepsToBuildSets on the main plan (LOGICAL_ERROR with materialized CTE #102320's shape, where the IN
    target is a MergeTree column and the PK pruning fires during the
    recursive outer_IN.plan->optimize), the safety-net inside the nested
    IN-subquery is now reached before the outer claim runs, again winning
    the race.

  3. Strip safety-net DelayedMaterializingCTEsStep chains from set plans
    in DelayedCreatingSetsStep::makePlansForSets.
    When a set plan is
    being attached for runtime construction (vs synchronous inplace
    execution), its safety-nets are redundant — at runtime the outer
    MaterializingCTEsStep materializes the CTE before the
    CreatingSetsStep's pipeline runs. Stripping the whole stacked chain
    prevents the recursive plan->optimize from claiming the CTE in the
    wrong scope when the safety-net chain has multiple levels (one per
    transitive CTE dependency). The strip handles CreatingSetStep wrapping
    the chain (added by FutureSetFromSubquery::build).

Test plan

Changelog category (leave one):

  • Critical Bug Fix (crash, data loss, RBAC)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixed two LOGICAL_ERROR: Reading from materialized CTE 'X' before it has been materialized shapes thrown by queries with enable_materialized_cte: (1) a reused materialized CTE filtered by IN (subquery) over another materialized CTE, and (2) a materialized CTE referenced both directly and inside a WHERE ... IN (...) filter that hits a MergeTree primary key through a nested IN-subquery. EXPLAIN on the same queries was affected too because the bugs fired during plan optimization. Closes #101940, closes #102320.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Version info

  • Merged into: 26.6.1.454

novikd and others added 4 commits May 15, 2026 16:08
The optimization step is moved one level up, into a dedicated first
traversal of resolveMaterializingCTEs introduced in the next commit.
This lets the recursive buildSetInplace / buildOrderedSetInplace path
win the inner safety-net DelayedMaterializingCTEsStep's claim before
the outer step attempts its own claim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first traversal optimizes every materialized CTE's pre-built plan
and reports whether any DelayedMaterializingCTEsStep exists in the tree.
The second traversal -- skipped entirely when the first saw none --
replaces DelayedMaterializingCTEsStep nodes with MaterializingCTEsStep,
attaching plans for any CTEs not already claimed inplace.

Why this fixes the LOGICAL_ERROR
--------------------------------
Today makePlansForCTEs did three things atomically: it exchange'd
is_materialization_planned, it called materialized_cte->plan->optimize,
and it std::moved the plan into the synthesized MaterializingCTEsStep.
The outer DelayedMaterializingCTEsStep[ct] in the main plan won the
exchange before descending into [rs]; by the time rs.plan->optimize
fired buildSetInplace via KeyCondition or VirtualColumnUtils, the inner
safety-net DelayedMaterializingCTEsStep inside the IN-subquery's plan
found ct.is_materialization_planned already true and degraded to a
pass-through. The synchronously executed pipeline read ct.storage with
is_built == false and threw.

With two traversals, the optimize calls happen first across the entire
tree, so the recursive buildSetInplace path gets to run
makePlansForCTEs on the inner safety-net step before the outer step's
claim is attempted. The outer step then sees the CTE already claimed
and degrades to a harmless pass-through.

For queries with no materialized CTEs at all (the common case), the
first traversal walks the tree without doing any work, reports no
DelayedMaterializingCTEsStep, and returns -- keeping the total work at
one walk just like before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…er CTE

Covers the LOGICAL_ERROR path that fired when the outer
DelayedMaterializingCTEsStep claimed the inner CTE before buildSetInplace
had a chance to materialize it synchronously for primary-key /
virtual-column analysis. Three query shapes: ANY LEFT JOIN on PK,
cross-join on PK, ANY LEFT JOIN with the IN filter on a non-PK column
(exercising the VirtualColumnUtils path). EXPLAIN is also checked
because the bug fired during plan optimization -- the test only
asserts that EXPLAIN produced at least one row so the reference file
is not pinned to a specific plan shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… Expressions

The two-traversal refactor in resolveMaterializingCTEs shifts the
timing of `materialized_cte->plan->optimize` such that
mergeExpressions now successfully collapses two adjacent
ExpressionSteps in the materialized CTE's local Union branch
(Project names + Change column names | Projection + Change column
names to column identifiers).

Functional results are unchanged (count() = 3 twice). The merged
ExpressionStep performs identical work in a single transform.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@clickhouse-gh

clickhouse-gh Bot commented May 15, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label May 15, 2026
After the two-traversal split, makePlansForCTEs no longer calls
plan->optimize -- that happens in DelayedMaterializingCTEsStep::optimizePlans
during the first traversal. Drop the now-unused QueryPlanOptimizationSettings
argument from the public method and from the local
addPlansForMaterializingCTEs helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@novikd novikd marked this pull request as ready for review May 15, 2026 15:10
@EmeraldShift

Copy link
Copy Markdown
Contributor

@alexey-milovidov

This comment was marked as resolved.

alexey-milovidov and others added 11 commits May 17, 2026 21:08
A small utility to splice out a top-level `DelayedMaterializingCTEsStep`
from a plan, replacing it with its single original child. Used by the
upcoming change to `DelayedCreatingSetsStep::makePlansForSets` to strip
the safety-net materialization step from set plans being attached for
runtime execution.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a set's plan is being attached for runtime construction (via
DelayedCreatingSetsStep → CreatingSetsStep), the safety-net
DelayedMaterializingCTEsStep planted by forceMaterializeCTE is no longer
needed: at runtime the outer MaterializingCTEsStep in the main plan
materializes the CTE before the CreatingSetsStep's pipeline runs.

The strip is non-recursive on purpose. Nested set subquery plans
(reached via DelayedCreatingSetsStep inside this plan) may still be
consumed by buildSetInplace / buildOrderedSetInplace during the
subsequent plan->optimize() and need their safety-nets to claim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous order claimed materialized-CTE plans at the outer level
during resolveMaterializingCTEs before addStepsToBuildSets had a chance
to trigger synchronous buildSetInplace / buildOrderedSetInplace via the
recursive plan->optimize. When that synchronous inplace path then ran
on a deeply nested IN-subquery filtered by a MergeTree primary key, the
safety-net DelayedMaterializingCTEsStep inside the inplace pipeline saw
the CTE's plan already moved out and degraded to a pass-through,
producing LOGICAL_ERROR while reading the still-empty StorageMemory.

Swapping the order lets the inplace path claim the CTE via the safety-net
first; resolveMaterializingCTEs then materializes only the CTEs that the
inplace path did not already handle, with main's outer step degrading to
a pass-through in those cases.

Closes #102320.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sted IN

Issue #102320: the CTE `a` is referenced once at top level and once inside
a WHERE id IN (... WHERE id IN (...)) chain that hits a MergeTree primary
key. The PK pruning on the outer IN-subquery triggers buildOrderedSetInplace
for the inner IN-subquery, which used to read an empty StorageMemory because
main's resolveMaterializingCTEs had already claimed the CTE before
addStepsToBuildSets fired the inplace path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… CreatingSetStep wrapper

Two fixes for cases the initial single-step strip missed:

1. `addBuildSubqueriesForMaterializedCTEsIfNeeded` stacks one
   `DelayedMaterializingCTEsStep` per CTE dependency level on top of a
   plan. An IN-subquery that transitively references a chain of CTEs
   gets multiple safety-net steps stacked, so stripping just the
   topmost one leaves the next one in place. The remaining safety-net
   claims its CTE in the wrong scope when the recursive `plan->optimize`
   below runs `resolveMaterializingCTEs`, attaching that CTE's plan
   inside the set subquery's `CreatingSetsStep` instead of at the outer
   `MaterializingCTEsStep` in the main plan. The fix loops while the
   top of the chain is `DelayedMaterializingCTEsStep`.

2. `DelayedCreatingSetsStep::makePlansForSets` calls
   `FutureSetFromSubquery::build` first, which wraps the source plan
   with `CreatingSetStep`. That makes the plan root a `CreatingSetStep`
   with the safety-net chain as its child, not at the root itself. The
   helper now peeks past one non-Delayed wrapper to reach the chain.

Without (1), the third query in `03821_materialized_cte_with_sets`
(`WITH b AS MATERIALIZED (SELECT uid FROM a, a), a AS MATERIALIZED ...
SELECT count() FROM (SELECT * FROM b WHERE uid in b) ... LEFT SEMI JOIN
a ...`) regressed because the main IN-subquery's safety-net stack was
`Delayed[a] → Delayed[b] → body`: stripping only the top left `Delayed[b]`
in place, and the recursive `plan->optimize` then claimed `b` inside the
IN-subquery's `CreatingSetsStep`, leaving the join's right side reading
`a.storage` before the outer step had a chance to materialize it.

Without (2), the helper was a no-op for any plan reached via
`makePlansForSets` (since `build` always wraps with `CreatingSetStep`),
which is the only call site for it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lets DelayedMaterializingCTEsStep::optimizePlans dedup the
plan->optimize(...) call across multiple DelayedMaterializingCTEsStep
instances that share the same MaterializedCTE pointer (e.g. UNION
branches plus the UNION-level step). Logically parallel to the existing
is_materialization_planned flag, but for the optimize stage rather than
the claim/move stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this commit, `DelayedMaterializingCTEsStep::optimizePlans` (pass 1
of `resolveMaterializingCTEs`) called `cte->plan->optimize(...)` once per
Delayed step that referenced the CTE. For shapes where multiple Delayed
steps share one `MaterializedCTE` pointer -- e.g. UNION/INTERSECT/EXCEPT
branches plus the UNION-level step, all planted by
`addBuildSubqueriesForMaterializedCTEsIfNeeded` -- the same plan was
optimized N times. Optimize is idempotent, so this was a CPU regression
rather than a correctness bug, but it scales with the number of UNION
branches and the optimization itself walks the CTE plan tree (and
recursively triggers no-op `buildSetInplace` calls on already-built
sets).

Claim `is_plan_optimized` atomically before invoking optimize, mirroring
the existing `is_materialization_planned` claim in `makePlansForCTEs`.

Addresses review comment on PR #105041:
#105041 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…al shape

The dedup commit (3a0728b + 6c08747) reduces redundant
plan->optimize(...) invocations across DelayedMaterializingCTEsStep
instances that share a MaterializedCTE pointer. With the dedup in
place, mergeExpressions no longer fires the second time during pass 1
of resolveMaterializingCTEs and the two adjacent ExpressionSteps in
the local Union branch stay unmerged -- matching the pre-PR shape this
test originally expected.

Reverts the cosmetic update introduced in 5b2a827 (which adjusted
the reference for the merged shape that pass-1's redundant optimize
calls used to produce). Functional results (count() = 3 twice) are
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per the project rule "When mentioning logical errors, say 'exception'
instead of 'crash', because they don't crash the server in the release
build." Pure comment edit, no behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@clickhouse-gh clickhouse-gh Bot added pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! and removed pr-bugfix Pull request with bugfix, not backported by default labels May 18, 2026
@novikd

novikd commented May 18, 2026

Copy link
Copy Markdown
Member Author

The fix seems to be working; the only thing is that Bugfix validation (functional tests) is somehow broken (it seems it does not handle LOGICAL_ERROR without this fix properly).

@novikd

novikd commented May 19, 2026

Copy link
Copy Markdown
Member Author

Bugfix validation check was broken in #104081.

Details
  What actually happened on master (no-fix build)

  From bugfix_validation_functional_tests/job.log:

  - 14:19:09 — clickhouse-test … --jobs 9 -- 04227_… 04228_… starts
  - 14:20:15 — Hung check failed: server is not responding
  - 14:20:24 — Unable to locate any ClickHouse server process. It must have crashed or 
  exited prematurely!
  - 14:20:24 — Stopping tests, terminating all processes...
  - workers raise Terminated: 15 and exit before writing any [ FAIL ] line
  - 14:20:29 — Failed to reproduce the bug

  The LOGICAL_ERROR aborted the server (debug build), as evidenced by the artifact
  core.QueryCompPipeEx.1592-275296.zst.enc (core dump from the QueryCompletedPipelineEx…
  thread, 15-char comm truncation).

  Why the validation misreads it

  ci/jobs/scripts/functional_tests_results.py:13-15 defines:

  HUNG_SIGN         = "Found hung queries in processlist"
  SERVER_DIED_SIGN  = "Server died, terminating all processes"
  SERVER_DIED_SIGN2 = "Server does not respond to health check"

  None of these match what clickhouse-test actually emits today on the global-hung-check
  / server-crash path (tests/clickhouse-test:1117, 4898, 4903):

  ┌────────────────────────┬─────────────────────────────────────────────────────────┐
  │    Parser looks for    │             clickhouse-test actually emits              │
  ├────────────────────────┼─────────────────────────────────────────────────────────┤
  │ "Found hung queries in │ "Hung check failed: server is not responding"           │
  │  processlist"          │                                                         │
  ├────────────────────────┼─────────────────────────────────────────────────────────┤
  │ "Server died,          │                                                         │
  │ terminating all        │ "Stopping tests, terminating all processes..."          │
  │ processes"             │                                                         │
  ├────────────────────────┼─────────────────────────────────────────────────────────┤
  │ "Server does not       │ only in the per-test cleanup path                       │
  │ respond to health      │ (clickhouse-test:3444), which never runs here because   │
  │ check"                 │ workers were SIGTERM'd first                            │
  └────────────────────────┴─────────────────────────────────────────────────────────┘

/// `subqueries`, not as children of any node in this tree) are
/// untouched — they keep their safety-nets for their own
/// `buildSetInplace` / `buildOrderedSetInplace` consumers.
removeAllDelayedMaterializingCTEsStep(*plan);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unconditional strip assumes the set plan is being attached under a SELECT plan that still has an outer DelayedMaterializingCTEsStep for every referenced CTE. That is not true for the other DelayedCreatingSetsStep callers: MutationsInterpreter::buildSubqueryPlansForSetsAndAdd builds IN-subquery plans with forceMaterializeCTE, but the mutation plan never adds an outer materialized-CTE wrapper.

For a shape like ALTER TABLE ... UPDATE ... WHERE x IN (WITH c AS MATERIALIZED (...) SELECT ... FROM c), this removes the only DelayedMaterializingCTEsStep; plan->optimize then attaches a runtime set plan whose ReadFromMemoryStorageStep has no writer/gate and will hit the new is_built fail-fast (or read an empty StorageMemory without it). Please only strip these safety nets when the caller has an enclosing materialization step, or teach the mutation/non-SELECT callers to add that outer CTE materialization before DelayedCreatingSetsStep.


bool has_totals = pipe.getTotalsPort() != nullptr;
bool has_extremes = pipe.getExtremesPort() != nullptr;
auto dropping = std::make_shared<DroppingTransform>(header, pipe.numOutputPorts(), has_totals, has_extremes);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DroppingTransform is constructed with the main data header for every input, including totals/extremes. That loses the old dropTotals behavior, which created a NullSink from each aux port's own header. Pipe::addTransform explicitly permits totals to have a different header from the data stream because TotalsHavingTransform can output finalized totals while data is not finalized.

In that valid shape, connect(*totals_port, *totals_in) will now throw while trying to discard totals, so a set/CTE subquery with such totals cannot be materialized. Please give DroppingTransform separate headers for data, totals, and extremes (or build the aux inputs from pipe.getTotalsPort()->getSharedHeader() / pipe.getExtremesPort()->getSharedHeader()) while keeping only the data outputs on the main header.

@clickhouse-gh

clickhouse-gh Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.40% 84.50% +0.10%
Functions 92.30% 92.30% +0.00%
Branches 77.00% 77.10% +0.10%

Changed lines: Changed C/C++ lines covered by tests: 176/201 (87.56%) | Lost baseline coverage: none · Uncovered code

Full report · Diff report

@alexey-milovidov alexey-milovidov added this pull request to the merge queue Jun 6, 2026
Merged via the queue into master with commit 7c6a92c Jun 6, 2026
166 checks passed
@alexey-milovidov alexey-milovidov deleted the fix-mat-cte branch June 6, 2026 16:09
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 6, 2026
robot-ch-test-poll4 added a commit that referenced this pull request Jun 6, 2026
Cherry pick #105041 to 26.4: Fix LOGICAL_ERROR for reused materialized CTE filtered by IN over another materialized CTE
robot-clickhouse added a commit that referenced this pull request Jun 6, 2026
…TE filtered by IN over another materialized CTE
robot-ch-test-poll4 added a commit that referenced this pull request Jun 6, 2026
Cherry pick #105041 to 26.5: Fix LOGICAL_ERROR for reused materialized CTE filtered by IN over another materialized CTE
robot-clickhouse added a commit that referenced this pull request Jun 6, 2026
…TE filtered by IN over another materialized CTE
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Jun 6, 2026
@alexey-milovidov

Copy link
Copy Markdown
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

8 participants