Fix Block structure mismatch in UnionStep/IntersectOrExceptStep when a branch constant-folds by groeneai · Pull Request #107719 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix Block structure mismatch in UnionStep/IntersectOrExceptStep when a branch constant-folds#107719

Open
groeneai wants to merge 1 commit into
ClickHouse:masterfrom
groeneai:groeneai/fix-union-const-fold-header-106956
Open

Fix Block structure mismatch in UnionStep/IntersectOrExceptStep when a branch constant-folds#107719
groeneai wants to merge 1 commit into
ClickHouse:masterfrom
groeneai:groeneai/fix-union-const-fold-header-106956

Conversation

@groeneai

Copy link
Copy Markdown
Contributor

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixed a Block structure mismatch in UnionStep stream logical error (server abort on debug/sanitizer builds, Code: 49 on release builds) that occurred when sibling branches of a UNION/INTERSECT/EXCEPT differed only in their WHERE predicate and one branch's predicate constant-folded to a Const column.

Description

Reported in #106956. Split out of #107041 at the reviewer's request (the sparse half stays in #107041, this PR covers the constant-folding half only).

Reproduces deterministically with default settings, no tables needed:

SELECT t2.a, t1.b = t2.b
FROM (SELECT 1 AS a, 2 AS b) AS t1
INNER JOIN (SELECT 1 AS a, 2 AS b) AS t2 ON t1.a = t2.a
WHERE t1.b = t2.b
UNION ALL
SELECT t2.a, t1.b = t2.b
FROM (SELECT 1 AS a, 2 AS b) AS t1
INNER JOIN (SELECT 1 AS a, 2 AS b) AS t2 ON t1.a = t2.a
WHERE NOT (t1.b = t2.b);

Set-operation branches are optimized independently. When one branch's predicate constant-folds (here NOT (t1.b = t2.b) over constant join inputs), filter push-down through the join leaves that branch's projection column Const while the sibling keeps a full column. The plan-time header check in UnionStep::updateOutputHeader / IntersectOrExceptStep::updateOutputHeader used strict assertBlocksHaveEqualStructure and rejected the divergent constness, even though the execution-time path (updatePipeline) already tolerates exactly this via makeConvertingActions.

The fix makes the plan-time check tolerant the same way: it requires matching column names and types but allows a Const to meet a full column, and computes the common header with getLeastSuperColumn (the primitive the planner already uses to build the union header) so a column stays Const only when every branch agrees and is widened to a full column otherwise. When no branch constness diverges the front header is returned unchanged, so plans for ordinary set operations are byte-identical to before (no new plan nodes, no EXPLAIN changes).

Added regression test 04327_union_branch_const_header_mismatch (covering UNION ALL, UNION DISTINCT, SELECT DISTINCT, an explicit CAST(..., 'Nullable(UInt8)') projection, INTERSECT, and EXCEPT).

@groeneai

Copy link
Copy Markdown
Contributor Author

@groeneai

Copy link
Copy Markdown
Contributor Author

cc @KochetovNicolai @davenger - const-fold half split out of #107041 per review. Makes the plan-time UnionStep/IntersectOrExceptStep header check tolerate a Const branch meeting a full one via getLeastSuperColumn, matching what updatePipeline already does at execution time.

@nikitamikhaylov nikitamikhaylov added the can be tested Allows running workflows for external contributors label Jun 17, 2026
@clickhouse-gh

clickhouse-gh Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [cfe8828]

Summary:

job_name test_name status info comment
Stateless tests (amd_llvm_coverage, old analyzer, s3 storage, DatabaseReplicated, WasmEdge, parallel) FAIL
01418_query_scope_constants_and_remote FAIL cidb

AI Review

Summary
  • This PR makes UnionStep and IntersectOrExceptStep tolerate optimizer-introduced top-level Const/full-column divergence across set-operation branches, materializes the common header when branch constness or constant values diverge, and skips liftUpUnion when a union no longer forwards branch headers unchanged. The current code addresses the earlier review threads and includes focused regression coverage for UNION, INTERSECT, and EXCEPT.
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Jun 17, 2026
@KochetovNicolai

Copy link
Copy Markdown
Member

The fix looks too complex.
Initially, UNION streams should have an identical structure. Mixed const and non-const columns should be allowed, common header should drop the const in this case. Different types are not allowed.

Filter-push-down can fold the constant, but this should not break the logic.

@groeneai groeneai force-pushed the groeneai/fix-union-const-fold-header-106956 branch from 2441ef7 to 4493fad Compare June 17, 2026 16:51
@groeneai

Copy link
Copy Markdown
Contributor Author

@KochetovNicolai Simplified, thanks. Pushed 4493fad.

The check now keeps assertCompatibleHeader (identical structure: same names and types, different types still rejected, plus the const-value equality check), and the only relaxation is that a Const may meet a full column — in which case the common header just drops the const for that column. Dropped the getLeastSuperColumn/supertype machinery; types are already guaranteed equal by the check above, so there was nothing to widen.

Filter-push-down folding a constant in one branch is handled by that single rule: the column becomes full in the common header. No-divergence plans return the front header unchanged.

Verified on a debug build: 04327 aborts before the change (SIGABRT, Code 49) and passes after, with output matching the reference; const-vs-const same-value still keeps the const.

Comment thread src/Processors/QueryPlan/UnionStep.cpp Outdated
@groeneai groeneai force-pushed the groeneai/fix-union-const-fold-header-106956 branch from 4493fad to ffd0383 Compare June 17, 2026 18:05
@groeneai

Copy link
Copy Markdown
Contributor Author

Good catch, confirmed. assertCompatibleHeader runs with allow_materialize=true, which getActualColumn uses to strip Const, Sparse, and Replicated, while the loop only dropped top-level Const, so a Sparse/full or Replicated/full divergence passed unnormalized.

Fixed by keeping the check strict except for top-level Const (direction a). Each branch is now compared to the front with only its top-level Const stripped via convertToFullColumnIfConst (which leaves Sparse/Replicated wrappers untouched), then validated with strict assertBlocksHaveEqualStructure. So only a Const meeting a full column is tolerated; different types and divergent Sparse/Replicated representations are rejected at plan time as before, and cannot leak into later pipeline construction.

convertToFullColumnIfConst is overridden only by ColumnConst (Block.cpp:49-62 / ColumnConst.h); ColumnSparse and ColumnReplicated inherit the IColumn no-op, so the strict compare still sees their distinct representation.

Verified: 04327 passes (20/20 with randomization); const/full UNION/INTERSECT/EXCEPT drop the const to full; all-const-agree keeps the front header unchanged; a top-level type difference is still rejected.

Pushed in ffd0383.

Comment thread src/Processors/QueryPlan/UnionStep.cpp Outdated
…a branch constant-folds

A UNION/INTERSECT/EXCEPT whose branches differ only in their WHERE predicate
threw `Block structure mismatch in UnionStep stream` (Code 49), aborting
debug/sanitizer servers, when one branch's predicate constant-folded.

Set-operation branches are optimized independently. Filter push-down through a
join in one branch can constant-fold a projection column, leaving that branch's
output column Const while the sibling branch stays full. The plan-time header
check in UnionStep/IntersectOrExceptStep::updateOutputHeader used strict
assertBlocksHaveEqualStructure and rejected the divergent constness, even though
updatePipeline already tolerates exactly this at execution time.

The check stays strict except for top-level Const. Each branch is compared to
the front branch with only its top-level Const stripped (via
convertToFullColumnIfConst, which leaves Sparse and Replicated wrappers
untouched), then validated with strict assertBlocksHaveEqualStructure. Different
types and divergent Sparse/Replicated representations remain rejected at plan
time.

The common header follows the getLeastSuperColumn contract: a column stays Const
only when every branch is Const with the same value, otherwise it is
materialized. Comparing only constness (not the value) would keep the front
branch's Const when two branches fold to different constants, after which
makeConvertingActions fails trying to convert one branch to the other branch's
Const value.

liftUpUnion assumed the union forwards each branch unchanged and rebuilt the
UnionStep forcing every branch input header to the lifted step's full output.
When checkHeaders materialized a divergent Const, the lifted union then declared
a full input header for a branch still exposing a Const, moving the mismatch
into the optimizer's own header validation. It now skips the rewrite when the
union output differs from any branch input header.

Closes ClickHouse#106956

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai groeneai force-pushed the groeneai/fix-union-const-fold-header-106956 branch from ffd0383 to cfe8828 Compare June 17, 2026 20:29
@groeneai

Copy link
Copy Markdown
Contributor Author

Fixed in cfe8828. Took route (b): the common-header build now keeps a column Const only when every branch is Const with the same getField() value, otherwise it materializes (convertToFullColumnIfConst). This is the getLeastSuperColumn save_constness rule without reintroducing the supertype machinery, since the strict name/type/representation check already runs first. Applied identically in UnionStep and IntersectOrExceptStep.

While validating the all-Const-different-value case through actual execution I found the materialized output then aborted one pass later, after optimization liftUpUnion (optimizeTree.cpp header validation). tryLiftUpUnion assumes the union forwards each branch unchanged and rebuilds the UnionStep forcing every branch input header to the lifted step's full output, so a branch still exposing a Const no longer matches. It now skips the rewrite when the union output differs from any branch input header (i.e. the union normalized something).

Repro that exercises the execution path (both branches keep rows, same column folds to 1 and 0):

SELECT a, c FROM (
    SELECT t2.a AS a, (t1.b = t2.b) AS c
    FROM (SELECT 1 AS a, 2 AS b) AS t1 INNER JOIN (SELECT 1 AS a, 2 AS b) AS t2 ON t1.a = t2.a WHERE t1.b = t2.b
    UNION ALL
    SELECT t2.a AS a, NOT (t1.b = t2.b) AS c
    FROM (SELECT 1 AS a, 2 AS b) AS t1 INNER JOIN (SELECT 1 AS a, 2 AS b) AS t2 ON t1.a = t2.a WHERE t1.b = t2.b
) ORDER BY a, c;

On master this aborts at the original UnionStep::checkHeaders; before this commit it aborted at liftUpUnion. Test 04327 now covers UNION ALL / UNION DISTINCT / INTERSECT / EXCEPT for the all-Const-different-value case and passes 10/10 under randomization. Verified Sparse/full and Replicated/full are still rejected (convertToFullColumnIfConst leaves those wrappers untouched, so the strict compare rejects them).

session: cron:clickhouse-worker-slot-3:20260617-191100

@groeneai

Copy link
Copy Markdown
Contributor Author

@KochetovNicolai CI surfaced a regression from the current commit (cfe88282) that I think needs your call before I push, since both fix locations touch contract-bearing code.

Repro (old analyzer only; pre-existing test 01418_query_scope_constants_and_remote):

SET enable_analyzer = 0;
SELECT c >= 0 FROM (SELECT randConstant() AS c FROM remote('127.0.0.{1,2}', numbers_mt(1)));
-- Code: 44. Cannot convert column `c` because it is non constant in source stream but must be constant in result.

Root cause: remote(2 shards) builds an internal UnionStep (ClusterProxy) whose branches are both Const c but with different per-shard values (randConstant bakes a value into each branch header). The same-value rule materializes c to full in the union output (correct for set-op semantics). But the old analyzer computes the surrounding result_header from the ExpressionAnalyzer sample block (Const c), independently of the plan, and InterpreterSelectQuery::buildQueryPlan then converts plan(full c) -> result_header(const c) and throws. The new analyzer is fine: the Planner derives the common header from the same getLeastSuperColumn over plan headers, so plan and result_header always agree.

The same-value rule should stay (dropping it brings back a silent wrong result for c=1 UNION ALL c=0, and the original #106956 crash). The mismatch is purely the old analyzer's separately-computed Const result_header not matching a column the plan legitimately materialized to full.

Two fix points, both with caveats:

  1. makeConvertingActions: when ignore_constant_values = true and the result is Const but the source is full, keep the source full instead of throwing. Smallest, but makeConvertingActions is shared by many callers.
  2. InterpreterSelectQuery::buildQueryPlan (old analyzer only): reconcile result_header constness with the actual plan header before converting. Localized, but result_header also feeds getSampleBlock(), so it changes what an enclosing subquery sees.

Do you have a preference, or a third option I'm missing?

@Algunenano

Copy link
Copy Markdown
Member

@groeneai The failure to 01418_query_scope_constants_and_remote is related.

@clickhouse-gh

clickhouse-gh Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.30% 85.20% -0.10%
Functions 92.30% 92.30% +0.00%
Branches 77.50% 77.40% -0.10%

Changed lines: Changed C/C++ lines covered by tests: 61/61 (100.00%) | Lost baseline coverage: none · Uncovered code

Full report · Diff report

@groeneai

Copy link
Copy Markdown
Contributor Author

@Algunenano Yes, that one is the known PR-caused regression from this commit (cfe88282), already root-caused, not infra.

Mechanism: remote('127.0.0.{1,2}', ...) builds an internal UnionStep whose two shard branches are both Const c but with different baked randConstant() values, so the new same-value rule materializes c to full in the union output (correct for set-op semantics). Under the old analyzer the surrounding result_header is computed separately from the ExpressionAnalyzer sample block as Const c, so InterpreterSelectQuery::buildQueryPlan then converts plan(full c) -> result_header(Const c) and throws Code 44. The new analyzer is fine: the Planner derives the common header from the same plan headers, so plan and result_header always agree.

The same-value materialize rule should stay (dropping it reintroduces a silent wrong result for c=1 UNION ALL c=0, plus the original #106956 crash). The clean reconciliation touches contract-bearing code, so I posed two options to KochetovNicolai and am holding the push until a maintainer picks: (1) makeConvertingActions, when ignore_constant_values=true and the result is Const but the source is full, keep the source full instead of throwing (smallest, but shared by many callers); (2) old-analyzer buildQueryPlan, reconcile result_header constness with the actual plan header before converting (localized, but result_header also feeds getSampleBlock()). Full analysis and repro: #107719 (comment #107719 (comment)).

Do you have a preference between the two? I lean to (2) as the more localized change, but it is your and KochetovNicolai's call.

@Algunenano

Copy link
Copy Markdown
Member

Do you have a preference between the two? I lean to (2) as the more localized change, but it is your and KochetovNicolai's call.

@KochetovNicolai Can you please have a look? I think breaking the old analyzer isn't ok, and adding more code there seems like an ok thing to avoid adding complexity to the code that should remain in the future. So I'd prefer option 2. WDTY?

@groeneai

Copy link
Copy Markdown
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants