Fix logical error in correlated subqueries with group_by_use_nulls #108714
Closed
coderashed wants to merge 2 commits into
A correlated scalar subquery referencing an outer `GROUP BY` key that becomes `Nullable` via `group_by_use_nulls` (`GROUPING SETS` / `ROLLUP` / `CUBE` / `WITH TOTALS`) aborted with:

Logical error: Unexpected return type from toString. Expected String. Got Nullable(String)

The `group_by_use_nulls` nullability walk in `resolveExpressionNode` stops at the subquery's own query scope, so a correlated column kept its non-`Nullable` type while decorrelation fed in the real `Nullable` column, mismatching a baked function result type (`toString` -> `String` vs `Nullable(String)`).

Continue the walk past a subquery's query boundary for correlated columns and convert the column to `Nullable` in place, preserving the node identity shared with the subquery's correlated-columns set (which the planner matches by identity).

Discovered by the AST fuzzer in the `amd_msan` stress test: https://github.com/ClickHouse/ClickHouse/actions/runs/28286443407/job/83839128485?pr=108685

Related: ClickHouse#108685
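The scope-walk change described in this commit message can be pictured with a small toy model. This is a sketch under stated assumptions, not ClickHouse code: `Scope`, `Column`, `isNullableGroupByKey`, `makeNullableInPlace` and `runExample` are hypothetical names invented for illustration, and types are modelled as plain strings. It shows only two ideas from the message: the walk stopping at the subquery's query scope versus continuing past it for correlated columns, and converting the column in place so that a set holding the same node pointer sees the new type.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_set>
#include <vector>

// Toy model only: none of these types are ClickHouse's real classes.
struct Column
{
    std::string name;
    std::string type; // e.g. "UInt64" or "Nullable(UInt64)"
};
using ColumnPtr = std::shared_ptr<Column>;

struct Scope
{
    const Scope * parent = nullptr;
    bool is_query = false; // true for a query or subquery scope
    // Keys this scope turns Nullable because of group_by_use_nulls.
    std::unordered_set<std::string> nullable_group_by_keys;
};

// Walk outward from `scope` looking for a scope that owns `name` as a
// nullable GROUP BY key. `cross_query_boundary` models the fix: for a
// correlated column the walk may continue past its own query scope.
bool isNullableGroupByKey(const Scope * scope, const std::string & name, bool cross_query_boundary)
{
    for (; scope != nullptr; scope = scope->parent)
    {
        if (scope->nullable_group_by_keys.count(name) != 0)
            return true;
        if (scope->is_query && !cross_query_boundary)
            return false; // old behaviour: stop at the first query scope
    }
    return false;
}

// Mutate the existing node instead of replacing it, so every holder of
// the same pointer observes the Nullable type.
void makeNullableInPlace(Column & column)
{
    if (column.type.rfind("Nullable(", 0) != 0)
        column.type = "Nullable(" + column.type + ")";
}

void runExample()
{
    Scope outer;
    outer.is_query = true;
    outer.nullable_group_by_keys.insert("number");

    Scope subquery;
    subquery.parent = &outer;
    subquery.is_query = true;

    // The same node is referenced from the expression and from the
    // subquery's correlated-columns set.
    ColumnPtr correlated = std::make_shared<Column>(Column{"number", "UInt64"});
    std::vector<ColumnPtr> correlated_columns{correlated};

    // Without crossing the boundary the outer key is never reached.
    assert(!isNullableGroupByKey(&subquery, "number", false));

    // Crossing the boundary finds it, and the in-place conversion keeps identity.
    if (isNullableGroupByKey(&subquery, "number", true))
        makeNullableInPlace(*correlated);

    assert(correlated_columns[0].get() == correlated.get());
    assert(correlated_columns[0]->type == "Nullable(UInt64)");
}
```

Calling `runExample()` exercises both paths. The in-place mutation is the point of the second half: building a fresh `Column` with the `Nullable` type would leave `correlated_columns` pointing at the stale node, which is the identity mismatch the commit message says the planner would trip over.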
Add the query shape the AST fuzzer aborted on in the `amd_msan` stress test of ClickHouse#108685 (multi-column projection with a correlated scalar subquery over a `GROUPING SETS` key under `WITH TOTALS`), scaled down and deterministic, so the regression test ties directly to the reported failure.
e8a374c to
fdfdbbd
groeneai added a commit to groeneai/ClickHouse that referenced this pull request Jun 30, 2026
Two test cases contributed by @coderashed (who closed his duplicate PR ClickHouse#108714 in favour of this one):

1. The exact AST-fuzzer seed query from ClickHouse#108714: a correlated scalar subquery toString(number) inside concat() over a GROUPING SETS key. Aborts on master with 'Bad cast ColumnNullable to ColumnString', returns s0..s4 with the fix.

2. A type-pinning EXPLAIN QUERY TREE check. The existing cases assert query results, which still pass if a future change silently stops wrapping the correlated key as Nullable (values unchanged, declared type wrong). These pin the analyzer-assigned types via SELECT ... FROM (EXPLAIN QUERY TREE ...) WHERE explain ILIKE '%...%', so dropping the Nullable wrapping fails with a clean diff.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Closes: #107445
Closes: #106377
Related: #107951
Related: #108685
Related: #104350
Fixes a logical error in correlated scalar subqueries when an outer `GROUP BY` key is made `Nullable` by `group_by_use_nulls` (through `GROUPING SETS` / `ROLLUP` / `CUBE` / `WITH TOTALS`).

A correlated column referencing such an outer key was left non-`Nullable` by the analyzer: the `group_by_use_nulls` nullability walk in `resolveExpressionNode` stops at the subquery's own query scope and never reaches the outer scope that owns the key. At decorrelation time the plan feeds in the real `Nullable` column, so a baked function result type (e.g. `toString` -> `String`) no longer matches the actual `Nullable(String)`, aborting with: `Logical error: Unexpected return type from toString. Expected String. Got Nullable(String)`

The fix lets the scope walk continue past a subquery's query boundary for correlated columns and converts the correlated column to `Nullable` in place, preserving the node identity shared with the subquery's correlated-columns set (which the planner matches by identity).

How it was found

Discovered by the AST fuzzer in the `amd_msan` stress test of #108685: https://github.com/ClickHouse/ClickHouse/actions/runs/28286443407/job/83839128485?pr=108685

The exact query the fuzzer aborted on (seeded from `02122_parallel_formatting_CSV`):

Minimal reproducer:
Testing
`04413_correlated_subquery_group_by_use_nulls` covers `GROUPING SETS` / `ROLLUP` / `CUBE` / `WITH TOTALS`, asserts the correlated result is `Nullable`, checks that a non-correlated subquery is unaffected, and includes the exact (scaled-down) query shape from the CI failure above.

The `*correlated*`, `*group_by_use_nulls*` and `*grouping_sets*` stateless tests pass locally with the fix.

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix a logical error (`Unexpected return type ...`) when a correlated scalar subquery references an outer `GROUP BY` key that becomes `Nullable` due to `group_by_use_nulls` (`GROUPING SETS` / `ROLLUP` / `CUBE` / `WITH TOTALS`).