Fix inconsistent AST formatting for aliased IN inside a function call (STID 1941-1bfa) #105091
Merged
alexey-milovidov merged 10 commits on May 31, 2026
… (STID 1941-1bfa)

A query like `f(1, 2 IN ((3 IN (4, 5)) AS x))` triggered a `Logical error: 'Inconsistent AST formatting between 'ExpressionList' and 'ExpressionList'` (STID 1941-1bfa) abort in debug / sanitiser builds because the format-parse-format round-trip in `DB::executeQueryImpl` diverged by the inner `(3 IN (4, 5))` parens.

`ASTFunction::formatImplWithoutAlias` wraps an `IN` expression in `(...)` via `need_parens_around_in` whenever the `IN` sits inside a multi-argument function call (`frame.current_function != nullptr`), so the first format emits `f(1, (2 IN ((3 IN (4, 5)) AS x)))`. The re-parse sets `parenthesized=true` on the outer `IN`, and on the second format `IAST::format` emits the outer parens via the `parenthesized` path in `decideParensEmission`, which clears `frame.current_function`. The inner `IN` then sees `in_function_args == false` and stops emitting its own `(...)`, so the inner `(3 IN (4, 5))` parens disappear and the second format differs from the first.

Fix: when the outer `IN`'s `formatImplWithoutAlias` emits the wrapping `(...)` via `need_parens_around_in`, also clear `current_function` for descendants. Both wrapping paths (this one and the `parenthesized` path in `IAST::format`) now produce the same output, so the round-trip is stable.

CI: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=103126&sha=9ebf9fa83a0839baba782baf9689bc849215dc5a&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%2C%20targeted%2C%20old_compatibility%29
PR: ClickHouse#103126
Workflow [PR], commit [53a7b79]. Summary: ✅
AI Review Summary: This PR fixes formatter round-trip instability for nested aliased … Final Verdict
This was fixed by #105146. Let's update the branch.
Per ClickHouse#105091 (comment), replace each `SELECT formatQuerySingleLine(q)` with the explicit invariant the test is trying to express:

`formatQuerySingleLine(formatQuerySingleLine(q)) = formatQuerySingleLine(q)`

which is exactly `format_2 == format_1`, the assertion `DB::executeQueryImpl` runs in debug / sanitiser builds. The previous form pinned the first-pass output text and would still pass if a future regression kept the first pass unchanged but broke the second pass; the equality form proves the round-trip-stability invariant directly. Reference becomes a column of `1`s.
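To illustrate why the equality form is the stronger test, here is a minimal C++ sketch. It does not use ClickHouse code: `isRoundTripStable`, `buggyFormat` and `stableFormat` are hypothetical names, and the two formatters are toy string functions that only stand in for a real query formatter. Both produce the same first-pass text, so a pinned-output check cannot tell them apart, while the idempotence check can.

```cpp
#include <cassert>
#include <functional>
#include <string>

using Formatter = std::function<std::string(const std::string &)>;

// Round-trip stability: formatting already formatted text must not change it.
// This mirrors the shape of the invariant format(format(q)) == format(q).
bool isRoundTripStable(const Formatter & format, const std::string & query)
{
    const std::string first = format(query);
    return format(first) == first;
}

// Hypothetical buggy formatter: adds parentheses on the first pass,
// but strips them again when it sees its own output.
std::string buggyFormat(const std::string & q)
{
    if (!q.empty() && q.front() == '(' && q.back() == ')')
        return q.substr(1, q.size() - 2);
    return "(" + q + ")";
}

// Hypothetical stable formatter: adds parentheses once and then leaves them.
std::string stableFormat(const std::string & q)
{
    if (!q.empty() && q.front() == '(' && q.back() == ')')
        return q;
    return "(" + q + ")";
}

void runInvariantDemo()
{
    const std::string q = "3 IN (4, 5)";

    // A pinned first-pass expectation passes for both formatters.
    assert(buggyFormat(q) == "(3 IN (4, 5))");
    assert(stableFormat(q) == "(3 IN (4, 5))");

    // Only the equality form exposes the second-pass regression.
    assert(!isRoundTripStable(buggyFormat, q));
    assert(isRoundTripStable(stableFormat, q));
}
```

Calling `runInvariantDemo()` from a `main` function runs the four checks. The design point is that the pinned expectation only constrains the first pass, whereas the equality constrains the relationship between the two passes.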
LLVM Coverage Report. Changed lines: 100.00% (8/8)
alexey-milovidov approved these changes on May 31, 2026

Motivation
The format-parse-format consistency check in `DB::executeQueryImpl` aborts debug / sanitiser builds (`Logical error: 'Inconsistent AST formatting between 'ExpressionList' and 'ExpressionList'`, STID 1941-1bfa) on queries like `g(1, 2 IN ((3 IN (4, 5)) AS x))`.

The first format emits `g(1, (2 IN ((3 IN (4, 5)) AS x)))`; the re-parse sets `parenthesized=true` on the outer `IN`, and the second format drops the parens around the inner `(3 IN (4, 5))`, breaking round-trip stability.
Root cause
Two code paths emit the wrapping `(...)` around the outer `IN`, but they leave `frame.current_function` in different states for descendants:

| Path | `current_function` after | Right-hand side emitted |
|---|---|---|
| `need_parens_around_in` in `ASTFunction::formatImplWithoutAlias` | `g` | `((3 IN (4, 5)) AS x)` |
| `decideParensEmission` in `IAST::format` (because re-parse set `parenthesized=true`) | `nullptr` | `(3 IN (4, 5) AS x)` |

The asymmetric reset of `current_function` is what makes the inner `IN` emit its own wrapping parens in one pass but not the other.
Change
Make the two paths symmetric: when `formatImplWithoutAlias` emits the wrap via `need_parens_around_in`, also clear `current_function` for descendants, which is exactly what `decideParensEmission` already does on the other path. Both passes then produce `SELECT g(1, (2 IN (3 IN (4, 5) AS x)))` and the round-trip is stable.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=103126&sha=9ebf9fa83a0839baba782baf9689bc849215dc5a&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%2C%20targeted%2C%20old_compatibility%29
Found while CI-validating PR #103126.
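The asymmetry and the fix can be reproduced in a toy model. The sketch below is not the ClickHouse implementation: `Node`, `Frame`, `formatNode`, `formatBody`, `makeQuery` and the `apply_fix` flag are hypothetical, the alias wrapping rule is a simplification, and the re-parse is simulated by building the same tree with `parenthesized=true` on the outer `IN` only, as described above. Only the names `current_function`, `parenthesized` and `need_parens_around_in` are taken from this PR.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

// Toy AST node: raw text, an IN operator (args = {lhs, rhs}), or a function call.
struct Node
{
    enum class Kind { Raw, In, Call };
    Kind kind = Kind::Raw;
    std::string text;            // Raw: the text itself; Call: the function name
    std::vector<NodePtr> args;
    std::string alias;           // empty means "no alias"
    bool parenthesized = false;  // what the simulated re-parse would set
};

struct Frame { const Node * current_function = nullptr; };

std::string formatNode(const Node & node, Frame frame, bool apply_fix);

// Stand-in for the per-node formatting without alias handling.
std::string formatBody(const Node & node, Frame frame, bool apply_fix)
{
    switch (node.kind)
    {
        case Node::Kind::Raw:
            return node.text;
        case Node::Kind::Call:
        {
            Frame inner = frame;
            if (node.args.size() > 1)
                inner.current_function = &node;  // multi-argument call
            std::string out = node.text + "(";
            for (size_t i = 0; i < node.args.size(); ++i)
                out += (i ? ", " : "") + formatNode(*node.args[i], inner, apply_fix);
            return out + ")";
        }
        case Node::Kind::In:
        {
            const bool need_parens_around_in = frame.current_function != nullptr;
            Frame inner = frame;
            if (need_parens_around_in && apply_fix)
                inner.current_function = nullptr;  // the symmetric reset
            std::string out = formatNode(*node.args[0], inner, apply_fix)
                + " IN " + formatNode(*node.args[1], inner, apply_fix);
            return need_parens_around_in ? "(" + out + ")" : out;
        }
    }
    return {};
}

// Stand-in for the generic entry point: the `parenthesized` path emits the
// parens itself and clears current_function for everything below.
std::string formatNode(const Node & node, Frame frame, bool apply_fix)
{
    std::string out;
    if (node.parenthesized)
    {
        frame.current_function = nullptr;
        out = "(" + formatBody(node, frame, apply_fix) + ")";
    }
    else
        out = formatBody(node, frame, apply_fix);
    if (!node.alias.empty())
        out = "(" + out + " AS " + node.alias + ")";  // simplified alias rule
    return out;
}

NodePtr raw(std::string text)
{
    auto n = std::make_shared<Node>();
    n->text = std::move(text);
    return n;
}

NodePtr in(NodePtr lhs, NodePtr rhs)
{
    auto n = std::make_shared<Node>();
    n->kind = Node::Kind::In;
    n->args = {std::move(lhs), std::move(rhs)};
    return n;
}

// Builds g(1, 2 IN ((3 IN (4, 5)) AS x)); the flag simulates the re-parse.
NodePtr makeQuery(bool outer_parenthesized)
{
    NodePtr inner = in(raw("3"), raw("(4, 5)"));
    inner->alias = "x";
    NodePtr outer = in(raw("2"), inner);
    outer->parenthesized = outer_parenthesized;
    auto g = std::make_shared<Node>();
    g->kind = Node::Kind::Call;
    g->text = "g";
    g->args = {raw("1"), outer};
    return g;
}
```

In this model, `formatNode(*makeQuery(false), Frame{}, false)` and `formatNode(*makeQuery(true), Frame{}, false)` differ by the inner parens, while passing `apply_fix = true` makes both calls return the same string.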
Note: this is one subvariant of the chronic STID 1941-1bfa. Other subvariants (`CODEC`/`STATISTICS`: open PR #104991; lambda-after-comma: merged PR #104626; etc.) are independent.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Not for changelog.
Documentation entry for user-facing changes
Version info
26.6.1.283