Fix inconsistent AST formatting for aliased IN inside a function call (STID 1941-1bfa) by tuanpach · Pull Request #105091 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix inconsistent AST formatting for aliased IN inside a function call (STID 1941-1bfa)#105091

Merged
alexey-milovidov merged 10 commits into
ClickHouse:masterfrom
tuanpach:fix-ast-in-alias-nested-1941-1bfa
May 31, 2026
Merged

Fix inconsistent AST formatting for aliased IN inside a function call (STID 1941-1bfa)#105091
alexey-milovidov merged 10 commits into
ClickHouse:masterfrom
tuanpach:fix-ast-in-alias-nested-1941-1bfa

Conversation

@tuanpach

@tuanpach tuanpach commented May 15, 2026

Copy link
Copy Markdown
Member

Motivation

The format-parse-format consistency check in DB::executeQueryImpl aborts
debug / sanitiser builds (Logical error: 'Inconsistent AST formatting between 'ExpressionList' and 'ExpressionList', STID 1941-1bfa) on queries
like:

SELECT g(1, 2 IN ((3 IN (4, 5)) AS x))

The first format emits g(1, (2 IN ((3 IN (4, 5)) AS x))); the re-parse
sets parenthesized=true on the outer IN, and the second format drops
the inner (3 IN (4, 5)), breaking round-trip stability.

Root cause

Two code paths emit the wrapping (...) around the outer IN, but they
leave frame.current_function in different states for descendants:

Pass Outer wrap emitted by current_function after Inner IN wraps itself?
1st format need_parens_around_in in ASTFunction::formatImplWithoutAlias unchanged (= g) yes → ((3 IN (4, 5)) AS x)
2nd format decideParensEmission in IAST::format (because re-parse set parenthesized=true) reset to nullptr no → (3 IN (4, 5) AS x)

The asymmetric reset of current_function is what makes the inner IN
emit its own wrapping parens in one pass but not the other.

Change

Make the two paths symmetric: when formatImplWithoutAlias emits the
wrap via need_parens_around_in, also clear current_function for
descendants — exactly what decideParensEmission already does on the
other path. Both passes then produce
SELECT g(1, (2 IN (3 IN (4, 5) AS x))) and the round-trip is stable.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=103126&sha=9ebf9fa83a0839baba782baf9689bc849215dc5a&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%2C%20targeted%2C%20old_compatibility%29
Found while CI-validating PR #103126.

Note: this is one subvariant of the chronic STID 1941-1bfa. Other
subvariants (CODEC / STATISTICS — open PR #104991, lambda-after-comma
— merged PR #104626, etc.) are independent.

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Not for changelog.

Documentation entry for user-facing changes

  • Documentation is unchanged (formatter-internal round-trip stability; no user-visible API or syntax change).

Version info

  • Merged into: 26.6.1.283

… (STID 1941-1bfa)

A query like `f(1, 2 IN ((3 IN (4, 5)) AS x))` triggered a
`Logical error: 'Inconsistent AST formatting between 'ExpressionList' and
'ExpressionList'` (STID 1941-1bfa) abort in debug / sanitiser builds
because the format-parse-format round-trip in `DB::executeQueryImpl`
diverged by the inner `(3 IN (4, 5))` parens.

`ASTFunction::formatImplWithoutAlias` wraps an `IN` expression in `(...)`
via `need_parens_around_in` whenever the IN sits inside a multi-argument
function call (`frame.current_function != nullptr`), so the first format
emits `f(1, (2 IN ((3 IN (4, 5)) AS x)))`. The re-parse sets
`parenthesized=true` on the outer `IN`, and on the second format
`IAST::format` emits the outer parens via the `parenthesized` path in
`decideParensEmission`, which clears `frame.current_function`. The inner
`IN` then sees `in_function_args == false` and stops emitting its own
`(...)`, so the inner `(3 IN (4, 5))` parens disappear and the second
format differs from the first.

Fix: when the outer `IN`'s `formatImplWithoutAlias` emits the wrapping
`(...)` via `need_parens_around_in`, also clear `current_function` for
descendants. Both wrapping paths (this one and the `parenthesized` path
in `IAST::format`) now produce the same output, so the round-trip is
stable.

CI: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=103126&sha=9ebf9fa83a0839baba782baf9689bc849215dc5a&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%2C%20targeted%2C%20old_compatibility%29
PR: ClickHouse#103126
@clickhouse-gh

clickhouse-gh Bot commented May 15, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [53a7b79]

Summary:


AI Review

Summary

This PR fixes formatter round-trip instability for nested aliased IN inside multi-argument function arguments by making current_function handling symmetric across both parenthesis-emission paths, and adds targeted stateless tests that assert formatter idempotence directly. I reviewed the current code and prior discussion threads; the previously raised test-contract concern is addressed, and I found no remaining correctness, safety, or evidence gaps in the touched surface.

Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-ci label May 15, 2026
@alexey-milovidov

Copy link
Copy Markdown
Member

This was fixed by #105146. Let's update the branch.

Comment thread tests/queries/0_stateless/04239_consistent_format_in_alias_nested_1941_1bfa.sql Outdated
@yariks5s yariks5s self-assigned this May 19, 2026
tuanpach added 3 commits May 21, 2026 05:48
Per ClickHouse#105091 (comment),
replace each `SELECT formatQuerySingleLine(q)` with the explicit invariant
the test is trying to express:

    formatQuerySingleLine(formatQuerySingleLine(q)) = formatQuerySingleLine(q)

which is exactly `format_2 == format_1` — the assertion `DB::executeQueryImpl`
runs in debug / sanitiser builds. The previous form pinned the first-pass
output text and would still pass if a future regression kept the first pass
unchanged but broke the second pass; the equality form proves the
round-trip-stability invariant directly. Reference becomes a column of `1`s.
@clickhouse-gh

clickhouse-gh Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.10% +0.00%
Functions 91.40% 91.40% +0.00%
Branches 76.50% 76.60% +0.10%

Changed lines: 100.00% (8/8) · Uncovered code

Full report · Diff report

@alexey-milovidov alexey-milovidov self-assigned this May 31, 2026
@alexey-milovidov alexey-milovidov added this pull request to the merge queue May 31, 2026
Merged via the queue into ClickHouse:master with commit 0dcd55b May 31, 2026
166 checks passed
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 31, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants