Fix inconsistent AST formatting for kql_array_sort subscript followed by an operator (STID 1941-1bfa) by groeneai · Pull Request #107634 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix inconsistent AST formatting for kql_array_sort subscript followed by an operator (STID 1941-1bfa)#107634

Closed
groeneai wants to merge 2 commits into
ClickHouse:masterfrom
groeneai:groeneai/fix-kql-array-sort-subscript-operator-1941-1bfa
Closed

Fix inconsistent AST formatting for kql_array_sort subscript followed by an operator (STID 1941-1bfa)#107634
groeneai wants to merge 2 commits into
ClickHouse:masterfrom
groeneai:groeneai/fix-kql-array-sort-subscript-operator-1941-1bfa

Conversation

@groeneai

Copy link
Copy Markdown
Contributor

Related: #105396

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix a LOGICAL_ERROR ("Inconsistent AST formatting") when a [i] subscript on a kql_array_sort_asc/kql_array_sort_desc result is followed by another operator, for example kql_array_sort_desc([...])[i].j.

Description

CI report (BuzzHouse, amd_debug, STID 1941-1bfa): https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=107210&sha=61447331cad5ae84cca31abe742548cfa907b3db&name_0=PR&name_1=BuzzHouse%20%28amd_debug%29

kql_array_sort_asc/kql_array_sort_desc return Tuple(Array, ...), so a [i] subscript on their result is tuple-element access and the parser rewrites the arrayElement operator to tupleElement. That rewrite lived only in Layer::mergeElement. When the subscript is immediately followed by another equal-or-lower-priority operator (.j, [k], +, =, ...), the arrayElement node is instead built by the operator-merge loop in ParserExpressionImpl::tryParseOperator, which had no rewrite, so a raw arrayElement(kql_array_sort_*, i) survived. The formatter emits that as [i], which re-parses (now with no trailing operator) through the rewrite as tupleElement, so the format-parse-format round-trip diverged and the consistency check in executeQueryImpl aborted the server with a LOGICAL_ERROR in debug and sanitizer builds.

Minimal reproducer (aborts the server on master, returns a clean type error with this fix):

SELECT kql_array_sort_desc([3, 1, 2])[0].4;

The rewrite is factored into Layer::foldKqlArraySortSubscript and applied at both arrayElement construction sites. It inspects the actual first argument of the just-built node instead of guessing from the operand stack, and keeps the operator-syntax guard so call-syntax arrayElement(kql_array_sort_*, ...) is preserved verbatim.

… by an operator (STID 1941-1bfa)

kql_array_sort_asc/kql_array_sort_desc return Tuple(Array, ...), so a [i]
subscript on their result is tuple-element access and the parser rewrites the
arrayElement operator to tupleElement. That rewrite lived only in
Layer::mergeElement. When the subscript is immediately followed by another
equal-or-lower-priority operator (.j, [k], +, =, ...), the arrayElement node is
instead built by the operator-merge loop in ParserExpressionImpl::tryParseOperator,
which had no rewrite, so a raw arrayElement(kql_array_sort_*, i) survived. The
formatter emits that as [i], which re-parses (now with no trailing operator)
through the rewrite as tupleElement, so the format-parse-format round-trip
diverged and the consistency check in executeQueryImpl aborted the server with a
LOGICAL_ERROR in debug and sanitizer builds.

The rewrite is factored into Layer::foldKqlArraySortSubscript and applied at both
arrayElement construction sites. It inspects the actual first argument of the
just-built node instead of guessing from the operand stack, and keeps the
operator-syntax guard so call-syntax arrayElement(kql_array_sort_*, ...) is
preserved verbatim.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

Copy link
Copy Markdown
Contributor Author

@groeneai

Copy link
Copy Markdown
Contributor Author

cc @evillique @george-larionov -- could you review this? It fixes a debug-build server abort (Inconsistent AST formatting, STID 1941-1bfa) for kql_array_sort_*([...])[i] followed by another operator. The arrayElement->tupleElement subscript rewrite only ran in Layer::mergeElement, so the operator-merge loop in tryParseOperator left a non-round-trippable arrayElement(kql, i); the fix factors the rewrite into one helper applied at both construction sites.

@PedroTadim PedroTadim added the can be tested Allows running workflows for external contributors label Jun 16, 2026
@clickhouse-gh

clickhouse-gh Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [bb54b60]

Summary:

job_name test_name status info comment
Integration tests (amd_asan_ubsan, db disk, old analyzer, 4/6) ERROR
Timeout FAIL cidb

AI Review

Summary

This PR fixes the parser asymmetry for kql_array_sort_asc/kql_array_sort_desc subscript operators by applying the arrayElement to tupleElement fold at both AST construction sites. The focused regression test covers direct trailing operators, chained subscripts, call-syntax preservation, and executable semantics, and I did not find any unresolved correctness issue that warrants an inline review comment.

Final Verdict

Status: ✅ Approve. No required changes.

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Jun 16, 2026
@groeneai

Copy link
Copy Markdown
Contributor Author

CI summary for head 5a342b6. All three red checks are pre-existing failures unrelated to this parser-only change (KQL array_sort subscript fold). The new regression test 04338_kql_array_sort_subscript_then_operator_1941_1bfa passed (237 OK, 0 FAIL).

Finish Workflow is green; Mergeable Check is red only due to the non-PR-caused failures above.

@alexey-milovidov

Copy link
Copy Markdown
Member

@groeneai, fix flaky test 01710_projections and fix this stress test failure: Block structure mismatch in UnionStep.

@groeneai

Copy link
Copy Markdown
Contributor Author

@alexey-milovidov done.

01710_projections: the branch predated #107698 (merged 16:56Z), which pins index_granularity = 8192 so --random-merge-tree-settings can no longer inject a small value that splits the projection part across concurrent s3 read streams (the float accumulation-order diff). Branch updated to current master (bb54b60); CI is re-running on the new head.

Block structure mismatch in UnionStep (STID 0993-27f0): chronic trunk crash, not from this change. Present on master and several unrelated PRs over the last 30 days; it surfaces here only because the branch merges master. This PR touches only src/Parsers/ExpressionListParsers.cpp (no plan/optimizer code). Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=107634&sha=5a342b69fe28bf9c7cedb4b0a5ace9fa8160c469&name_0=PR&name_1=Stress%20test%20%28arm_asan_ubsan%2C%20s3%29 . Fix in flight: #107719.

@clickhouse-gh

clickhouse-gh Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.20% 85.30% +0.10%
Functions 92.30% 92.30% +0.00%
Branches 77.40% 77.50% +0.10%

Changed lines: Changed C/C++ lines covered by tests: 22/24 (91.67%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 3 line(s) · Uncovered code

Full report · Diff report

@groeneai

Copy link
Copy Markdown
Contributor Author

CI fully finished on bb54b60. The only red is Integration tests (amd_asan_ubsan, db disk, old analyzer, 4/6), and it is an infra batch-timeout, not a test failure: the job log ends with 1219 passed, 22 skipped, 13 warnings in 7236.66s (2:00:36) followed by session-timeout occurred during test execution. Zero pytest failures; the batch (heavy iceberg-spark + s3 suites under ASan) just hit the 2h session cap. This PR only touches src/Parsers/ExpressionListParsers.cpp, which cannot affect integration-test runtime.

Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=107634&sha=bb54b60df7766ca7f7801181737f140be90a362d&name_0=PR&name_1=Integration%20tests%20%28amd_asan_ubsan%2C%20db%20disk%2C%20old%20analyzer%2C%204%2F6%29

The two failures from the directive are green on this head: 01710_projections (picked up merged #107698) and the UnionStep stress test (chronic trunk STID 0993-27f0, fix in flight at #107719). Tracking the batch timeout separately as CI infra.

@groeneai groeneai mentioned this pull request Jun 20, 2026
21 tasks
@alexey-milovidov

Copy link
Copy Markdown
Member

@groeneai, we don't like KQL, because it is experimental and was implemented with low quality.

Send another PR that will remove kql_array_sort_asc, kql_array_sort_desc with all associated complications, including all traces of it inside code, tests, and docs.

@groeneai

Copy link
Copy Markdown
Contributor Author

Done: #108101 removes kql_array_sort_asc/kql_array_sort_desc and the KQL array_sort_asc/array_sort_desc entirely (code, tests, docs, fuzzer dicts), including the subscript-rewrite special case this PR was fixing and the array_sort-only KQL alias handling.

That makes this PR moot: with the functions gone, the [i] subscript on their result no longer reaches the tupleElement rewrite, so the inconsistent-AST LOGICAL_ERROR cannot occur. #108101 closes #106850 directly. I will close this one once #108101 is approved; let me know if you would rather keep it open.

@Algunenano

Copy link
Copy Markdown
Member

@Algunenano Algunenano closed this Jun 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants