Preserve existing parameters in groupConcat two-argument form#104882
Conversation
scanhex12
left a comment
There was a problem hiding this comment.
looks good, but there is one more test to fix
The flaky check randomized `max_threads`/`aggregation_in_order`, which split the 5-row scan across threads and let rows arrive at the aggregator in a different order than the table's `ORDER BY x`. The limit then kept the wrong rows. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104882&sha=70e845a033a99309f793439af79cfeddc43b5312&name_0=PR Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous fix only set `max_threads = 1` per `SELECT`, so the random flaky-check setting `max_insert_threads = 2` could still split the 5-row `INSERT` into two parts. When the part holding `[1,2,3,4]` was written before the one holding `[0]`, the read order became `1,2,3,4,0` and the limited outputs picked the wrong rows. Apply session-level `SET max_insert_threads = 1, max_threads = 1, min_insert_block_size_rows = 0, min_insert_block_size_bytes = 0` and `OPTIMIZE TABLE ... FINAL` to guarantee a single ordered part, matching the pattern in `03156_group_concat.sql`. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104882&sha=2edb3a55709250d1e767650200b6542cb15ddc30 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The flaky check also randomizes `enable_parallel_replicas = 1` and `parallel_replicas_for_non_replicated_merge_tree = 1`, which splits the 5-row scan across replicas and reorders rows when results are merged — yielding the `1,2,3,4,0` pattern even with `max_threads = 1` and a single part. Pin both off via session-level `SET`. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104882&sha=a80143bd52713b67cd92f957f8372710adb3cd1a Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The flaky check applies its random settings as URL parameters on every HTTP request, which take per-query precedence over session-level `SET` and over per-query `SETTINGS` from prior queries. So even with the previous defensive `SET max_insert_threads = 1` etc., the `INSERT` still ran with `max_insert_threads >= 2`, created multiple parts, and the resulting read order was non-deterministic. Drop the order-dependent reference and insert identical rows instead. The test still checks both that the limit (`2`/`3`/`100`) and the overridden delimiter (`/`, `|`) are honoured. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104882&sha=ae4473b6606021924841c341bc1f5a08fa8bcfa9 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
📊 Cloud Performance Report 🔴 AI verdict: This PR makes two narrow changes: fixing groupConcat to preserve its limit parameter when a second delimiter argument is supplied (analyzer/query-tree path), and carrying a parent column's DEFAULT into projection metadata. Neither touches the join or aggregation execution that the flagged TPC-H queries run. Q4 (+185%) and Q7 (+779%) are large enough to clear the automatic trust threshold so their verdicts stand, but their baseline runs were extremely noisy and the query paths are unrelated to the diff -- a re-run is recommended before treating them as real. Q8, Q16, and ClickBench Q33 are off-path and consistent with run-to-run variance, so they were downgraded to not-sure. clickbenchFlagged queries (1 of 43)
q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on. tpch_adapted_1_official🔴 2 regressed · Flagged queries (4 of 22)
q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on. Debug info
|
|
Dear @scanhex12, you haven't been active on this PR for 30 days. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself. |
…ting-parameters-groupConcat
LLVM Coverage Report
Changed lines: Changed C/C++ lines covered: 13/13 (100.00%) · Uncovered code |
|
Found via ClickGap automated review. Please close or comment if this is incorrect or needs adjustment. TL;DRPreserve existing parameters in groupConcat two-argument form ClickGap verified: reproduced it with the script below on ReproductionBisect
Affects
Suggested next stepThe fix has landed on master; the affected release branches above still need the backport. Component: ClickGap · Finding: Triage metadata
|

Reference: #72540 (comment).
Before, a query that mixes both spellings — e.g.
groupConcat(',', 2)(x, '/')— silently dropped the2and returned every row instead of the requested twoChangelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Preserves the original parameters and lets the second-argument delimiter
override position 0 (the two-argument spelling is the more specific one
and already documented to "override current parameter" on the test
above this case)
Version info
26.7.1.483