Add STRING_AGG as alias of groupConcat by alexey-milovidov · Pull Request #105125 · ClickHouse/ClickHouse · GitHub

Add STRING_AGG as alias of groupConcat #105125

Merged
alexey-milovidov merged 3 commits into master from alias-string-agg on May 19, 2026

alexey-milovidov (Member) commented on May 16, 2026

PostgreSQL/SQL-standard STRING_AGG(expr, sep) matches ClickHouse's existing groupConcat(expr, sep) exactly when the separator is passed as a regular argument. Adding a case-insensitive alias spares PostgreSQL-dialect workloads (such as the SQLStorm corpus) a rewrite.

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Added STRING_AGG as a case-insensitive alias of groupConcat for PostgreSQL/SQL-standard compatibility.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Version info

  • Merged into: 26.5.1.850

PostgreSQL/SQL-standard `STRING_AGG(expr, sep)` matches ClickHouse's
existing `groupConcat(expr, sep)` exactly when the separator is passed
as a regular argument. Expose `STRING_AGG` as a case-insensitive alias
so PostgreSQL-dialect queries (e.g., the SQLStorm corpus) do not need
rewriting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
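
The case-insensitive alias described in the commit message above can be pictured with a small standalone sketch. This is not ClickHouse's factory code: `toyToLower`, `toyCaseInsensitiveAliases`, and `toyResolveFunctionName` are hypothetical names invented for illustration, and the only fact taken from the PR is that `STRING_AGG`, in any letter case, resolves to `groupConcat`.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

// Lower-case an ASCII identifier so that lookups ignore letter case.
std::string toyToLower(std::string name)
{
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return name;
}

// Hypothetical alias table: lower-cased alias -> canonical function name.
// Only the STRING_AGG -> groupConcat pair comes from the PR description.
const std::unordered_map<std::string, std::string> & toyCaseInsensitiveAliases()
{
    static const std::unordered_map<std::string, std::string> aliases = {
        {"string_agg", "groupConcat"},
    };
    return aliases;
}

// Resolve a function name as written in a query to its canonical name.
// Names without an alias entry are returned unchanged.
std::string toyResolveFunctionName(const std::string & written)
{
    const auto & aliases = toyCaseInsensitiveAliases();
    auto it = aliases.find(toyToLower(written));
    return it == aliases.end() ? written : it->second;
}
```

In this sketch, `toyResolveFunctionName("String_Agg")` and `toyResolveFunctionName("STRING_AGG")` both yield `groupConcat`, while a name with no alias entry passes through untouched.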
clickhouse-gh Bot (Contributor) commented on May 16, 2026

clickhouse-gh Bot added the pr-improvement label (Pull request with some product improvements) on May 16, 2026
Comment thread src/AggregateFunctions/AggregateFunctionGroupConcat.cpp Outdated
alexey-milovidov added a commit that referenced this pull request May 16, 2026
Recent compatibility PRs added case-insensitive aliases and parser sugar
that make several of the SQLStorm rewrites unnecessary:

  - `STDDEV`            -> `stddevPop`            (#105120)
  - `array_to_string`   -> `arrayStringConcat`    (#105121)
  - `REGEXP_SUBSTR`     -> `regexpExtract`        (#105122)
  - `CARDINALITY`       -> `length`               (#105123)
  - `unnest()` function -> `arrayJoin()`          (#105124)
  - `STRING_AGG`        -> `groupConcat`          (#105125)
  - `date_part(unit,e)` -> `EXTRACT(unit FROM e)` (#105127)
  - `expr OP ANY/ALL(array_literal)`              (#105129)

`ARRAY_AGG`, `TRANSLATE`, and `EXTRACT(EPOCH|DOW|... FROM ...)` were
already supported by ClickHouse before these PRs.

Removed the corresponding rewrite calls and helper functions
(`rewrite_string_agg`, `rewrite_array_agg`, `rewrite_date_part`,
`rewrite_stddev`, `rewrite_extract_epoch`, the EXTRACT(DOW) inline
rewrite, `rewrite_any_comparison`, and the trailing
`unnest(...) -> arrayJoin(...)` substitution). Also dropped the
unreferenced no-op helpers (`rewrite_extract_unit`, `rewrite_fetch_offset`,
`rewrite_interval`, `rewrite_cast_timestamp`, `rewrite_current_timestamp`,
`rewrite_bool_literals`, `rewrite_ilike`, `rewrite_no_supertype`).

The PostgreSQL `LATERAL` / `CROSS JOIN UNNEST(...)` table-source forms,
`arrayJoin(...)` in JOIN position, PG-specific casts, `AT TIME ZONE`,
`STRING_AGGDistinct` (a mangled-name artifact), and the still-unsupported
function rewrites (`string_to_array`, `regexp_split_to_array`, `RANDOM`,
`TO_TIMESTAMP`, `ARRAY_LENGTH`, `SPLIT_PART`, `age`) are still rewritten.

Net change: -329 lines from rewrite_queries.py and -75 lines from the
tests (the `TestRewriteAnyComparison` class is removed since the rewrite
it covered no longer exists).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The analyzer rewrites the 2-argument call `groupConcat(expr, sep)` into
the parameterized form `groupConcat(sep)(expr)` only for names listed in
`GroupConcatImpl<false>::getNameAndAliases` (see
`QueryTreeBuilder::setSecondArgumentAsParameter`). The previous alias
was registered through `registerAlias` only, so `STRING_AGG(expr, sep)`
hit the unary check in `createAggregateFunctionGroupConcat` and failed
with `NUMBER_OF_ARGUMENTS_DOESNT_MATCH`.

Add `string_agg` to `getNameAndAliases` so the rewrite triggers, and
register every alias from that list in
`registerAggregateFunctionGroupConcat`.

Failure: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=105125&sha=28eefc28ebcdb622ae3d79b79d96e3563e495faa&name_0=PR&name_1=Fast%20test
PR: #105125

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
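
The failure described in the commit message above comes down to a name check that gates the argument-to-parameter rewrite. A minimal sketch under stated assumptions: `ToyFunctionCall`, `toyRewritableNames`, `toySetSecondArgumentAsParameter`, and `toyArgumentsAfterRewrite` are hypothetical stand-ins, not the real `QueryTreeBuilder` code, and the lower-case comparison is an assumption made to keep the toy short. The flag `include_string_agg` models the list before and after the fix.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a function node in the query tree (hypothetical type).
struct ToyFunctionCall
{
    std::string name;
    std::vector<std::string> parameters; // the (sep) part of f(sep)(expr)
    std::vector<std::string> arguments;  // the (expr) part
};

// Names eligible for the rewrite. Stands in for the list returned by
// GroupConcatImpl<false>::getNameAndAliases; only these two entries are
// taken from the PR text, and storing them lower-cased is an assumption.
std::vector<std::string> toyRewritableNames(bool include_string_agg)
{
    std::vector<std::string> names = {"groupconcat"};
    if (include_string_agg)
        names.push_back("string_agg");
    return names;
}

// Turn f(expr, sep) into f(sep)(expr), but only for listed names.
// Returns true if the call was rewritten.
bool toySetSecondArgumentAsParameter(ToyFunctionCall & call, const std::vector<std::string> & names)
{
    std::string lowered = call.name;
    std::transform(lowered.begin(), lowered.end(), lowered.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    if (std::find(names.begin(), names.end(), lowered) == names.end())
        return false; // unlisted name: both arguments stay positional
    if (call.arguments.size() != 2 || !call.parameters.empty())
        return false; // only the plain 2-argument form is rewritten

    call.parameters.push_back(call.arguments.back());
    call.arguments.pop_back();
    return true;
}

// Number of positional arguments left after analysing name(expr, sep).
std::size_t toyArgumentsAfterRewrite(const std::string & name, bool include_string_agg)
{
    ToyFunctionCall call{name, {}, {"expr", "sep"}};
    toySetSecondArgumentAsParameter(call, toyRewritableNames(include_string_agg));
    return call.arguments.size();
}
```

With `include_string_agg` set to `false`, a `STRING_AGG(expr, sep)` call keeps two positional arguments, which is the situation that ran into the unary check. With it set to `true`, the separator moves into the parameter list and one argument remains.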
m-selmi self-assigned this on May 18, 2026
The 2-argument form `STRING_AGG(expr, sep)` (and equivalently
`groupConcat(expr, sep)`) is rewritten into the parameterized form
`STRING_AGG(sep)(expr)` only by the new analyzer (see
`QueryTreeBuilder::setSecondArgumentAsParameter`). The old analyzer
sees two positional arguments and fails with
`NUMBER_OF_ARGUMENTS_DOESNT_MATCH`.

The test fails under "Stateless tests (amd_llvm_coverage, old analyzer,
s3 storage, DatabaseReplicated, WasmEdge, parallel)":
https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=105125&sha=d5baa1fd637c78da6f35bf52e576d0c77223ea48&name_0=PR&name_1=Stateless%20tests%20%28amd_llvm_coverage%2C%20old%20analyzer%2C%20s3%20storage%2C%20DatabaseReplicated%2C%20WasmEdge%2C%20parallel%29

Pin the 2-argument queries to `enable_analyzer=1`, matching the
existing convention in `03156_group_concat.sql`. The 1-argument form
remains analyzer-agnostic.

PR: #105125

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
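
The analyzer dependence described in the commit message above can be modelled in a few lines. This is a sketch, not ClickHouse code: `ToyAnalyzer`, `toyCreateGroupConcat`, and `toyCallSucceeds` are hypothetical names, and `std::invalid_argument` stands in for the real `NUMBER_OF_ARGUMENTS_DOESNT_MATCH` exception. It shows why the 2-argument form needs the new analyzer while the 1-argument form does not.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class ToyAnalyzer { Old, New };

// Toy factory function: models only the unary check on positional arguments.
void toyCreateGroupConcat(const std::vector<std::string> & parameters,
                          const std::vector<std::string> & arguments)
{
    (void)parameters; // the separator, if any, would be read from here
    if (arguments.size() != 1)
        throw std::invalid_argument("NUMBER_OF_ARGUMENTS_DOESNT_MATCH");
}

// Simulate calling the aggregate with `argument_count` positional arguments.
// Only the new analyzer moves the second argument into the parameter list.
bool toyCallSucceeds(ToyAnalyzer analyzer, std::size_t argument_count)
{
    std::vector<std::string> arguments(argument_count, "arg");
    std::vector<std::string> parameters;

    if (analyzer == ToyAnalyzer::New && arguments.size() == 2)
    {
        parameters.push_back(arguments.back());
        arguments.pop_back();
    }

    try
    {
        toyCreateGroupConcat(parameters, arguments);
        return true;
    }
    catch (const std::invalid_argument &)
    {
        return false;
    }
}
```

In this model `toyCallSucceeds(ToyAnalyzer::Old, 2)` is the only failing combination, which matches the decision to pin just the 2-argument test queries to `enable_analyzer=1`.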
clickhouse-gh Bot (Contributor) commented on May 18, 2026

LLVM Coverage Report

| Metric    | Baseline | Current | Δ      |
|-----------|----------|---------|--------|
| Lines     | 84.20%   | 84.10%  | -0.10% |
| Functions | 91.40%   | 91.40%  | +0.00% |
| Branches  | 76.60%   | 76.50%  | -0.10% |

Changed lines: 100.00% (7/7)

alexey-milovidov merged commit 299754b into master on May 19, 2026
165 of 166 checks passed
alexey-milovidov deleted the alias-string-agg branch on May 19, 2026 16:18
robot-clickhouse added the pr-synced-to-cloud label (The PR is synced to the cloud repo) on May 19, 2026

Labels

  • pr-improvement: Pull request with some product improvements
  • pr-synced-to-cloud: The PR is synced to the cloud repo

3 participants