Stabilize 02180_group_by_lowcardinality (non-deterministic LIMIT without ORDER BY) #105194
Merged
Algunenano merged 1 commit on May 18, 2026
The test asserted a deterministic row-by-row output from:

    SELECT val, avg(toUInt32(val)) FROM t_group_by_lowcardinality GROUP BY val LIMIT 10 SETTINGS max_threads = 1, max_rows_to_group_by = 100, group_by_overflow_mode = 'any';

With `group_by_overflow_mode = 'any'`, only the first `max_rows_to_group_by` distinct keys make it into the hash table, and which keys those are depends on the order in which rows arrive at the aggregator. `LIMIT 10` without `ORDER BY` then selects ten of those keys in hash-iteration order.

The `max_threads = 1` clause does not help under `ParallelReplicas` (where the table is sharded across replicas and partial aggregations are merged) or under S3 storage (where reads can be split into independent ranges), so the resulting row set varies between runs.

Originally tracked in ClickHouse#36069 (2022). The `-- Tags: no-random-settings` workaround masked the issue under settings randomization, but the underlying non-determinism returned once `ParallelReplicas` was enabled by default in some CI configurations. Concretely, the test now fails ~10–20 times per day on master under `Stateless tests (amd_llvm_coverage, ParallelReplicas, s3 storage, parallel)`; @alexey-milovidov flagged this on ClickHouse#102039.

The test is a crash-regression test for "Avoid crash in case of GROUP BY LowCardinality(Nullable(String)) column and group_by_overflow_mode='any'" (PR ClickHouse#29637 / ClickHouse#56057). The query is preserved verbatim, including all the problematic settings; only the assertion is changed to `SELECT count() FROM (...)`, which is always 10. If the underlying crash ever returns, the test will detect it; the specific row values were never deterministic to begin with.

Verified locally that the old reference fails under `max_threads = 4` (an extra null bucket appears and the last row drops, so the diff matches the CI report exactly). The new assertion returns `{"count()":10}` for all parallelism levels.

Closes ClickHouse#36069
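The order dependence described in this commit message can be illustrated with a small toy model. This is only a sketch, not ClickHouse's aggregator: the function `group_by_any`, the integer keys, and the use of Python dict insertion order as a stand-in for hash-iteration order are all assumptions made for illustration.

```python
def group_by_any(rows, max_rows_to_group_by):
    """Toy model of group_by_overflow_mode = 'any' (hypothetical helper).

    Once the table holds max_rows_to_group_by distinct keys, rows carrying
    a new key are skipped; rows for already-admitted keys are still
    aggregated. Returns {key: average of the key's values}.
    """
    sums, counts = {}, {}
    for key in rows:
        if key not in sums:
            if len(sums) >= max_rows_to_group_by:
                continue  # a new key arriving after the limit is dropped
            sums[key], counts[key] = 0, 0
        sums[key] += key
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}


# Assumed data: 1000 distinct integer keys (the real table is not shown here).
rows = list(range(1000))

forward = group_by_any(rows, max_rows_to_group_by=100)
backward = group_by_any(rows[::-1], max_rows_to_group_by=100)

# "LIMIT 10" without ORDER BY: take ten entries in iteration order.
first_ten_forward = list(forward)[:10]
first_ten_backward = list(backward)[:10]

print(first_ten_forward)
print(first_ten_backward)
```

Both runs admit exactly 100 keys and return ten rows, but the admitted keys and the ten returned rows differ because only the arrival order changed.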
cc @KochetovNicolai @alexey-milovidov: could you review this test-only fix? The original assertion was non-deterministic under …
alexey-milovidov approved these changes on May 18, 2026
groeneai added a commit to groeneai/ClickHouse that referenced this pull request on May 20, 2026:
Pick up fix ClickHouse#105194 (02180_group_by_lowcardinality stabilization) and other master fixes to clear chronic flakes on PR ClickHouse#105106 CI.
groeneai added a commit to groeneai/ClickHouse that referenced this pull request on May 21, 2026:
Brings in stabilization of `02180_group_by_lowcardinality` (PR ClickHouse#105194), which was the only CI failure on commit ec409ae; it had merged on master after this branch's last sync.

The stateless test `02180_group_by_lowcardinality` asserts a row-by-row deterministic output from a query that is inherently non-deterministic. The test currently fails ~10–20 times per day on master under `Stateless tests (amd_llvm_coverage, ParallelReplicas, s3 storage, parallel)` and has failed across 112 distinct PRs over the last 30 days. @alexey-milovidov flagged this on #102039.

Root cause

The query is:

    SELECT val, avg(toUInt32(val)) FROM t_group_by_lowcardinality GROUP BY val LIMIT 10 SETTINGS max_threads = 1, max_rows_to_group_by = 100, group_by_overflow_mode = 'any';

With `group_by_overflow_mode = 'any'`, only the first `max_rows_to_group_by` distinct keys make it into the hash table, and which keys those are depends on the order in which rows arrive at the aggregator. `LIMIT 10` without `ORDER BY` then selects ten of those keys in hash-iteration order.

The `max_threads = 1` clause does not help under `ParallelReplicas` (where the table is sharded across replicas and partial aggregations are merged) or under S3 storage (where reads can be split into independent ranges), so the resulting row set varies between runs.

Locally reproduced the exact CI failure diff by changing `max_threads = 1` to `max_threads = 4`: an extra `{"val":null,"avg(toUInt32(val))":null}` appears at the top and the last row drops off, identical to the diff shown in the CI report.

Fix

This test is a crash-regression test for "Avoid crash in case of GROUP BY LowCardinality(Nullable(String)) column and group_by_overflow_mode='any'" (PR #29637 / #56057). The query is preserved verbatim, including all the problematic settings; only the assertion is changed to `SELECT count() FROM (...)`, which is always 10 (the LIMIT value).

If the underlying crash ever returns, or `LIMIT` stops returning the expected number of rows, the test will detect it. The specific row values were never deterministic to begin with.

Originally tracked in #36069 (2022). The `-- Tags: no-random-settings` workaround masked the issue under settings randomization, but the underlying non-determinism returned once `ParallelReplicas` was enabled by default in some CI configurations.

CI report (one of many): https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=102039&sha=eb773718e41db7bafe973e9a67a6a3a195b48fe9&name_0=PR&name_1=Stateless%20tests%20%28amd_llvm_coverage%2C%20ParallelReplicas%2C%20s3%20storage%2C%20parallel%29

Closes #36069
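To show why asserting on the row count is stable while asserting on exact rows is not, here is a minimal Python sketch. It is a toy stand-in, not the real test or ClickHouse code: `query_rows`, the integer keys, and the three arrival orders (including the "interleaved" order that loosely mimics two independently read ranges) are hypothetical.

```python
def query_rows(arrival_order, max_rows_to_group_by=100, limit=10):
    """Hypothetical stand-in for the inner query: admit the first
    max_rows_to_group_by distinct keys, then return `limit` of them
    in the order they were admitted."""
    admitted = {}
    for key in arrival_order:
        if key in admitted or len(admitted) < max_rows_to_group_by:
            admitted[key] = True  # re-assigning keeps the original position
    return list(admitted)[:limit]


keys = list(range(1000))  # assumed data, not the real table contents
half = len(keys) // 2
# Two read ranges whose rows alternate on arrival (an assumption).
interleaved = [k for pair in zip(keys[:half], keys[half:]) for k in pair]

orders = {
    "sequential": keys,
    "reversed": keys[::-1],
    "interleaved": interleaved,
}
results = {name: query_rows(order) for name, order in orders.items()}

# Old-style assertion: compare exact rows against one recorded reference.
reference = results["sequential"]
old_passes = {name: rows == reference for name, rows in results.items()}

# New-style assertion: only the number of returned rows is checked.
new_passes = {name: len(rows) == 10 for name, rows in results.items()}

print(old_passes)
print(new_passes)
```

In this model the exact-row comparison passes only for the ordering that produced the reference, while the count check passes for every ordering. The check would still fail if the query raised an error or returned fewer than ten rows.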
Changelog category (leave one):
Not for changelog (CI Fix or Improvement)
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Documentation entry for user-facing changes
Version info
26.5.1.742