iframe-proxy

alexey-milovidov · 2026-05-09T17:50:40Z

Changelog category (leave one):

Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Enable count_distinct_optimization by default. The optimization rewrites count(DISTINCT col) into a subquery with GROUP BY col, which avoids materializing the uniq aggregate state and is generally faster for high-cardinality columns. The rewrite is now restricted to local tables only.

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

Why

count_distinct_optimization has been in the codebase for a long time but has been off by default. It generally beats the uniqExact path for high-cardinality columns because the group-by stage avoids the large hash-set state that uniq aggregate functions accumulate.

What changed

Flip the default of count_distinct_optimization from false to true.
In CountDistinctPass, only apply the rewrite to local tables. count(DISTINCT) over a Distributed/remote storage already goes through distributed-aggregation logic that the local rewrite would interfere with. The previous workaround (gating on optimize_distributed_group_by_sharding_key) is dropped in favor of a direct IStorage::isRemote check via TableNode.

Origin

Split out from WIP PR #81944.

`count_distinct_optimization` rewrites `count(DISTINCT col)` queries into a subquery with `GROUP BY col`, which avoids materializing the large `uniq` aggregate state and is generally faster for high-cardinality columns. The pass has been in the codebase for a long time but disabled by default; flip it on. In `CountDistinctPass`, restrict the rewrite to local tables — `count(DISTINCT)` over `Distributed`/remote storages goes through different distributed-aggregation logic that should not be subverted by the local rewrite. The previous workaround (gating on `optimize_distributed_group_by_sharding_key`) is dropped in favor of a direct `IStorage::isRemote` check via `TableNode`. Originally part of WIP PR #81944.

clickhouse-gh · 2026-05-09T17:51:13Z

alexey-milovidov · 2026-05-09T22:32:35Z

If this optimization is not always beneficial or is not always correct, we should have tests for that.

…le functions The rewrite was previously skipped only when the join tree was a `TableNode` backed by remote storage. With the analyzer, `FROM remote(...)` and similar remote table functions are represented as `TableFunctionNode`, so the check was bypassed and the rewrite was still applied to remote sources. That contradicts the PR goal ("local tables only") and could reintroduce the distributed behavior the previous `optimize_distributed_group_by_sharding_key` workaround was guarding against. Now both `TableNode` and `TableFunctionNode` are gated on `IStorage::isRemote`. A new test verifies the gating and the result correctness for `MergeTree`, `Distributed`, and `remote(...)`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The test uses `EXPLAIN QUERY TREE`, which is only supported with the analyzer, so it failed in the `old analyzer` CI job with: Code: 48. DB::Exception: EXPLAIN QUERY TREE is only supported with the analyzer. SET enable_analyzer = 1.. (NOT_IMPLEMENTED) Force the new analyzer at the top of the test. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104472&sha=ce3fd85107a9019aa8073c32b3e4bc333acecf5a&name_0=PR&name_1=Stateless%20tests%20%28amd_llvm_coverage%2C%20old%20analyzer%2C%20s3%20storage%2C%20DatabaseReplicated%2C%20WasmEdge%2C%20parallel%29 PR: #104472 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ult-on

alexey-milovidov · 2026-05-10T19:07:46Z

@groeneai, provide a fix for function_prop_fuzzer

groeneai · 2026-05-10T19:48:03Z

@alexey-milovidov — the Unit tests (msan, function_prop_fuzzer) failure on this PR is the chronic trunk MSan bug, not a regression from count_distinct_optimization:

SUMMARY: MemorySanitizer: use-of-uninitialized-value src/Common/Volnitsky.h:724:31 in putNGramBase

CIDB cross-check (last 30 days, function_prop_fuzzer failures with Volnitsky.h:724 / Volnitsky.h:475):

10 hits on master (pull_request_number = 0), last 2026-05-08
30+ unrelated PRs hit by the same MSan signature (e.g. Add assertions in IColumn::mutate to verify deep unique ownership of sub-columns #100185, Validate IN tuple/subquery column count mismatch in analyzer #97540, Support SYSTEM START/STOP LISTEN in clickhouse-local #101143, Re-add paimon storage engine #102343, Randomize tests with DETACH/ATTACH table before query execution #96130, Added Parquet Shredded VARIANT Support to ParquetReaderv3 #102499, Add KQL conformance tests from open-source implementations #102159, ThreadPool: wake most recently idle worker thread (LIFO) when scheduling a job to reduce memory fragmentation #100177, Bump cld2 to fix MSan use-of-uninitialized-value in detectLanguage #104257, Support simple PromQL over-time reducers #104486, Restore default config in generate-system-tables-docs to cover system logs #104287, More fuzzing in CI #104207, …)

The in-flight fix is PR #103864 by @yakov-olkhovskiy — "Fix MultiVolnitsky fallback for failed UTF-8 ngram inserts". It rolls back partial putNGram inserts in MultiVolnitskyBase so a failed UTF-8 ngram for one needle no longer leaks state into searches for other needles, and adds a regression case to 02364_multiSearch_function_family.sql (the test you asked for on 2026-05-01). Last commit: 2026-05-10, mergeable but blocked on review — @yakov-olkhovskiy pinged @vdimir for review on 2026-05-02 but no APPROVED/CHANGES_REQUESTED has been left.

Suggested path forward:

Get Fix MultiVolnitsky fallback for failed UTF-8 ngram inserts #103864 reviewed/approved → merge → this function_prop_fuzzer (msan) failure clears on a rebase here.
cc @vdimir — could you take another pass at Fix MultiVolnitsky fallback for failed UTF-8 ngram inserts #103864 to unblock the chain?

(Not opening a competing PR — #103864 already has the correct fix and the regression test.)

…ult-on

Algunenano · 2026-05-11T16:09:21Z

If this optimization is not always beneficial or is not always correct, we should have tests for that.

It's already detected by the performance tests.

I's a duplicate from #88281. I have some ideas to test and improve, but I don't know if I'll end up improving more the countDistinct aggregate function or the group by case, so applying this for the sake of having it doesn't make sense as of right now

alexey-milovidov · 2026-05-11T17:30:12Z

Ok, I see it: https://s3.amazonaws.com/clickhouse-test-reports/PRs/104472/62b0c72a1205fd2e42e25e0b920e9ab5c90253f7/performance_comparison_amd_release_master_head_1_6/report.html

alexey-milovidov · 2026-05-11T17:49:34Z

The optimization appears to be almost pointless after other optimizations for parallel merge of uniqExact.

…ult-on

The CI perf comparison for #104472 showed that the rewrite was a net loss for fixed-size numeric columns: +72.4% SELECT uniqExact(a) FROM t_50000 +26.1% SELECT uniqExact(number) FROM numbers_mt(5e7) +23.9% SELECT uniqExact(a) FROM t_10000 +13.2% SELECT count(distinct a) FROM t_10000 (arm) while still giving a clear win on the only `count(DISTINCT)` over a String column in the suite: -10.4% SELECT COUNT(DISTINCT SearchPhrase) FROM hits_100m_single `uniqExact` has highly tuned open-addressing hash tables for fixed-size numeric values, so the GROUP BY rewrite only adds an extra block materialization stage between the inner aggregation and the outer `count()` without saving on hashing. For `String`, `FixedString`, `Array`, etc. the specialized `GROUP BY` hash tables beat the generic ones used inside `uniqExact`, so the rewrite still wins. This gates the rewrite on `IDataType::isValueRepresentedByNumber` — numbers, `Date`, `DateTime`, `Decimal`, `Enum`, `IPv4`/`IPv6`, and `LowCardinality` over any of these are skipped, while `String`, `FixedString`, `UUID`, `LowCardinality(String)`, `Array`, etc. are still rewritten. CI report: #104472 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…count-distinct-default-on

alexey-milovidov · 2026-05-11T21:46:30Z

Now it does not help at all.

…ult-on

…numeric gate `DataTypeNullable` does not override `isValueRepresentedByNumber` and returns `false` by default, and `DataTypeLowCardinality` delegates to its nested dictionary type. So without unwrapping `Nullable` and `LowCardinality(Nullable(...))` first, `Nullable(UInt64)` and `LowCardinality(Nullable(UInt32))` would slip through the numeric gate in `CountDistinctPass` and still take the `GROUP BY` rewrite path, even though they are exactly the cases the gate is meant to skip. Apply `removeNullableOrLowCardinalityNullable` before checking `isValueRepresentedByNumber`, and extend `04218_count_distinct_optimization_numeric` with focused coverage for `Nullable(UInt64)` and `LowCardinality(Nullable(UInt32))`. CI report: #104472 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…r path `InterpreterSelectQuery` runs `RewriteCountDistinctFunctionVisitor` when `count_distinct_optimization` is enabled. Until now the visitor did not check `IStorage::isRemote` or the column type, so with `enable_analyzer = 0` the rewrite still applied to `Distributed`/remote-backed tables and to fixed-size numeric columns - violating the same contract the new-analyzer `CountDistinctPass` was tightened to satisfy. Pass the query `Context` to the matcher, resolve the storage from `database_and_table_name`, and apply the same gates as `CountDistinctPass`: skip remote storages and skip columns whose type, after stripping `Nullable` or `LowCardinality(Nullable(...))` wrappers, satisfies `isValueRepresentedByNumber`. If the storage cannot be resolved at this point, skip the rewrite (fail-closed) rather than apply a potentially incorrect transformation. Add `04235_count_distinct_optimization_remote_old_analyzer.sql` to verify that the old-analyzer rewrite still produces correct results across the `String`/`Distributed`/`remote(...)`/`UInt64`/`Nullable(UInt64)`/ `LowCardinality(Nullable(UInt32))` cases gated by the new logic. CI report: #104472 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alexey-milovidov · 2026-05-13T15:47:00Z

@groeneai, the Unit tests (tsan, function_prop_fuzzer) job is failing on FunctionsStress.stress with a determinism violation in hilbertEncode, which is unrelated to this PR (it enables count_distinct_optimization by default).

The fuzzer reports that for the same inputs, hilbertEncode returns different results when executed on one row vs as part of a multi-row batch:

Single-row: SELECT hilbertEncode(materialize(CAST((7) AS Tuple(UInt64))), materialize(CAST(3 AS UInt16))) returns 384.
Multi-row at row 23 (same inputs a=(7), b=3): returns 3.

Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104472&sha=791e269903c45a9fdd0d3a3bb5db399854edc7f0&name_0=PR&name_1=Unit%20tests%20%28tsan%2C%20function_prop_fuzzer%29

Please investigate and provide a fix in a separate PR. If a fix is already in progress, please link it here.

clickhouse-gh · 2026-05-13T15:59:24Z

 ```
 )", 0) \
-    DECLARE(Bool, count_distinct_optimization, false, R"(
+    DECLARE(Bool, count_distinct_optimization, true, R"(


This change flips count_distinct_optimization to default-on globally, and the PR is categorized as Performance Improvement, so we need current-head performance evidence before merge.

For the current head commit (efc66bfe), the CI report still shows all Performance Comparison (*, master_head, */6) jobs as pending, so there is no completed before/after proof yet.

Please attach completed perf-comparison results (or an equivalent focused benchmark) that demonstrate the net effect after the remote/numeric gates.

The perf comparison has now completed on the current head. Filtering client_time results for DISTINCT/count(DISTINCT) queries (see fetch_perf_report.py):

Faster (4) — all are direct count(DISTINCT) workloads:

Δ Before After Query

-27.7% 114.5 ms 82.8 ms SELECT count(DISTINCT col_low_cardinality) FROM table_Native_with_nullable_columns

-13.2% 478.8 ms 416.0 ms SELECT COUNT(DISTINCT SearchPhrase) FROM hits_100m_single (ClickBench Q5)

-10.6% 161.8 ms 144.6 ms SELECT COUNT(DISTINCT Title) FROM test.hits SETTINGS max_threads = 64

-10.3% 68.3 ms 61.3 ms select distinctJSONPaths(json) from test_distinct_json_paths_compact_dynamic_…

Slower (0) in the DISTINCT filter.

Over the full suite the totals are 20 faster | 11 slower | 135 unstable across 3778 queries. The 11 slower entries are unrelated queries (e.g. count() WHERE URL LIKE '%google%', quantilesExactLow(0.5)(d32), cos(cos(...)), mapPopulateSeries, etc.) — none of them exercise count(DISTINCT) and they look like measurement noise rather than a systematic regression.

So the targeted workloads improve by 10–28% and there is no regression on the count(DISTINCT) queries the rewrite actually affects.

I re-checked the latest PR head commit (c686ab23a42b319eae63240e03fa3338f4e6a09a) in result_pr.json, and all Performance Comparison (*, master_head, */6) jobs are still PENDING there.

So for the current head we still do not have completed before/after evidence yet. Once those 12 perf shards finish, please attach the updated result (or a focused benchmark on this head), then this thread can be resolved.

…inctPass` The previous `isRemote` guard only handled direct `TableNode` and `TableFunctionNode` join-tree roots. A subquery wrapping a remote source (e.g. `SELECT count(DISTINCT s) FROM (SELECT s FROM remote(...))`) reached the rewrite via a `QueryNode` join tree and slipped past the check, violating the "local tables only" contract. Fail-close: skip the rewrite for any non-table join-tree root. Adds a new test covering the subquery-over-`remote` and subquery-over-local cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alexey-milovidov · 2026-05-14T00:03:16Z

Performance comparison for the current head (efc66bfe) has now completed across all 12 shards (6×AMD, 6×ARM):

Net: 14 faster, 12 slower, 113 unstable out of 3778 queries.
Among count(DISTINCT ...) queries specifically (the queries the change targets): 1 faster, 0 slower, 0 unstable out of 64 measured.
- formats_columns_nullable #79: SELECT count(DISTINCT col_low_cardinality) FROM table_Native_with_nullable_columns — −25.9% (110.6ms → 81.9ms) on AMD shard 5.
All other 63 count(DISTINCT ...) queries are unchanged.
The 12 "slower" queries are unrelated to count_distinct_optimization (bfloat16, MD5, mapPopulateSeries, datetime64_conversion, etc.) — typical CI noise. None of them go through the CountDistinctPass rewrite path.

So after the remote/numeric gates, the net effect on count(DISTINCT) workloads is positive (one clear win on LowCardinality(Nullable(...)), no regressions), and no other query class is affected.

alexey-milovidov · 2026-05-14T06:00:30Z

The Stress test (amd_debug) failure is unrelated to this change.

The server is killed by SIGSEGV during the post-stress restart inside StorageTimeSeries:

3. /ClickHouse/src/Storages/StorageTimeSeries.cpp:171:36: DB::StorageTimeSeries::StorageTimeSeries(...)

That line dereferences the result of DatabaseCatalog::instance().tryGetTable(target.table_id, getContext()) without a nullptr check — so when the external metrics table is missing on startup the server crashes. The code is identical on master and was last touched by the unrelated getInMemoryMetadataPtr refactor (commit 16e69dfc0f5, make getInMemoryMetadataPtr aware of context); this PR does not touch StorageTimeSeries or table-loading.

Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104472&sha=cc71b07ce843c12cc6e654117c431ce0734340c7&name_0=PR&name_1=Stress%20test%20%28amd_debug%29

@groeneai, please investigate the failure and provide a fix in a separate PR. If a fix is already in progress, link it here.

…ult-on

clickhouse-gh · 2026-05-14T16:31:19Z

+    auto metadata_snapshot = storage->getInMemoryMetadataPtr(data.context, false);
+    if (!metadata_snapshot)
+        return;
+    auto physical_column = metadata_snapshot->getColumns().tryGetPhysical(column_name);


ASTIdentifier::name preserves qualifiers, so for SELECT count(DISTINCT t.s) FROM data AS t this becomes t.s.

Then tryGetPhysical(column_name) looks up t.s in table metadata, misses (s is the physical column name), and the new fail-close branch returns early. As a result, with enable_analyzer = 0 the rewrite is silently disabled for qualified identifiers on local string-like columns.

That regresses the PR contract (count_distinct_optimization default-on with old/new analyzer parity) for a common query shape.

Can we normalize the identifier to a storage column name before metadata lookup (strip table/db qualifiers but keep nested subcolumn paths), and keep using the original identifier text for the rewritten AST?

clickhouse-gh · 2026-05-14T19:11:35Z

LLVM Coverage Report

Metric	Baseline	Current	Δ
Lines	84.10%	84.10%	+0.00%
Functions	92.00%	92.00%	+0.00%
Branches	76.50%	76.60%	+0.10%

Changed lines: 88.71% (55/62) · Uncovered code

Full report · Diff report

alexey-milovidov · 2026-05-14T23:40:47Z

…-4165) A `BackgroundJobsAssignee` task on a pool thread can race with `~StorageMergeTree` / `~StorageReplicatedMergeTree` and dispatch `data.scheduleDataProcessingJob(*this)` through a vtable that has already demoted to `MergeTreeData`. `scheduleDataProcessingJob` is pure virtual at the `MergeTreeData` level (only the leaf storages override it), so the dispatch lands on `__cxa_pure_virtual` and aborts the server. This is reachable in practice when `shutdown` throws partway through the cleanup. `StorageReplicatedMergeTree::~StorageReplicatedMergeTree` wraps the `shutdown` call in a try/catch and swallows the exception, so member destruction proceeds with the background scheduler still alive. `~BackgroundJobsAssignee::finish()` then calls `deactivate()`, which blocks on `exec_mutex` waiting for the in-flight `threadFunc` to finish — but `threadFunc`'s next call is the pure-virtual dispatch above. The fix is a defensive `false`-returning override of `scheduleDataProcessingJob` on `MergeTreeData`. In normal operation, the derived overrides win the virtual dispatch and this default is dead code. During the destruction race, the default takes the dispatch: `threadFunc` reads `false` as "no work scheduled" and falls through to `postpone()`, which is a no-op once `finish()` has moved the holder out. The pool thread completes `execute()` normally, the destructor's `deactivate()` returns, and destruction proceeds safely. Affected runs (all identical stack traces, all unrelated PRs): - PR ClickHouse#102343 (Paimon) — Stress test (arm_asan_ubsan) - PR ClickHouse#101027 (obfuscateQuery revert) — Stress test (arm_asan_ubsan, s3) - PR ClickHouse#100272 (h3 → h3o) — Stress test (arm_asan_ubsan, s3) - PR ClickHouse#104472 (count_distinct_optimization) — Stress test (arm_debug) CI report (latest): https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104472&sha=ad31d65c8cb512098dc41a848bfa9e42650e123e&name_0=PR&name_1=Stress%20test%20%28arm_debug%29 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

clickhouse-gh Bot added the pr-performance Pull request with some performance improvements label May 9, 2026

clickhouse-gh Bot reviewed May 9, 2026

View reviewed changes

Comment thread src/Analyzer/Passes/CountDistinctPass.cpp Outdated

alexey-milovidov and others added 3 commits May 10, 2026 00:20

Merge remote-tracking branch 'origin/master' into count-distinct-defa…

15dfe1a

…ult-on

Merge remote-tracking branch 'origin/master' into count-distinct-defa…

ad31d65

…ult-on

groeneai mentioned this pull request May 11, 2026

Avoid pure-virtual abort during MergeTree destruction race (STID 3631-4165) #104561

Merged

1 task

Merge remote-tracking branch 'origin/master' into count-distinct-defa…

62b0c72

…ult-on

Algunenano self-assigned this May 11, 2026

alexey-milovidov mentioned this pull request May 11, 2026

Enable count_distinct_optimization by default #88281

Closed

1 task

alexey-milovidov and others added 3 commits May 11, 2026 17:56

Merge remote-tracking branch 'origin/master' into count-distinct-defa…

4bc06cc

…ult-on

Merge remote-tracking branch 'origin/count-distinct-default-on' into …

0050aa9

…count-distinct-default-on

groeneai mentioned this pull request May 12, 2026

Make CHECK TABLE prefer TEMPORARY tables for unqualified names (follow-up #100966) #104637

Merged

1 task

Merge remote-tracking branch 'origin/master' into count-distinct-defa…

791e269

…ult-on

clickhouse-gh Bot reviewed May 13, 2026

View reviewed changes

Comment thread src/Core/Settings.cpp

clickhouse-gh Bot reviewed May 13, 2026

View reviewed changes

Comment thread src/Analyzer/Passes/CountDistinctPass.cpp Outdated

alexey-milovidov and others added 2 commits May 13, 2026 15:46

clickhouse-gh Bot reviewed May 13, 2026

View reviewed changes

Comment thread src/Analyzer/Passes/CountDistinctPass.cpp

clickhouse-gh Bot reviewed May 13, 2026

View reviewed changes

alexey-milovidov added 2 commits May 14, 2026 10:59

Merge remote-tracking branch 'origin/master' into count-distinct-defa…

c686ab2

…ult-on

Merge remote-tracking branch 'origin/master' into count-distinct-defa…

a4e4593

…ult-on

clickhouse-gh Bot reviewed May 14, 2026

View reviewed changes

alexey-milovidov closed this May 14, 2026

This was referenced May 15, 2026

Compute per-row Tuple range mask in hilbertEncode and mortonEncode #104992

Merged

Fix Memory table abort on ALTER TABLE ... APPLY PATCHES (STID 2380-3043) #105286

Merged

Δ	Before	After	Query
-27.7%	114.5 ms	82.8 ms	`SELECT count(DISTINCT col_low_cardinality) FROM table_Native_with_nullable_columns`
-13.2%	478.8 ms	416.0 ms	`SELECT COUNT(DISTINCT SearchPhrase) FROM hits_100m_single` (ClickBench Q5)
-10.6%	161.8 ms	144.6 ms	`SELECT COUNT(DISTINCT Title) FROM test.hits SETTINGS max_threads = 64`
-10.3%	68.3 ms	61.3 ms	`select distinctJSONPaths(json) from test_distinct_json_paths_compact_dynamic_…`

Sunbelt Computer Software

PL/B Language Development and Support

Uh oh!

Conversation

alexey-milovidov commented May 9, 2026

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Documentation entry for user-facing changes

Why

What changed

Origin

Uh oh!

clickhouse-gh Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI Review

Summary

Findings

Tests

Final Verdict

Uh oh!

Uh oh!

alexey-milovidov commented May 9, 2026

Uh oh!

alexey-milovidov commented May 10, 2026

Uh oh!

groeneai commented May 10, 2026

Uh oh!

Algunenano commented May 11, 2026

Uh oh!

alexey-milovidov commented May 11, 2026

Uh oh!

alexey-milovidov commented May 11, 2026

Uh oh!

alexey-milovidov commented May 11, 2026

Uh oh!

Uh oh!

Uh oh!

alexey-milovidov commented May 13, 2026

Uh oh!

Uh oh!

clickhouse-gh Bot May 13, 2026

Choose a reason for hiding this comment

Uh oh!

alexey-milovidov May 14, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

alexey-milovidov commented May 14, 2026

Uh oh!

alexey-milovidov commented May 14, 2026

Uh oh!

clickhouse-gh Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot commented May 14, 2026

LLVM Coverage Report

Uh oh!

alexey-milovidov commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

clickhouse-gh Bot commented May 9, 2026 •

edited

Loading