Enable `use_top_k_dynamic_filtering` and `use_skip_indexes_for_top_k` by default by alexey-milovidov · Pull Request #99537 · ClickHouse/ClickHouse · GitHub
Skip to content

Enable use_top_k_dynamic_filtering and use_skip_indexes_for_top_k by default#99537

Merged
alexey-milovidov merged 49 commits into
masterfrom
enable-top-k-optimizations-by-default
May 8, 2026
Merged

Enable use_top_k_dynamic_filtering and use_skip_indexes_for_top_k by default#99537
alexey-milovidov merged 49 commits into
masterfrom
enable-top-k-optimizations-by-default

Conversation

@alexey-milovidov

@alexey-milovidov alexey-milovidov commented Mar 16, 2026

Copy link
Copy Markdown
Member

Summary

  • Enable use_top_k_dynamic_filtering and use_skip_indexes_for_top_k settings by default
  • Add corresponding entries in settings changes history for version 26.3

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Enable use_top_k_dynamic_filtering and use_skip_indexes_for_top_k settings by default to improve performance of ORDER BY ... LIMIT N queries.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

No documentation changes needed — the settings and their documentation already exist; only the default values changed.

Version info

  • Merged into: 26.5.1.436

… by default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clickhouse-gh

clickhouse-gh Bot commented Mar 16, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-performance Pull request with some performance improvements label Mar 16, 2026
@shankar-iyer shankar-iyer self-assigned this Mar 16, 2026
@shankar-iyer

Copy link
Copy Markdown
Member

I will check the failures.

@alexey-milovidov

Copy link
Copy Markdown
Member Author

@shankar-iyer FYI, it is #91782

@shankar-iyer

Copy link
Copy Markdown
Member

@alexey-milovidov Let's enable only use_skip_indexes_for_top_k now? That aligns with our overall plan of skip indexes by default. use_top_k_dynamic_filtering is good when skip index is not present on the ORDER BY column, and provides only marginal improvement if skip index was present and used by use_skip_indexes_for_top_k.

@alexey-milovidov

Copy link
Copy Markdown
Member Author

use_top_k_dynamic_filtering is good when skip index is not present

I see it as a solid use-case. And indices by default will not always be enabled; there will be plenty of use cases without them. Why to afraid?

@shankar-iyer

shankar-iyer commented Mar 18, 2026

Copy link
Copy Markdown
Member

Why to afraid?

@alexey-milovidov

Best case -

SELECT URL, EventTime FROM hits WHERE URL LIKE '%google%' ORDER BY EventTime DESC LIMIT 10
  SETTINGS use_top_k_dynamic_filtering=0

10 rows in set. Elapsed: 3.374 sec. Processed 100.00 million rows, 10.02 GB (29.64 million rows/s., 2.97 GB/s.)


SELECT URL, EventTime FROM hits WHERE URL LIKE '%google%' ORDER BY EventTime DESC LIMIT 10
  SETTINGS use_top_k_dynamic_filtering=1

10 rows in set. Elapsed: 0.526 sec. Processed 100.00 million rows, 1.20 GB (189.97 million rows/s., 2.28 GB/s.)

But this one -


SELECT URL, EventTime FROM hits WHERE URL LIKE '%shankar%' ORDER BY EventTime DESC LIMIT 10
  SETTINGS use_top_k_dynamic_filtering=0

0 rows in set. Elapsed: 3.154 sec. Processed 100.00 million rows, 9.90 GB (31.70 million rows/s., 3.14 GB/s.)


SELECT URL, EventTime FROM hits WHERE URL LIKE '%shankar%' ORDER BY EventTime DESC LIMIT 10
  SETTINGS use_top_k_dynamic_filtering=0

0 rows in set. Elapsed: 3.507 sec. Processed 100.00 million rows, 10.30 GB (28.52 million rows/s., 2.94 GB/s.)

If there is not many rows to sort after filters, we have additional overhead of reading a 4-byte/8-byte column (Numeric/Date). What do you say - enable it and real world workloads should balance out?

All the test failures are not due to any issue in the optimization, but due to change in rows_read profile event, which I can address.

@alexey-milovidov

Copy link
Copy Markdown
Member Author

What do you say - enable it and real world workloads should balance out?

Yes, I think it will be beneficial on average!

@alexey-milovidov

Copy link
Copy Markdown
Member Author

Working on that in #91782

Disable `use_top_k_dynamic_filtering` and `use_skip_indexes_for_top_k` in the
test that checks read_rows for ORDER BY LIMIT, since these optimizations change
the reading strategy and are not what this test validates.

Restore the accidentally removed empty 26.4 version placeholders in
SettingsChangesHistory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread src/Core/SettingsChangesHistory.cpp Outdated
@CLAassistant

CLAassistant commented Mar 26, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@alexey-milovidov alexey-milovidov force-pushed the enable-top-k-optimizations-by-default branch from 1c18b6e to 7b20276 Compare March 26, 2026 23:47
@alexey-milovidov

Copy link
Copy Markdown
Member Author

As #103553 is merged, let's update the branch here.

@alexey-milovidov

Copy link
Copy Markdown
Member Author

@groeneai, fix Logical error: AggregateFunctionArray and send a separate PR with the fix.

@groeneai

groeneai commented May 6, 2026

Copy link
Copy Markdown
Contributor

@alexey-milovidovLogical error: AggregateFunctionArray on this PR is the chronic trunk bug STID 4870 family (parameters mismatch between Array wrapper and nested function for timeSeries*ToGridArray). It is not caused by this PR.

CIDB cross-PR (30 days, STID 4870 family): 43 master hits + 30+ unrelated PRs. The hit on this PR is STID 4870-4f21 (AST fuzzer (amd_debug) 2026-05-05 20:14 UTC), failing query:

SELECT timeSeriesResampleToGridWithStalenessArray(2, 120, toNullable(10), 10)(timestamps, `values`) FROM (...)

Error:

AggregateFunctionArray: parameters mismatch between Array wrapper 'timeSeriesResampleToGridWithStalenessArray' and nested function 'timeSeriesResampleToGridWithStaleness'.
Wrapper has 4 parameter(s): [UInt64_2, UInt64_120, UInt64_10, UInt64_10]
Nested function has 4 parameter(s): [Decimal64_'2', Decimal64_'120', Decimal64_'10', Decimal64_'10']

Root cause: AggregateFunctionTimeseriesBase normalizes its 4 timestamp parameters to Decimal64, but the Array combinator copies the user-typed UInt64 params. The chassert wrapper.params == nested.getParameters() was promoted to LOGICAL_ERROR in #102868, exposing the family.

Fix in flight: PR #103428 by @vitlibar"Resubmit: Preserve parameter types in timeseries aggregate functions". OPEN + mergeable, awaiting review from @tavplubix since 2026-04-23.

My earlier PR #103079 (fixing it from the Array-combinator side: have the wrapper inherit the nested function's normalized params) was closed by @vitlibar on 2026-04-23 in favor of his approach (preserve original parameter types in TimeSeries instead of normalizing). Not filing a duplicate — would conflict with the active work.

Action: This failure on #99537 will resolve once #103428 merges. The bottleneck is review on #103428. Tracking under our internal P0 task since 2026-04-17.

Session: cron:clickhouse-ci-task-worker:20260506-024500

@alexey-milovidov

Copy link
Copy Markdown
Member Author
  Genuine regressions caused by the PR (changed, never seen on master in last 30d)                                                                                                                             
                                                                  
  ┌──────────┬───────────────────┬──────────────────────────────────────────────────┬─────────────────────────────┐                                                                                            
  │ Slowdown │       Test        │                      Query                       │ Master 30d (slower / total) │
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.63x    │ string_sort #10   │ ORDER BY MobilePhoneModel LIMIT 300              │ 0 / 35 — new                │
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.38x    │ sparse_column #52 │ test_sparse_1000 ORDER BY u8, u64 DESC LIMIT 100 │ 0 / 49 — new                │                                                                                            
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.35x    │ sparse_column #47 │ test_full_1000 ORDER BY u8, u64 DESC LIMIT 100   │ 0 / 45 — new                │                                                                                            
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.34x    │ sparse_column #49 │ test_full_10 ORDER BY u8, u64 DESC LIMIT 100     │ 0 / 58 — new                │
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.20x    │ string_sort #117  │ ORDER BY MobilePhoneModel, SearchPhrase LIMIT 10 │ 0 / 30 — new                │
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.16x    │ string_sort #108  │ ORDER BY SearchPhrase, URL LIMIT 10              │ 0 / 24 — new                │
  └──────────┴───────────────────┴──────────────────────────────────────────────────┴─────────────────────────────┘

@shankar-iyer

Copy link
Copy Markdown
Member
  Genuine regressions caused by the PR (changed, never seen on master in last 30d)                                                                                                                             
                                                                  
  ┌──────────┬───────────────────┬──────────────────────────────────────────────────┬─────────────────────────────┐                                                                                            
  │ Slowdown │       Test        │                      Query                       │ Master 30d (slower / total) │
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.63x    │ string_sort #10   │ ORDER BY MobilePhoneModel LIMIT 300              │ 0 / 35 — new                │
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.38x    │ sparse_column #52 │ test_sparse_1000 ORDER BY u8, u64 DESC LIMIT 100 │ 0 / 49 — new                │                                                                                            
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.35x    │ sparse_column #47 │ test_full_1000 ORDER BY u8, u64 DESC LIMIT 100   │ 0 / 45 — new                │                                                                                            
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.34x    │ sparse_column #49 │ test_full_10 ORDER BY u8, u64 DESC LIMIT 100     │ 0 / 58 — new                │
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.20x    │ string_sort #117  │ ORDER BY MobilePhoneModel, SearchPhrase LIMIT 10 │ 0 / 30 — new                │
  ├──────────┼───────────────────┼──────────────────────────────────────────────────┼─────────────────────────────┤                                                                                            
  │ 1.16x    │ string_sort #108  │ ORDER BY SearchPhrase, URL LIMIT 10              │ 0 / 24 — new                │
  └──────────┴───────────────────┴──────────────────────────────────────────────────┴─────────────────────────────┘

Let me check.

alexey-milovidov added a commit that referenced this pull request May 6, 2026
The `SettingsChangesHistory` entry for
`use_top_k_dynamic_filtering_for_variable_length_types` previously
recorded `{false, false}`, which means `compatibility` for older
versions would still leave variable-length types out of the dynamic
filter. In older versions, when a user opted into
`use_top_k_dynamic_filtering = 1` (or once PR #99537 turns it on by
default), variable-length sort columns received the dynamic filter
unconditionally — i.e. the effective value of the new gate was `true`.

Change the entry to `{true, false}` so `compatibility` correctly
restores the previous behavior.

Flagged by AI Review on PR #104216.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nihalzp pushed a commit to nihalzp/ClickHouse that referenced this pull request May 7, 2026
PR ClickHouse#99537 turns on use_top_k_dynamic_filtering by default. For sort
columns of fixed-length types (numbers, dates, etc.) this is a clear
win: the per-row __topKFilter prewhere lets minmax/threshold-based
mark skipping kick in early. For variable-length types (String, Array,
Map, Tuple containing such), the per-row threshold comparison runs the
generic less-or-equals over the column with no granule pruning, so
when the lex-min value of the column dominates (mostly empty strings,
e.g. MobilePhoneModel where 94/100M rows are "") the prewhere costs
CPU without saving any I/O.

Locally reproducing the regression flagged in the perf comparison for
PR ClickHouse#99537, batch 3/6 amd_release:

  SELECT MobilePhoneModel FROM hits_100m_single
    ORDER BY MobilePhoneModel LIMIT 300

  master default (both off): 0.013-0.015s
  PR ClickHouse#99537 default  (both on): 0.022-0.026s   1.47-1.63x slower

Profile events show identical SelectedRows (99,997,497) between the
two — the dynamic filter skips zero granules but still pays a
per-block lessOrEquals(String, String) cost, increasing UserTime by
~1.8x.

The fix introduces use_top_k_dynamic_filtering_for_variable_length_types
(off by default). When use_top_k_dynamic_filtering is enabled, the
optimization is now restricted to columns whose values have a fixed
maximum size in memory — i.e. haveMaximumSizeOfValue() returns true.
That predicate forwards through Nullable and Tuple, so Nullable of a
fixed-length type and Tuples of fixed-length types remain eligible,
matching the user's request. Variable-length types only get the
optimization when the user explicitly opts in.

Test queries follow MobilePhoneModel's distribution and check the
EXPLAIN output for the __topKFilter prewhere with each gate combination.
End-to-end timing on the same data confirms the regression is gone with
the new default settings:

  PR ClickHouse#99537 settings on, new gate off (default): 0.017s
  All on (old broken behavior):                  0.026s
  master baseline (top-k off):                   0.013s

CI report:
https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=99537&sha=9d86b8cf27ca90c5223f949656f0141685ea3f6f&name_0=PR&name_1=Performance%20Comparison%20%28amd_release%2C%20master_head%2C%203%2F6%29
Original PR: ClickHouse#99537

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Remove unrelated changes in vectorsearchstress.yml.

Update the branch to include the new changes that should fix the performance.

Restore the file to its master state — the runner change to `arm-large`
was unrelated to enabling top-k optimizations and should not be in this
PR.

Per review comment:
#99537 (comment)
@clickhouse-gh

clickhouse-gh Bot commented May 7, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.20% +0.10%
Functions 91.10% 91.10% +0.00%
Branches 76.60% 76.70% +0.10%

Changed lines: 100.00% (27/27) · Uncovered code

Full report · Diff report

@alexey-milovidov alexey-milovidov added this pull request to the merge queue May 8, 2026
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Merged via the queue into master with commit b1037f3 May 8, 2026
165 checks passed
@alexey-milovidov alexey-milovidov deleted the enable-top-k-optimizations-by-default branch May 8, 2026 11:56
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label May 8, 2026
pull Bot pushed a commit to admariner/ClickHouse that referenced this pull request Jun 9, 2026
Two failures showed up only on the internal CI flavours:

1. `test_distributed_stateless::test_simple_query` failed with
 `Unknown function __topKFilter` once `use_top_k_dynamic_filtering`
 became the default. The dynamic-filtering path injects an internal
 `__topKFilter` that is created on demand with a runtime threshold
 tracker and is not registered in `FunctionFactory`; the
 skip-index-on-data-read path likewise relies on a
 `TopKThresholdTracker` shared between `SortingStep` and
 `ReadFromMergeTree`. Neither survives serialization to remote
 workers, so when `make_distributed_plan = 1` the optimizer must skip
 the rewrite. Thread `make_distributed_plan` through
 `Optimization::ExtraSettings` and bail out early in
 `tryOptimizeTopK`.

2. `01926_order_by_desc_limit` timed out in the
 `amd_asan_ubsan, flaky check, s3 storage, meta in keeper` flavour:
 the flaky check runs the test many times in parallel and each run
 takes >240 s because of `INSERT` + `OPTIMIZE FINAL` over remote
 storage under sanitizers. The only modification from the previous
 revision is two `SET` lines that are no-ops for this query
 (sort column is the primary key with `optimize_read_in_order`,
 no skip index defined), so re-running it adds no signal. Add the
 `no-flaky-check` tag with a comment explaining why.

CI:
Upstream PR: ClickHouse#99537

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-performance Pull request with some performance improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants