Enable use_top_k_dynamic_filtering and use_skip_indexes_for_top_k by default#99537
Conversation
… by default Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
I will check the failures. |
|
@shankar-iyer FYI, it is #91782 |
|
@alexey-milovidov Let's enable only |
I see it as a solid use-case. And indices by default will not always be enabled; there will be plenty of use cases without them. Why to afraid? |
Best case - But this one - If there is not many rows to sort after filters, we have additional overhead of reading a 4-byte/8-byte column (Numeric/Date). What do you say - enable it and real world workloads should balance out? All the test failures are not due to any issue in the optimization, but due to change in rows_read profile event, which I can address. |
Yes, I think it will be beneficial on average! |
|
Working on that in #91782 |
Disable `use_top_k_dynamic_filtering` and `use_skip_indexes_for_top_k` in the test that checks read_rows for ORDER BY LIMIT, since these optimizations change the reading strategy and are not what this test validates. Restore the accidentally removed empty 26.4 version placeholders in SettingsChangesHistory. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…o enable-top-k-optimizations-by-default
…ickHouse/ClickHouse into enable-top-k-optimizations-by-default
…zations-by-default
1c18b6e to
7b20276
Compare
|
As #103553 is merged, let's update the branch here. |
…zations-by-default
|
@groeneai, fix |
|
@alexey-milovidov — CIDB cross-PR (30 days, STID 4870 family): 43 master hits + 30+ unrelated PRs. The hit on this PR is STID 4870-4f21 ( SELECT timeSeriesResampleToGridWithStalenessArray(2, 120, toNullable(10), 10)(timestamps, `values`) FROM (...)Error: Root cause: Fix in flight: PR #103428 by @vitlibar — "Resubmit: Preserve parameter types in timeseries aggregate functions". OPEN + mergeable, awaiting review from @tavplubix since 2026-04-23. My earlier PR #103079 (fixing it from the Array-combinator side: have the wrapper inherit the nested function's normalized params) was closed by @vitlibar on 2026-04-23 in favor of his approach (preserve original parameter types in TimeSeries instead of normalizing). Not filing a duplicate — would conflict with the active work. Action: This failure on #99537 will resolve once #103428 merges. The bottleneck is review on #103428. Tracking under our internal P0 task since 2026-04-17. Session: cron:clickhouse-ci-task-worker:20260506-024500 |
…zations-by-default
|
Let me check. |
The `SettingsChangesHistory` entry for
`use_top_k_dynamic_filtering_for_variable_length_types` previously
recorded `{false, false}`, which means `compatibility` for older
versions would still leave variable-length types out of the dynamic
filter. In older versions, when a user opted into
`use_top_k_dynamic_filtering = 1` (or once PR #99537 turns it on by
default), variable-length sort columns received the dynamic filter
unconditionally — i.e. the effective value of the new gate was `true`.
Change the entry to `{true, false}` so `compatibility` correctly
restores the previous behavior.
Flagged by AI Review on PR #104216.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…zations-by-default
PR ClickHouse#99537 turns on use_top_k_dynamic_filtering by default. For sort columns of fixed-length types (numbers, dates, etc.) this is a clear win: the per-row __topKFilter prewhere lets minmax/threshold-based mark skipping kick in early. For variable-length types (String, Array, Map, Tuple containing such), the per-row threshold comparison runs the generic less-or-equals over the column with no granule pruning, so when the lex-min value of the column dominates (mostly empty strings, e.g. MobilePhoneModel where 94/100M rows are "") the prewhere costs CPU without saving any I/O. Locally reproducing the regression flagged in the perf comparison for PR ClickHouse#99537, batch 3/6 amd_release: SELECT MobilePhoneModel FROM hits_100m_single ORDER BY MobilePhoneModel LIMIT 300 master default (both off): 0.013-0.015s PR ClickHouse#99537 default (both on): 0.022-0.026s 1.47-1.63x slower Profile events show identical SelectedRows (99,997,497) between the two — the dynamic filter skips zero granules but still pays a per-block lessOrEquals(String, String) cost, increasing UserTime by ~1.8x. The fix introduces use_top_k_dynamic_filtering_for_variable_length_types (off by default). When use_top_k_dynamic_filtering is enabled, the optimization is now restricted to columns whose values have a fixed maximum size in memory — i.e. haveMaximumSizeOfValue() returns true. That predicate forwards through Nullable and Tuple, so Nullable of a fixed-length type and Tuples of fixed-length types remain eligible, matching the user's request. Variable-length types only get the optimization when the user explicitly opts in. Test queries follow MobilePhoneModel's distribution and check the EXPLAIN output for the __topKFilter prewhere with each gate combination. End-to-end timing on the same data confirms the regression is gone with the new default settings: PR ClickHouse#99537 settings on, new gate off (default): 0.017s All on (old broken behavior): 0.026s master baseline (top-k off): 0.013s CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=99537&sha=9d86b8cf27ca90c5223f949656f0141685ea3f6f&name_0=PR&name_1=Performance%20Comparison%20%28amd_release%2C%20master_head%2C%203%2F6%29 Original PR: ClickHouse#99537 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Remove unrelated changes in vectorsearchstress.yml. Update the branch to include the new changes that should fix the performance. |
Restore the file to its master state — the runner change to `arm-large` was unrelated to enabling top-k optimizations and should not be in this PR. Per review comment: #99537 (comment)
…zations-by-default
…zations-by-default
LLVM Coverage Report
Changed lines: 100.00% (27/27) · Uncovered code |
Two failures showed up only on the internal CI flavours: 1. `test_distributed_stateless::test_simple_query` failed with `Unknown function __topKFilter` once `use_top_k_dynamic_filtering` became the default. The dynamic-filtering path injects an internal `__topKFilter` that is created on demand with a runtime threshold tracker and is not registered in `FunctionFactory`; the skip-index-on-data-read path likewise relies on a `TopKThresholdTracker` shared between `SortingStep` and `ReadFromMergeTree`. Neither survives serialization to remote workers, so when `make_distributed_plan = 1` the optimizer must skip the rewrite. Thread `make_distributed_plan` through `Optimization::ExtraSettings` and bail out early in `tryOptimizeTopK`. 2. `01926_order_by_desc_limit` timed out in the `amd_asan_ubsan, flaky check, s3 storage, meta in keeper` flavour: the flaky check runs the test many times in parallel and each run takes >240 s because of `INSERT` + `OPTIMIZE FINAL` over remote storage under sanitizers. The only modification from the previous revision is two `SET` lines that are no-ops for this query (sort column is the primary key with `optimize_read_in_order`, no skip index defined), so re-running it adds no signal. Add the `no-flaky-check` tag with a comment explaining why. CI: Upstream PR: ClickHouse#99537 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>






Summary
use_top_k_dynamic_filteringanduse_skip_indexes_for_top_ksettings by defaultChangelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Enable
use_top_k_dynamic_filteringanduse_skip_indexes_for_top_ksettings by default to improve performance ofORDER BY ... LIMIT Nqueries.Documentation entry for user-facing changes
No documentation changes needed — the settings and their documentation already exist; only the default values changed.
Version info
26.5.1.436