Speed up `DISTINCT` in order, sort-merge joins, `LIMIT BY` and negative `LIMIT BY` by nihalzp · Pull Request #106502 · ClickHouse/ClickHouse · GitHub
Skip to content

Speed up DISTINCT in order, sort-merge joins, LIMIT BY and negative LIMIT BY#106502

Merged
nihalzp merged 47 commits into
ClickHouse:masterfrom
nihalzp:optimize-equal-run-end-search
Jun 25, 2026
Merged

Speed up DISTINCT in order, sort-merge joins, LIMIT BY and negative LIMIT BY#106502
nihalzp merged 47 commits into
ClickHouse:masterfrom
nihalzp:optimize-equal-run-end-search

Conversation

@nihalzp

@nihalzp nihalzp commented Jun 4, 2026

Copy link
Copy Markdown
Member

The PR tries to optimize the scenario when we have a sorted IColumn and we want to find the end row number of equal key in the column. In many places, we do normal linear scan and in some places, we do binary search but have duplicate implementation. This PR unifies this utility it as a IColumn method and allows a single optimized binary search implementation for all use cases.

image image image image

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Speed up operators that scan a sorted stream for runs of equal key values — DISTINCT in order, LIMIT BY in order, negative LIMIT BY in order, full_sorting_merge and partial_merge joins.

@clickhouse-gh

clickhouse-gh Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-performance Pull request with some performance improvements label Jun 4, 2026
@clickhouse-gh

clickhouse-gh Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

📊 Cloud Performance Report

✅ AI verdict: no_change — no significant changes across 38 queries analysed

This PR adds a galloping plus binary-search fast path for finding equal-value runs in sorted columns (getEqualRangeEndAssumeSorted), wired only into sort-merge join equal-length detection and the LIMIT BY / negative-LIMIT-BY sorted-stream transforms. ClickBench queries use neither merge joins nor LIMIT BY, so the flagged improvements on Q4, Q28, and Q32 cannot be attributed to this change and read as run-to-run variance across the two builds; they have been downgraded. Notably the tpch_adapted_1_official suite, which is where join and ordering changes would actually surface, showed no per-query shifts, reinforcing that the ClickBench deltas are noise rather than a real effect of this PR.

clickbench

⚠️ 3 inconclusive

Flagged queries (3 of 43)
Query Verdict Baseline median (ms) PR median (ms) Change q-value Hint
⚠️ 4 not_sure 261 229 -12.3% <0.0001 ClickBench has no LIMIT BY and no merge joins; this PR's equal-range fast path isn't exercised. Off-path, run-to-run var
⚠️ 28 not_sure 6973 6613 -5.2% 0.0006 Off-path: PR only touches sort-merge join and LIMIT BY equal-range scanning, which this ClickBench query doesn't run. Va
⚠️ 32 not_sure 1343 1269 -5.5% <0.0001 Already not_sure; noisy (high variance, the two tests disagree). PR's merge-join/LIMIT BY change is off-path for ClickBe

q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on.

tpch_adapted_1_official

🟢 No significant changes

Debug info
  • StressHouse run: 22fc6d18-9782-4df6-9a06-0ef9338b2c8a
  • MIRAI run: b802e90d-8b37-474a-899d-dee612136e7e
  • PR check IDs:
    • clickbench_39534_1782392618
    • clickbench_39548_1782392618
    • clickbench_39554_1782392618
    • tpch_adapted_1_official_39561_1782392618
    • tpch_adapted_1_official_39572_1782392618
    • tpch_adapted_1_official_39582_1782392618

@nihalzp nihalzp changed the title Speed up DISTINCT in order, sort-merge joins, ORDER BY ... WITH FILL and LIMIT BY Speed up DISTINCT in order, sort-merge joins, ORDER BY ... WITH FILL, LIMIT BY and negative LIMIT BY Jun 5, 2026
Comment thread tests/performance/full_sorting_merge_join.xml
@nihalzp nihalzp mentioned this pull request Jun 5, 2026
1 task
Comment thread src/Processors/Transforms/NegativeLimitByTransform.cpp Outdated
Comment thread src/Columns/ColumnLowCardinality.cpp
Comment thread src/Columns/ColumnUnique.h Outdated
@nihalzp nihalzp requested a review from Algunenano June 22, 2026 09:07

@Algunenano Algunenano left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nihalzp It seems #106502 (comment) and #106502 (comment) are still unaddressed.

You mentioned Okay, I will keep it safe and not apply it for Float LC. but didn't change it AFAICT.

ColumnLowCardinality::getEqualRangeEndAssumeSorted would need to deal with floats, and nullable(float). And after that uniqueInsertCanonicalFloat is unused and should be removed

@nihalzp

nihalzp commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

You mentioned Okay, I will keep it safe and not apply it for Float LC. but didn't change it AFAICT.

I requested review before I made that comment and have not gotten chance to work on it yet :'. I will work on it soon.

@clickhouse-gh

clickhouse-gh Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.30% 85.30% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.50% 77.50% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 482/490 (98.37%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 5 line(s) · Uncovered code

Full report · Diff report

@nihalzp nihalzp added this pull request to the merge queue Jun 25, 2026
Merged via the queue into ClickHouse:master with commit 9f7b790 Jun 25, 2026
328 of 330 checks passed
@nihalzp nihalzp deleted the optimize-equal-run-end-search branch June 25, 2026 16:16
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-performance Pull request with some performance improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants