Add `SHUFFLE` clause support for `SELECT` by xuir268 · Pull Request #100305 · ClickHouse/ClickHouse · GitHub
Skip to content

Add SHUFFLE clause support for SELECT#100305

Closed
xuir268 wants to merge 31 commits into
ClickHouse:masterfrom
xuir268:shuffle-clause
Closed

Add SHUFFLE clause support for SELECT#100305
xuir268 wants to merge 31 commits into
ClickHouse:masterfrom
xuir268:shuffle-clause

Conversation

@xuir268

@xuir268 xuir268 commented Mar 21, 2026

Copy link
Copy Markdown

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Add the SHUFFLE clause for SELECT queries, including optimized SHUFFLE LIMIT execution, query plan support, tests, and documentation.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@CLAassistant

CLAassistant commented Mar 21, 2026

Copy link
Copy Markdown

@xuir268

xuir268 commented Mar 21, 2026

Copy link
Copy Markdown
Author

Hi @alexey-milovidov , if you have time, please take a look. if need anything more, please do let me know

@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Mar 21, 2026
@clickhouse-gh

clickhouse-gh Bot commented Mar 21, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [f473327]

@clickhouse-gh clickhouse-gh Bot added the pr-feature Pull request with new product feature label Mar 21, 2026
Comment thread src/Parsers/ParserSelectQuery.cpp
Comment thread src/Processors/Transforms/ShuffleTransform.cpp
@shankar-iyer shankar-iyer self-assigned this Mar 22, 2026
Switch `SHUFFLE LIMIT` to bounded reservoir sampling, add the experimental `allow_experimental_shuffle_query` setting, update docs/tests, and fix style issues from the CI report.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100305&sha=latest&name_0=PR&name_1=Style+check
Comment thread src/Parsers/ExpressionElementParsers.cpp
Remove the global alias restriction for `SHUFFLE` so disabled mode keeps legacy alias parsing, and update regression tests to use unambiguous clause forms.

PR comment: ClickHouse#100305
Comment thread src/Planner/Planner.cpp
Apply the  optimization only for positive limits, add  to , normalize , and add a stateless regression test for .

PR comment: ClickHouse#100305
Comment thread docs/en/sql-reference/statements/select/shuffle.md Outdated
Add  to the first runnable  example so it works under default settings.

PR comment: ClickHouse#100305
xuir268 added 2 commits March 22, 2026 18:51
Keep alias-behavior checks in  and move the expected  query into a dedicated error-only test so  does not fail the whole file with return code .

Fast test report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100305&sha=0095e031faec3babd928dbe81c9b6c436a116255&name_0=PR&name_1=Fast%20test&name_2=Tests
- Remove trailing whitespace on line 633 of `CommonParsers.h`
- Move `-- { serverError SUPPORT_IS_DISABLED }` inline with the semicolon in `04054_shuffle_disabled_error.sql` so the test runner correctly associates the expected error with the query

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100305&sha=12f6e74bde27f51c8223a68221c984c4581c3ce1&name_0=PR&name_1=Fast%20test
Comment thread src/Interpreters/TreeRewriter.cpp Outdated
When `allow_experimental_shuffle_query = 1` but `allow_experimental_analyzer = 0`,
`TreeRewriter` previously passed the guard and let the query proceed. The old planner
has no `ShuffleStep`, so `SHUFFLE` was silently ignored and non-randomized results
were returned — a correctness bug.

Fix: after the `SUPPORT_IS_DISABLED` gate, always throw `NOT_IMPLEMENTED` in the old
planner path, matching the pattern used for `LIMIT BY ALL`.

Add `04055_shuffle_old_interpreter` to cover the setting-enabled / old-planner case.
Comment thread docs/en/sql-reference/statements/select/shuffle.md Outdated
Comment thread docs/en/sql-reference/statements/select/shuffle.md Outdated
@xuir268

xuir268 commented Apr 22, 2026

Copy link
Copy Markdown
Author

Hi @alexey-milovidov @shankar-iyer is anything more is left to review

@rschu1ze

Copy link
Copy Markdown
Member

@xuir268 Expect delay please, people out on a business trip at the moment.

Comment thread docs/en/sql-reference/statements/select/shuffle.md Outdated
Comment thread docs/en/sql-reference/statements/select/shuffle.md Outdated
Comment thread docs/en/sql-reference/statements/select/shuffle.md Outdated
Comment thread docs/en/sql-reference/statements/select/shuffle.md Outdated
Comment thread docs/en/sql-reference/statements/select/shuffle.md Outdated
Comment thread docs/en/sql-reference/statements/select/shuffle.md Outdated
Explain `SHUFFLE` primarily as result row order randomization, make `AS SHUFFLE` alias usage explicit, and describe `SHUFFLE LIMIT n` as the random-subset form without overemphasizing sampling terminology.

PR comment: ClickHouse#100305
@xuir268 xuir268 requested a review from rschu1ze April 24, 2026 19:08
@xuir268

xuir268 commented May 2, 2026

Copy link
Copy Markdown
Author

Hi @alexey-milovidov @shankar-iyer is anything more is left to review

@shankar-iyer

Copy link
Copy Markdown
Member

Hi @alexey-milovidov @shankar-iyer is anything more is left to review

Hi @alexey-milovidov @shankar-iyer is anything more is left to review

I will catch up on this PR this week.

@xuir268

xuir268 commented May 4, 2026

Copy link
Copy Markdown
Author

Is anything more left @rschu1ze

@rschu1ze

rschu1ze commented May 4, 2026

Copy link
Copy Markdown
Member

Is anything more left @rschu1ze

#100535 (comment)

@rschu1ze

rschu1ze commented May 4, 2026

Copy link
Copy Markdown
Member

I really appreciate your work @xuir268 but I will need to close this PR for these reasons (sorry):

  • the functionality of result order shuffling can already be achieved today using ORDER BY rand() [LIMIT n] with different (in some cases faster, in some cases slower) performance characteristics,
  • in practice, result row shuffling is rarely used,
  • the SHUFFLE syntax is not SQL standard (unlike ORDER BY rand()) and also not supported by other databases as far as I see, i.e. it would be ClickHouse-only.

For this niche advantage, it seems not worth to add and maintain many hundreds LOC of code going forward.

@rschu1ze rschu1ze closed this May 4, 2026
@xuir268

xuir268 commented May 5, 2026

Copy link
Copy Markdown
Author

@rschu1ze @shankar-iyer , Thanks for the review and for clarifying the final decision.

I want to be direct about one process issue: this outcome is frustrating because SHUFFLE had been mentioned in the roadmap, and the earlier review discussion focused on implementation details, tests, benchmarks, and documentation rather than on whether the feature itself should exist at all. If the conclusion is that a ClickHouse-specific SHUFFLE clause is not worth the long-term maintenance cost because ORDER BY rand() [LIMIT n] already covers the user-visible functionality, that would have been important to state much earlier.

I have to be honest: I do not think the review process was handled well here, because the product-level objection appears only after substantial implementation, benchmarking, testing, and documentation work had already been done.

I would appreciate it if future feature reviews call out that kind of product-level objection as early as possible. That would save a lot of unnecessary implementation and review time on both sides.

If some of the execution-side work here is still useful independently, for example around improving large-n random-row selection behavior, I would also appreciate guidance on whether that should be pursued in a different form.

@rschu1ze

rschu1ze commented May 5, 2026

Copy link
Copy Markdown
Member

@xuir268 Thanks for the feedback, I understand the frustration. I could actually not find SHUFFLE on the 2024, the 2025 or the 2026 roadmaps. DId I miss something?

If the conclusion is that a ClickHouse-specific SHUFFLE clause is not worth the long-term maintenance cost because ORDER BY rand() [LIMIT n] already covers the user-visible functionality, that would have been important to state much earlier.

Fair enought and sorry again. This PR didn't have an associated linked ticket initially and I somehow missed its motivation and context in the beginning.

If some of the execution-side work here is still useful independently

I'll check back if this makes sense, thanks!

@xuir268

xuir268 commented May 5, 2026

Copy link
Copy Markdown
Author

@rschu1ze here it is #87836 and can you also assign some tickets so I can look into more other issues or from roadmap , this will be highly appreciated

@rschu1ze

rschu1ze commented May 5, 2026

Copy link
Copy Markdown
Member

#87836 is the list of intern topics. These are technically not part of the roadmap because of their experimental nature.

can you also assign some tickets so I can look into more other issues or from roadmap , this will be highly appreciated

Feel free to pick any open ticket, we are experimenting with AI and getting lots of automatic bug reports (e.g.clickgapai)

@alexey-milovidov

Copy link
Copy Markdown
Member

Shuffle is obviously more performant than ORDER BY rand() (a different algorithm), and we will use it to automatically optimize ORDER BY rand().

@xuir268

xuir268 commented May 11, 2026

Copy link
Copy Markdown
Author

@shankar-iyer you can review the MR

@shankar-iyer

Copy link
Copy Markdown
Member

Shuffle is obviously more performant than ORDER BY rand() (a different algorithm), and we will use it to automatically optimize ORDER BY rand().

@shankar-iyer you can review the MR

My concern is still the memory required for a large LIMIT. I am compiling the PR and will give it a spin.

Comment thread src/Planner/Planner.cpp Outdated
@shankar-iyer

shankar-iyer commented May 12, 2026

Copy link
Copy Markdown
Member

I took the PR on my VM (48 CPU) and ran on a not too big dataset (100M rows).

select * from test_100m order by rand() limit 10 format null;
0 rows in set. Elapsed: 0.040 sec. Processed 100.00 million rows, 1.00 GB (2.53 billion rows/s., 25.29 GB/s.)
Peak memory usage: 2.29 MiB.


select * from test_100m shuffle limit 10 format null;
0 rows in set. Elapsed: 183.244 sec. Processed 100.00 million rows, 6.90 GB (545.72 thousand rows/s., 37.65 MB/s.)
Peak memory usage: 182.40 KiB.


select * from test_100m order by rand() limit 1000 format null
0 rows in set. Elapsed: 0.074 sec. Processed 100.06 million rows, 1.00 GB (1.35 billion rows/s., 13.54 GB/s.)
Peak memory usage: 7.04 MiB.

select * from test_100m shuffle limit 1000 format null;
0 rows in set. Elapsed: 4.518 sec. Processed 100.00 million rows, 7.00 GB (22.13 million rows/s., 1.55 GB/s.)
Peak memory usage: 254.91 KiB.


select * from test_100m order by rand() limit 1000000 format null
0 rows in set. Elapsed: 1.070 sec. Processed 147.42 million rows, 4.32 GB (137.74 million rows/s., 4.03 GB/s.)
Peak memory usage: 1.48 GiB.

 select * from test_100m shuffle limit 1000000 format null;
0 rows in set. Elapsed: 2.457 sec. Processed 100.00 million rows, 7.00 GB (40.70 million rows/s., 2.85 GB/s.)
Peak memory usage: 10.23 GiB.


select * from test_100m order by rand() limit 10000000 format null
0 rows in set. Elapsed: 4.663 sec. Processed 199.88 million rows, 7.99 GB (42.86 million rows/s., 1.71 GB/s.)
Peak memory usage: 2.66 GiB.

select * from test_100m shuffle limit 10000000 format null;
0 rows in set. Elapsed: 7.790 sec. Processed 100.00 million rows, 7.00 GB (12.84 million rows/s., 898.54 MB/s.)
Peak memory usage: 15.38 GiB.


select * from test_100m order by rand() limit 80000000 format null
0 rows in set. Elapsed: 34.417 sec. Processed 200.00 million rows, 8.00 GB (5.81 million rows/s., 232.44 MB/s.)
Peak memory usage: 13.08 GiB.

select * from test_100m shuffle limit 80000000 format null;
0 rows in set. Elapsed: 23.199 sec. Processed 100.00 million rows, 7.00 GB (4.31 million rows/s., 301.73 MB/s.)
Peak memory usage: 44.08 GiB.

At 80M limit, lazy materialization performs too many scattered reads and maybe also duplicate granule reads and hence starts slowing down. But note that SHUFFLE's memory usage is a function of the LIMIT + number of columns - essentially ruling out large dataset sampling.

There is a simple optimization possible in LazyMaterializingTransform - we don't really need to sort the final chunk of 10M or 80M rand() values. The purpose of the feature is to return a random sample of 80M rows from the dataset and that is achieved. That change can reduce the scattered reads (we just sort by _part_offset)

select * from test_100m order by rand() limit 10000000 format null settings query_plan_lazy_materialization_unordered_output = 1
0 rows in set. Elapsed: 3.264 sec. Processed 199.88 million rows, 7.99 GB (61.23 million rows/s., 2.45 GB/s.)
Peak memory usage: 2.57 GiB.

select * from test_100m order by rand() limit 80000000 format null settings query_plan_lazy_materialization_unordered_output = 1
0 rows in set. Elapsed: 18.799 sec. Processed 200.00 million rows, 8.00 GB (10.64 million rows/s., 425.55 MB/s.)
Peak memory usage: 12.10 GiB.

Can you reiterate again when the SHUFFLE implementation in this PR would perform better? Structurally, I feel difficult for a single pass algorithm to perform allround like 2-pass lazy materialization.

@xuir268

xuir268 commented May 18, 2026

Copy link
Copy Markdown
Author

@shankar-iyer I updated the SHUFFLE LIMIT implementation to use the lazy materialization path instead of the row-reservoir ShuffleStep.

Implementation-wise, SHUFFLE LIMIT n is planned through an ORDER BY rand() LIMIT n sampling path, but the SortingStep is marked as allowing unordered output. Then LazyMaterializingTransform sorts the selected row ids by _part_offset for the lazy read and skips restoring the final random-value order. So the returned rows are still a random sample, but SHUFFLE LIMIT n does not guarantee random output order. Plain SHUFFLE without LIMIT still uses ShuffleStep.

Local 10M-row MergeTree check, 3 selected columns, max_threads = 16, FORMAT Null:

query elapsed peak memory
ORDER BY rand() LIMIT 1000 0.311s 6.50 MiB
SHUFFLE LIMIT 1000 0.280s 969.60 KiB
ORDER BY rand() LIMIT 100000 0.409s 103.34 MiB
SHUFFLE LIMIT 100000 0.312s 107.86 MiB
ORDER BY rand() LIMIT 1000000 0.550s 235.92 MiB
SHUFFLE LIMIT 1000000 0.533s 242.94 MiB
ORDER BY rand() LIMIT 8000000 2.005s 809.41 MiB
SHUFFLE LIMIT 8000000 1.936s 799.29 MiB

So with the updated semantics, SHUFFLE LIMIT is useful when the user wants a random sample and does not require the sample itself to be returned in random order. If the final random ordering is required, then ORDER BY rand() LIMIT n is still the correct semantic match.

@shankar-iyer

Copy link
Copy Markdown
Member

@xuir268 Checking..

@shankar-iyer

shankar-iyer commented May 19, 2026

Copy link
Copy Markdown
Member

@shankar-iyer I updated the SHUFFLE LIMIT implementation to use the lazy materialization path instead of the row-reservoir ShuffleStep.

Implementation-wise, SHUFFLE LIMIT n is planned through an ORDER BY rand() LIMIT n sampling path, but the SortingStep is marked as allowing unordered output. Then LazyMaterializingTransform sorts the selected row ids by _part_offset for the lazy read and skips restoring the final random-value order. So the returned rows are still a random sample, but SHUFFLE LIMIT n does not guarantee random output order. Plain SHUFFLE without LIMIT still uses ShuffleStep.

Local 10M-row MergeTree check, 3 selected columns, max_threads = 16, FORMAT Null:

query elapsed peak memory
ORDER BY rand() LIMIT 1000 0.311s 6.50 MiB
SHUFFLE LIMIT 1000 0.280s 969.60 KiB
ORDER BY rand() LIMIT 100000 0.409s 103.34 MiB
SHUFFLE LIMIT 100000 0.312s 107.86 MiB
ORDER BY rand() LIMIT 1000000 0.550s 235.92 MiB
SHUFFLE LIMIT 1000000 0.533s 242.94 MiB
ORDER BY rand() LIMIT 8000000 2.005s 809.41 MiB
SHUFFLE LIMIT 8000000 1.936s 799.29 MiB
So with the updated semantics, SHUFFLE LIMIT is useful when the user wants a random sample and does not require the sample itself to be returned in random order. If the final random ordering is required, then ORDER BY rand() LIMIT n is still the correct semantic match.

Thanks for working through the iterations on this — the design exploration has been useful and our last 2 conversations have made the actual win clear, and I think the scope should shrink to match.

Looking at the current state: for SHUFFLE LIMIT n, the planner now rewrites the tree to ORDER BY rand() LIMIT n and the heavy lifting (heap-based top-n selection) is done entirely by the existing sorting infrastructure.

So the real optimization the work has surfaced is narrow but genuine: letting the final MergingSortedTransform skip the last sort plus the planner pass that detects ORDER BY rand() LIMIT n and sets it. That gives essentially all the value with none of the other surface area.

Given how many revisions we've gone through here, I think the cleanest path forward is to close this PR and open a focused replacement PR that does just:

  1. Pattern-match ORDER BY rand() LIMIT n in the planner.
  2. Set SortingStep::allow_unordered_output for that case, so the final merge can skip work (expensive for large LIMITs)
  3. [Needs research] The current placement of SHUFFLE in the SQL grammer conflicts with table alias and is not an option. You could explore tying the SHUFFLE with LIMIT -> SELECT * FROM <...> WHERE <...> LIMIT n SHUFFLE. This would mean dropping support for SHUFFLE without LIMIT - I believe that is fine. The planner can rewrite SHUFFLE LIMIT n to ORDER BY rand() LIMIT n.

@xuir268

xuir268 commented May 19, 2026

Copy link
Copy Markdown
Author

@shankar-iyer okay , opening dedicating MR for this focused replacement PR that does just:

Pattern-match ORDER BY rand() LIMIT n in the planner.
Set SortingStep::allow_unordered_output for that case, so the final merge can skip work (expensive for large LIMITs)
[Needs research] The current placement of SHUFFLE in the SQL grammer conflicts with table alias and is not an option. You could explore tying the SHUFFLE with LIMIT -> SELECT * FROM <...> WHERE <...> LIMIT n SHUFFLE. This would mean dropping support for SHUFFLE without LIMIT - I believe that is fine. The planner can rewrite SHUFFLE LIMIT n to ORDER BY rand() LIMIT n.

@xuir268

xuir268 commented May 24, 2026

Copy link
Copy Markdown
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors manual approve Manual approve required to run CI pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants