iframe-proxy

alexey-milovidov · 2026-05-01T07:49:25Z

The auto-spilling join (#97813) could still hit MEMORY_LIMIT_EXCEEDED even when max_bytes_before_external_join was set. With a stats-warm cache, the user's transcript

SET max_memory_usage = '10G';
SET max_bytes_before_external_join = '5G';
SELECT count() FROM
  (SELECT number AS id, randomString(100) AS s FROM numbers(200000000)) AS a
  JOIN (SELECT number AS id FROM numbers(200000000)) AS b USING (id);

failed at peak 12.5 GiB while attempting an 8 GiB chunk allocation, even though the threshold was supposed to cap memory well below 10 GiB.

Three independent issues conspired:

ConcurrentHashJoin::reserveSpaceInHashMaps preallocated capacity for hint->ht_size rows (e.g. 200M from a previous run) up front, blowing past the threshold before SpillingHashJoin could check it. Now we cap entries by max_bytes_before_external_join / (8 * cell_size) so the preallocated buffer stays under half the threshold while still keeping the optimization for queries that fit.
SpillingHashJoin::addBlockToJoin checked the threshold after the inner addBlockToJoin returned. Hash tables grow in power-of-two steps and a doubling from X to 2X transiently holds 3X, so the inner call could OOM during the resize before the post-call check ran. The check now runs proactively before the inner call, and uses half the threshold so the next doubling cannot push past the cap.
GraceHashJoin::hasMemoryOverflow only spilled on max_rows_in_join/max_bytes_in_join. When the wrapper handed data over, the in-memory bucket grew unbounded and the wrapper's spill decision became meaningless. It now also rehashes when total_bytes reaches half of max_bytes_before_external_join, gated by a 64 MiB minimum so unrealistically tiny test thresholds don't blow grace_hash_join_max_buckets.

Additionally, after rehashing, the new in-memory join was created with reserve_num = prev_keys_num, which itself allocated a power-of-two buffer for the pre-rehash size — another multi-gigabyte chunk. Reserve is now prev_keys_num / buckets_snapshot.size(), matching the rows that actually stay in this bucket.

The user's transcript: https://pastila.nl/?000b64fe/c41bd280359481e9fdf4c2b285b31943

Verification (peak observed via clickhouse-client --memory-usage=default):

Existing 03915_spilling_hash_join and 04061_spilling_hash_join_overflow_limits still pass against the modified server.

Changelog category (leave one):

Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Keep the auto-spilling hash join's actual memory peak under max_bytes_before_external_join. Previously, statistics-driven preallocation, in-place hash table doubling, and unbounded GraceHashJoin in-memory buckets could each push the query past the configured cap and trip MEMORY_LIMIT_EXCEEDED.

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

Version info

Merged into: 26.5.1.274
Backported to: 26.4.1.1136

The auto-spilling join could still hit `MEMORY_LIMIT_EXCEEDED` even when `max_bytes_before_external_join` was set. With a stats-warm cache, e.g. running the user's transcript: SET max_memory_usage = '10G'; SET max_bytes_before_external_join = '5G'; SELECT count() FROM (SELECT number AS id, randomString(100) AS s FROM numbers(200000000)) AS a JOIN (SELECT number AS id FROM numbers(200000000)) AS b USING (id); failed at peak `12.5 GiB` while attempting an `8 GiB` chunk allocation, even though the threshold was supposed to cap it well below `10 GiB`. Three independent issues conspired: 1. `ConcurrentHashJoin::reserveSpaceInHashMaps` preallocated capacity for `hint->ht_size` rows (e.g. 200M from a previous run) up front, blowing past the threshold before `SpillingHashJoin` could check it. Now we cap the entries by `max_bytes_before_external_join / (8 * cell_size)` so the preallocated buffer stays under half the threshold while still keeping the optimization for queries that fit. 2. `SpillingHashJoin::addBlockToJoin` checked the threshold after the inner `addBlockToJoin` returned. Hash tables grow in power-of-two steps and a doubling from X to 2X transiently holds 3X, so the inner call could OOM during the resize before the post-call check ever ran. The check now runs proactively before the inner call, and uses half the threshold so the next doubling cannot push past the cap. 3. `GraceHashJoin::hasMemoryOverflow` only spilled on `max_rows_in_join` / `max_bytes_in_join`. When the wrapper handed data over, the in-memory bucket grew unbounded and the wrapper's spill decision became meaningless. It now also rehashes when `total_bytes` reaches half of `max_bytes_before_external_join`, gated by a 64 MiB minimum so unrealistically tiny test thresholds don't blow `grace_hash_join_max_buckets`. Additionally, after rehashing, the new in-memory join was created with `reserve_num = prev_keys_num`, which allocated a power-of-two buffer for the pre-rehash size - itself another multi-gigabyte chunk. Reserve is now `prev_keys_num / buckets_snapshot.size()`, matching the rows that actually stay in this bucket after the rehash. With all four together the user's query completes (200M result, peak `6.5 GiB`, well under the `10 GiB` cap). Existing `03915_spilling_hash_join` and `04061_spilling_hash_join_overflow_limits` still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-05-01T07:50:00Z

Workflow [PR], commit [0d8e9ca]

Summary: ✅

AI Review

Summary

The PR fixes real memory-threshold enforcement gaps in the SpillingHashJoin → GraceHashJoin path (preallocation cap, proactive spill checks, terminal-block handling, and wrapper-only threshold wiring), and adds targeted regressions. I did not find additional blocker/major issues beyond already-raised inline review points.

ClickHouse Rules

Item	Status	Notes
Deletion logging	➖
Serialization versioning	➖
Core-area scrutiny	✅
No test removal	✅
Experimental gate	➖
No magic constants	✅
Backward compatibility	✅
`SettingsChangesHistory.cpp`	➖
PR metadata quality	✅
Safe rollout	✅
Compilation time	✅
No large/binary files	✅

Final Verdict

Status: ✅ Approve

alexey-milovidov

The code looks good to me.

`rehashBuckets` doubles the bucket count from `N` to `2N`, so the rows in the current bucket are split between bucket `i` and bucket `i + N` under the new modulus, with about half staying in this bucket. Reserve `prev_keys_num / 2` rather than `prev_keys_num / buckets_snapshot.size()`, which underestimated by a factor of `N` (e.g. with 128 buckets it reserved ~1/128 instead of ~1/2) and would force repeated re-grow/rehash cycles in the inner `HashJoin`, regressing the build path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds `04123_spilling_hash_join_peak_under_cap`. The test runs an INNER JOIN whose right-side build payload (~140 MiB of random strings) is well above `max_bytes_before_external_join`, with a strict `max_memory_usage` that previously could have been exceeded by the combination of statistics-driven preallocation, the SpillingHashJoin threshold check running after the inner doubling, and the GraceHashJoin not yet enforcing `max_bytes_before_external_join`. The query must now complete on both the single-thread `hash` and the concurrent `parallel_hash` paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-05-01T19:13:18Z

+            return chosen_join->addBlockToJoin(block, check_limits);

-        return true;
+        return concurrent_join->addBlockToJoin(block, check_limits);


Moving the spill check to pre-insert avoids the resize-time OOM, but removing the post-insert check introduces a missed-switch case: if the last build block takes memory from below threshold / 2 to above it, SpillingHashJoin never switches because there is no next addBlockToJoin call.

Concrete trace: with max_bytes_before_external_join = 80 MiB, suppose getTotalByteCount() is 39 MiB before the final block. Pre-check passes (39 * 2 < 80). After inserting the final block, usage can be 60+ MiB (or even above 80 MiB), but we return immediately and onBuildPhaseFinish promotes the in-memory join without re-checking the threshold.

Please keep the pre-check and add a post-insert re-check (or an equivalent check in onBuildPhaseFinish) so terminal blocks cannot bypass spilling.

The earlier scale (1M rows × 140-byte strings, `max_memory_usage = 200Mi`, `max_bytes_before_external_join = 80Mi`) did not actually trip `MEMORY_LIMIT_EXCEEDED` on master: peak stayed around 175 MiB and the hash table never reached the spill threshold, so the test passed on the unfixed code as well as the fixed one. Bugfix validation flagged this as "Failed to reproduce the bug": https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=103838&sha=1c320f5a08784eb2c657ac12282d0c94268f3f89&name_0=PR&name_1=Bugfix%20validation%20%28functional%20tests%29 PR: #103838 The new scale (3M integer keys, `max_memory_usage = 256Mi`, `max_bytes_before_external_join = 64Mi`) hits the actual doubling pathology: on master the build-side hash table doubles past the threshold and tries to allocate a `~515 MiB` chunk, exceeding `max_memory_usage`, while on the fixed code `SpillingHashJoin` switches to `GraceHashJoin` before the doubling and the peak stays around `~150 MiB`. Verified locally against the master binary at the merge base (`b7507c4b7b3`) and the PR binary for SHA `1c320f5`: master, hash, 3M: fails with `659.49 MiB` peak master, parallel_hash, 3M: fails with `663.66 MiB` peak fixed, hash, 3M: passes with `149.11 MiB` peak fixed, parallel_hash, 3M: passes with `156.12 MiB` peak Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The proactive pre-insert check in `SpillingHashJoin::addBlockToJoin` only fires on a *subsequent* call. If the very last build block crosses `max_bytes_before_external_join` (a single huge block, or a doubling that takes the table from below `threshold/2` to above `threshold`), the post-insert state is over the cap and `onBuildPhaseFinish` promotes the in-memory join as-is instead of switching to `GraceHashJoin`. Add a final check in `onBuildPhaseFinish`: if the build-phase total already exceeds the threshold, switch to `GraceHashJoin` instead of promoting `HashJoin` / `ConcurrentHashJoin`. This restores the "`max_bytes_before_external_join` decides whether to spill" contract for the terminal-block edge case. Raised in PR review: #103838 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-05-02T05:36:48Z

+    {
+        static constexpr size_t MIN_THRESHOLD_FOR_AUTOMATIC_SPILL = 64 * 1024 * 1024;
+        const auto external_join_threshold = table_join->maxBytesBeforeExternalJoin();
+        if (external_join_threshold >= MIN_THRESHOLD_FOR_AUTOMATIC_SPILL && total_bytes * 2 >= external_join_threshold)


This introduces a hard cutoff at 64 MiB, so max_bytes_before_external_join is no longer enforced for any lower value.

Concrete case: with max_bytes_before_external_join = 32 MiB, SpillingHashJoin can switch to GraceHashJoin, but this check stays disabled (32 < 64), so the in-memory bucket can keep growing past 32 MiB until some other limit is hit. That breaks the setting contract and reopens the same memory-peak issue for low-but-valid thresholds.

Please remove the fixed 64 MiB gate and apply the threshold check for any positive max_bytes_before_external_join (or make the lower bound configurable as a setting if we really need one).

The previous 64 MiB hardcoded floor on the `max_bytes_before_external_join` check inside `GraceHashJoin::hasMemoryOverflow` silently disabled the cap for any lower value. A user setting `max_bytes_before_external_join = 32 MiB` would still see in-memory buckets grow well past the configured limit, breaking the setting contract. The floor exists because the rehash only halves a bucket; for total data `D` and threshold `T` we need ~`log2(D / T)` rehashes per bucket (capped by `grace_hash_join_max_buckets`). With pathologically small thresholds (e.g. 1 KiB in `03915_spilling_hash_join`) the per-row hash cell overhead (~32 bytes) alone exceeds `T/2` before any rehash can help, forcing one rehash per added block and quickly exhausting the bucket cap. 1 MiB is small enough to enforce the contract for any practical configuration (the AI-flagged 32 MiB case is now respected) while still leaving low test thresholds unaffected. Existing tests `03915_spilling_hash_join`, `04061_spilling_hash_join_overflow_limits`, and `04123_spilling_hash_join_peak_under_cap` continue to pass. Add `04146_spilling_hash_join_low_threshold` as a regression for a threshold (`16 MiB`) below the previous floor: with `max_memory_usage = 160 MiB`, the unfixed code peaked at 167 MiB during the doubling step, while the fixed code stays around 131 MiB. Raised in PR review: #103838 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`04146_spilling_hash_join_low_threshold` was failing with `MEMORY_LIMIT_EXCEEDED` during the build phase even after the `SpillingHashJoin` pre-insert check landed. The same problem the `SpillingHashJoin` fix solved for the wrapper still existed one level down: `GraceHashJoin::addBlockToJoinImpl` checked `hasMemoryOverflow` *after* `hash_join->addBlockToJoin` returned, so the in-memory hash table inside the bucket could double its buffer past the cap before the rehash decision ran. Move the check before the inner `addBlockToJoin` using projected `(rows, allocatedBytes)`. When the projection already triggers `hasMemoryOverflow`, skip the inner add and include the un-added block in the rescatter pass so the data is preserved and redistributed to the new bucket layout. CI: #103838

The CI runs with random per-test settings injection. The test caps memory at `max_memory_usage = 160Mi` to make the absence of the `64 MiB` floor in `GraceHashJoin::hasMemoryOverflow` visible: with the fix in place, the build side peaks at around `131 MiB`, leaving a `~29 MiB` margin. That margin is too narrow to absorb the memory overhead added by random settings such as `filesystem_prefetch_max_memory_usage 32Mi`, `temporary_files_buffer_size`, and friends, so the cap was being tripped during the build phase even with the fix applied. The test failed in CI: #103838 with `MEMORY_LIMIT_EXCEEDED` at peak `~160 MiB` across multiple shards. Add `no-random-settings, no-random-merge-tree-settings` so the regression remains deterministic. The setting tag is strictly necessary here because the very setting under test (`max_bytes_before_external_join`) is among those randomized, and the tight memory cap is intentional.

clickhouse-gh · 2026-05-04T03:38:31Z

+        /// will actually hold (which adds cell overhead and load-factor padding), but it is
+        /// enough for the pre-check to fire one rehash earlier than the post-check would.
+        const size_t projected_keys = prev_keys_num + current_block.rows();
+        const size_t projected_bytes = pre_total_bytes + current_block.allocatedBytes();


projected_bytes = pre_total_bytes + current_block.allocatedBytes() is only a lower bound and can miss the same resize-time OOM pattern this change is trying to prevent.

Concrete trace with max_bytes_before_external_join = 80 MiB:

Before insert: pre_total_bytes = 39 MiB (39 * 2 < 80), bucket is already near maxFill.

Incoming current_block is tiny (e.g. one row), so projected_bytes stays < 40 MiB, and the pre-check does not rehash.

hash_join->addBlockToJoin crosses maxFill, triggers X -> 2X growth, and can hit the transient ~3X peak inside the allocator before the post-insert check runs.

So a tiny block can still bypass the pre-check and fail at allocation time. Please base the pre-check on current hash-table occupancy/capacity growth risk (for example, a getTotalByteCount() * 2 >= threshold style check, as done in SpillingHashJoin) instead of current_block.allocatedBytes().

The previous pre-check used `pre_total_bytes + current_block.allocatedBytes()` fed into `hasMemoryOverflow`, which has two problems: 1. `block.allocatedBytes()` is a misleading lower bound on the in-memory hash table cost: cell overhead, load-factor padding, and the resize-time peak are not represented. A tiny block can leave a near-full bucket, with the projected sum still under the cap, then `addBlockToJoin` triggers an X→2X buffer doubling and OOMs in the allocator before the post-add check runs. 2. `hasMemoryOverflow` also performs `softCheck` against `max_bytes_in_join`, so the projected check fired against very small soft limits used in tests (e.g. `max_bytes_in_join = '10K'` in `02274_full_sort_join_nodistinct`). Combined with `block.allocatedBytes()` rounding, the pre-check rehashed on every block and quickly hit `grace_hash_join_max_buckets`. The pre-check now mirrors `SpillingHashJoin::addBlockToJoin`: rehash when the existing bucket alone is already past half of `max_bytes_before_external_join`, gated by the same 1 MiB floor (`MIN_THRESHOLD_FOR_AUTOMATIC_SPILL`, lifted to file scope so both call sites share it). Soft limits stay enforced by the post-insert call to `hasMemoryOverflow`, so behaviour is unchanged when `max_bytes_before_external_join` is unset. CI report (failure on `02274_full_sort_join_nodistinct`): https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=103838&sha=0025061ddbeaca0d44371f337595d27e86064c17&name_0=PR&name_1=Stateless%20tests%20%28amd_debug%2C%20distributed%20plan%2C%20s3%20storage%2C%20parallel%29 PR: #103838 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-05-04T07:09:24Z

+    /// buffer while the old one is still alive, transiently using 3X memory. We must trigger the
+    /// switch BEFORE the inner `addBlockToJoin` runs (and possibly doubles the buffer); a check
+    /// that runs after the call would race with the doubling and observe the OOM only as an
+    /// allocator exception. Threshold is half of `max_bytes_before_external_join` so that after


This proactive pre-insert threshold check changes the semantics from "we already exceeded memory" to "we are approaching the spill threshold".

But switchToGraceHashJoin still logs Memory limit exceeded ... switching to GraceHashJoin.

That message is now misleading in normal (non-exceptional) switching paths and can send incident triage in the wrong direction. Please update the log text to something like Memory spill threshold reached.

antaljanosbenjamin · 2026-05-04T09:24:38Z

+/// any rehash can help, forcing one rehash per added block and quickly hitting the bucket
+/// cap. 1 MiB enforces the contract for any practical configuration while still leaving low
+/// test thresholds (e.g. 1 KiB in `03915_spilling_hash_join`) unaffected.
+static constexpr size_t MIN_THRESHOLD_FOR_AUTOMATIC_SPILL = 1024 * 1024;


I don't like the fact that we are adding this artificial limit just to keep the tests working. Basically the tests now test a different behavior then in real world use cases. I would much rather prefer changing those tests.

antaljanosbenjamin · 2026-05-04T09:38:58Z

+        /// `SpillingHashJoin` ever runs its threshold check. We still want preallocation - just bounded by
+        /// the memory budget. We aim for about half of the threshold so that the cap itself plus the live
+        /// data plus the conversion peak still fit under it.
+        const auto external_join_threshold = hash_join.getTableJoin().maxBytesBeforeExternalJoin();


I think it would be better if we would separate max_bytes_before_external_join and maximum size that concurrent hash join can reserve. E.g.: join_algorithm = 'parallel_hash', max_bytes_before_external_join = 10MB will cause concurrent hash performance to plummet. Let's make max reserve size as a parameter of ConcurrentHash join.

antaljanosbenjamin · 2026-05-04T10:07:59Z

+    /// on `MIN_THRESHOLD_FOR_AUTOMATIC_SPILL` for why a small floor is required.
+    if (!has_overflow)
+    {
+        const auto external_join_threshold = table_join->maxBytesBeforeExternalJoin();


Same here as for concurrent hash join: join_algorithm = 'grace_hash', max_bytes_before_external_join = 10MB affects grace hash join when it shouldn't.

The pre-insert threshold check in `addBlockToJoin` switches before any memory limit is actually exceeded, so the previous "Memory limit exceeded ... switching to GraceHashJoin" log line was misleading for normal threshold-triggered switches and could mislead incident triage. Reword to "Memory spill threshold reached ..." to reflect the new semantics. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-05-04T17:41:27Z

+    /// rehashing buckets early prevents that doubling from exceeding the cap. See the comment
+    /// on `MIN_THRESHOLD_FOR_AUTOMATIC_SPILL` for why a small floor is required.
+    if (!has_overflow)
+    {


max_bytes_before_external_join is documented and serialized as applying to hash/parallel_hash auto-conversion paths, not to explicit join_algorithm = 'grace_hash' (src/Core/Settings.cpp and src/Processors/QueryPlan/QueryPlanSerializationSettings.cpp).

This new check runs inside GraceHashJoin unconditionally, so users who intentionally picked grace_hash now get different spill behavior whenever max_bytes_before_external_join is set (including inherited profile settings). That is a behavior-contract break and can cause unexpected extra spilling/perf regressions in existing deployments.

Please gate this branch to the wrapper path only (e.g. only when GraceHashJoin is used as a spill target from SpillingHashJoin) instead of applying it to standalone grace_hash.

`max_bytes_before_external_join` was being read directly inside `ConcurrentHashJoin::reserveSpaceInHashMaps` and `GraceHashJoin::hasMemoryOverflow`, which meant the new cap also fired for users who explicitly picked `join_algorithm = 'parallel_hash'` or `'grace_hash'` standalone. The setting is documented as governing the auto-conversion path (`join_algorithm = 'auto'` or `'hash'` / `'parallel_hash'` with the threshold set), so applying it inside those joins unconditionally was a behavior-contract break - a `parallel_hash` build with a low `max_bytes_before_external_join` would lose its statistics- driven preallocation, and a standalone `grace_hash` would change spill timing. Pass the threshold explicitly via a new constructor parameter `external_join_threshold_` on both joins, set only when `SpillingHashJoin` constructs them. When 0 (standalone use), the new caps are skipped and the prior behavior is preserved. Also remove `MIN_THRESHOLD_FOR_AUTOMATIC_SPILL`. The 1 MiB floor existed solely so that the unconditional check would not break tests with very low thresholds; now that the check is wrapper-only, the floor is no longer needed. Update `03915_spilling_hash_join` to use a realistic 100 KiB threshold for the single-thread spill block (the old `1000`-byte value is below any reasonable production setting and kept the suite passing only because of the floor); the spilled queries now also produce delayed blocks, matching the concurrent-path tests.

clickhouse-gh · 2026-05-05T06:52:01Z

LLVM Coverage Report

Changed lines: 98.71% (153/155) | lost baseline coverage: 1 line(s) · Uncovered code

Full report · Diff report

Cherry pick #103838 to 26.4: Keep `max_bytes_before_external_join` peak under the configured cap

…under the configured cap

Backport #103838 to 26.4: Keep `max_bytes_before_external_join` peak under the configured cap

clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label May 1, 2026

alexey-milovidov requested a review from antaljanosbenjamin May 1, 2026 07:52

alexey-milovidov added the v26.4-must-backport label May 1, 2026

alexey-milovidov commented May 1, 2026

View reviewed changes

clickhouse-gh Bot reviewed May 1, 2026

View reviewed changes

Comment thread src/Interpreters/GraceHashJoin.cpp Outdated

alexey-milovidov and others added 3 commits May 1, 2026 18:59

Merge remote-tracking branch 'origin/master' into fix-spilling-join-peak

1c320f5

clickhouse-gh Bot reviewed May 1, 2026

View reviewed changes

alexey-milovidov and others added 2 commits May 2, 2026 05:33

clickhouse-gh Bot reviewed May 2, 2026

View reviewed changes

alexey-milovidov and others added 6 commits May 2, 2026 20:54

Merge remote-tracking branch 'origin/master' into fix-spilling-join-peak

a79e447

Merge remote-tracking branch 'origin/master' into fix-spilling-join-peak

8ed6aef

Merge remote-tracking branch 'origin/master' into fix-spilling-join-peak

0025061

clickhouse-gh Bot reviewed May 4, 2026

View reviewed changes

antaljanosbenjamin self-assigned this May 4, 2026

antaljanosbenjamin reviewed May 4, 2026

View reviewed changes

clickhouse-gh Bot reviewed May 4, 2026

View reviewed changes

alexey-milovidov added 2 commits May 5, 2026 03:41

Merge remote-tracking branch 'origin/master' into fix-spilling-join-peak

4c6da6c

antaljanosbenjamin approved these changes May 5, 2026

View reviewed changes

antaljanosbenjamin added this pull request to the merge queue May 5, 2026

Merged via the queue into master with commit 9a0a52e May 5, 2026
165 checks passed

antaljanosbenjamin deleted the fix-spilling-join-peak branch May 5, 2026 09:14

robot-clickhouse-ci-1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label May 5, 2026

robot-ch-test-poll mentioned this pull request May 5, 2026

Cherry pick #103838 to 26.4: Keep max_bytes_before_external_join peak under the configured cap #104084

Merged

robot-ch-test-poll added a commit that referenced this pull request May 5, 2026

Merge pull request #104084 from ClickHouse/cherrypick/26.4/103838

eceed8d

Cherry pick #103838 to 26.4: Keep `max_bytes_before_external_join` peak under the configured cap

robot-clickhouse added a commit that referenced this pull request May 5, 2026

Backport #103838 to 26.4: Keep max_bytes_before_external_join peak …

deacf26

…under the configured cap

robot-ch-test-poll mentioned this pull request May 5, 2026

Backport #103838 to 26.4: Keep max_bytes_before_external_join peak under the configured cap #104085

Merged

robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label May 5, 2026

robot-clickhouse added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label May 5, 2026

alexey-milovidov added a commit that referenced this pull request May 5, 2026

Merge pull request #104085 from ClickHouse/backport/26.4/103838

5cb5208

Backport #103838 to 26.4: Keep `max_bytes_before_external_join` peak under the configured cap

scenario	result	peak
baseline (no limits)	200M rows, 4.2s	17.4 GiB
`max_memory_usage = 10G` only	fails (matches transcript)	9.31 GiB
`max_memory_usage = 10G` + `max_bytes_before_external_join = 5G`	succeeds, 28.2s	6.5 GiB

Sunbelt Computer Software

PL/B Language Development and Support

Uh oh!

Conversation

alexey-milovidov commented May 1, 2026 • edited by robot-clickhouse Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Documentation entry for user-facing changes

Version info

Uh oh!

clickhouse-gh Bot commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI Review

Summary

ClickHouse Rules

Final Verdict

Uh oh!

alexey-milovidov left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

clickhouse-gh Bot May 1, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot May 2, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot May 4, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot May 4, 2026

Choose a reason for hiding this comment

Uh oh!

antaljanosbenjamin May 4, 2026

Choose a reason for hiding this comment

Uh oh!

antaljanosbenjamin May 4, 2026

Choose a reason for hiding this comment

Uh oh!

antaljanosbenjamin May 4, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot May 4, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot commented May 5, 2026

LLVM Coverage Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

alexey-milovidov commented May 1, 2026 •

edited by robot-clickhouse

Loading

clickhouse-gh Bot commented May 1, 2026 •

edited

Loading