Fix heap-use-after-free in executeAggregateMultiply self-merge#103536
Conversation
The exponentiation-by-squaring loop in `FunctionBinaryArithmetic::
executeAggregateMultiply` squared the running aggregate state with
function->merge(vec_from[i], vec_from[i], arena.get());
passing the same state as both source and destination. For aggregate
functions whose `merge` implementation appends the source's internal
storage into the destination (`QuantileExact`, `groupArray`,
`groupUniqArray`, `windowFunnel`, `sequenceMatch`, etc.), the
destination reallocates its `PODArray` mid-merge and the source
iterators (referencing the same, now-freed buffer) are read by the
trailing `memcpy` in `PODArray::insert_assume_reserved` -> heap-use-
after-free. The hazard was already documented at
`src/Common/PODArray.h:482`: "Do not insert into the array a piece of
itself. Because with the resize, the iterators on themselves can be
invalidated."
Fix: copy `vec_from` into an independent column before each squaring
step and merge the copy into `vec_from`, so source and destination no
longer alias.
Reproduction (Address Sanitizer): caught by master fuzzer
(`Stress test (experimental, serverfuzz, arm_asan_ubsan, s3)`,
STID 0988-40af, sha 20f4b50):
SELECT finalizeAggregation(state * 4)
FROM (SELECT quantilesExactState(0.5)(number) AS state FROM numbers(50));
This deterministically triggers the UAF on the previous code and
passes with the fix.
CI report:
https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=20f4b50bb190f87122f53eee1f99887ec9d8d771&name_0=MasterCI&name_1=Stress%20test%20%28experimental%2C%20serverfuzz%2C%20arm_asan_ubsan%2C%20s3%29
|
cc @rschu1ze @alexey-milovidov — could you review this? It fixes a heap-use-after-free in the exponentiation-by-squaring path of |
Pre-PR validation gate (TASK.md Phase 4 step 9)
Session: |
|
Workflow [PR], commit [a29f7c2] Summary: ✅ AI ReviewSummaryThis PR fixes self-merge undefined behavior in Final VerdictStatus: ✅ Approve |
alexey-milovidov
left a comment
There was a problem hiding this comment.
Yes, this is a good fix.
|
@groeneai, please send an additional PR that will add an assert into IAggregateFunction::merge that it isn't called with the same state. The assert should be in a single place, e.g., it can go into IAggregateFunctionHelper. Or if it's not possible, change the virtual method merge to mergeImpl everywhere and make the merge method to check the assert and call mergeImpl - something on that theme. |
|
@groeneai, the changelog category is not recognized. |
|
@alexey-milovidov thanks for the review!
#103543 will narrow to just the chassert commit once this PR lands. |
This PR's `executeAggregateMultiply` rework copies the aggregate state into an independent column before each squaring step, so source and destination no longer alias on entry to `merge`. For aggregate functions whose `merge` reallocates and then reads the source (e.g. `quantilesExact`, `groupArray`, `groupUniqArray`, `windowFunnel`, `sequenceMatch`), this fixes the heap-use-after-free. `ReservoirSampler::merge` (used by `medianState` / `quantileState`) had been guarded since ClickHouse#79031 with an early-return when `this == &b`. That guard is unreachable from `executeAggregateMultiply` now: the squaring step never feeds the same state as both arguments. Squaring becomes a real merge instead of a no-op, so `state * N` for a ReservoirSampler- backed aggregate now actually accumulates `N` copies of the input rather than collapsing to a single copy. `03406_reservoir_sample_self_merging` was added in ClickHouse#79031 with reference values that match the no-op semantics: 10 * medianState(numbers(10)) -> 4.5 (100 inserts <= 8192, exact) 10 * medianState(numbers(1000)) -> 499.5 (was no-op; same as state) 10 * medianState(numbers(100000)) -> 50501.5 (already over reservoir) Under the corrected semantics the middle case crosses the 8192-sample reservoir cap (10 x 1000 = 10000 inserts) and the deterministic seed yields `495` for the median. The first and third cases are unchanged: the small one stays under the cap, the large one was already saturated. The test still verifies the absence of UAF / `LOGICAL_ERROR` on `state * N` for ReservoirSampler-backed aggregates - it's a regression test for ClickHouse#78299 - and only the boundary value moves. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=103543&sha=f7dd195af45870446b0c48654753c337438e90b2&name_0=PR
Wrap the public `merge` method of `IAggregateFunction` in a non-virtual NVI entry point that asserts `place != rhs` (in debug or sanitizer builds) before forwarding to a new pure-virtual `mergeImpl`. The same hazard motivated PR ClickHouse#103536 (heap-use-after-free in `executeAggregateMultiply` self-merge, STID 0988-40af). For aggregate functions whose `merge` reallocates the destination's internal storage and then reads from the source (`quantilesExact`, `groupArray`, `groupUniqArray`, `windowFunnel`, `sequenceMatch`, …), passing the same pointer as both source and destination is undefined. The new `chassert(place != rhs, …)` aborts on this in DEBUG_OR_SANITIZER_BUILD, catching future occurrences of the entire bug class at the call site instead of letting them surface as memory corruption. Same wrapper added for the parallel `merge` overload (with `ThreadPool`) and for the `mergeBatch` / `mergeAndDestroyBatch` paths in `IAggregateFunctionHelper` (which devirtualise the call and so bypass the public wrapper). 71 concrete aggregate-function overrides renamed from `merge` to `mergeImpl`. Callers are unchanged — they still call `merge`, which now performs the check. Closes the follow-up requested by @alexey-milovidov on PR ClickHouse#103536.
|
@alexey-milovidov - thanks for the review. Pushing one more commit to address the regression in Why the value moved. The original test was added in #79031 with reference values that match the no-op semantics. The middle case crosses the 8192-sample reservoir cap ( The test still serves its original purpose - regression coverage for #78299 (no UAF /
Session id: |
|
[2026-05-10] Recurrence on master + chain block — quick check-in. The bug this PR fixes hit master again today: STID 0988-3351 (AST fuzzer, arm_asan_ubsan, commit This PR is approved and all AST fuzzer + build checks are green. The only failing required check is cc @alexey-milovidov — could you either (a) merge this despite the unrelated function_prop_fuzzer failure (the bug is genuinely recurring on master and |
Master AST fuzzer hit `Segmentation fault (STID: 0988-2212)` on `AST fuzzer (amd_msan)` for `groupArrayMovingSum` on PR ClickHouse#100394 and on `AST fuzzer (arm_asan_ubsan)` / `amd_debug` for PRs ClickHouse#101033, ClickHouse#101062, ClickHouse#104531 — same bug class as STID 0988-40af / 0988-3351. The fuzzer mutated a query into `groupArrayMovingSumState(...) * N`, which goes through `FunctionBinaryArithmetic::executeAggregateMultiply`. The exponentiation-by-squaring loop called `function->merge(vec_from[i], vec_from[i], arena.get())`, and `MovingImpl::merge` then did `cur_elems.value.insert(rhs_elems.value.begin(), rhs_elems.value.end(), arena)` where `rhs_elems` aliased `cur_elems.value`, so the source iterators referenced the buffer that `insertPrepare` was about to reallocate. The fix in this PR (copy `vec_from` into an independent column before each squaring step) eliminates the self-alias at the call site, so `MovingImpl::merge` is never invoked with `place == rhs` from `executeAggregateMultiply` anymore. The new test exercises every `MovingImpl` instantiation reachable from `groupArrayMovingSum` / `groupArrayMovingAvg`: - integer accumulator, - decimal accumulator (the variant flagged on PR ClickHouse#101062), - windowed parameterisation (`LimitNumElements::value = true`), - the `agg_state_is_const` path through `arrayJoin([state, state])`. Verified locally on Debug build that pre-fix the squaring path aborts with the PODArray `assertNotIntersects` assertion (Server `<Fatal>`), while `state * 1` (no squaring branch) returns `50` — matching the single-state baseline used to derive the reference values. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100394&sha=7309f23dc7ccee494b41416c8fff18c02d0697a7&name_0=PR&name_1=AST%20fuzzer%20%28amd_msan%29
|
Additional evidence — the bug class flagged by @alexey-milovidov on #100394 (STID 0988-2212, I just pushed
Reproduction is deterministic on master Debug build: SELECT length(finalizeAggregation(state * 2))
FROM (SELECT groupArrayMovingSumState(number) AS state FROM numbers(50));aborts on the |
`executeAggregateMultiply` previously constructed `column_temp` once outside the squaring loop and reset it between iterations with `popBack(column_temp->size())`. `ColumnAggregateFunction::popBack` destroys the popped aggregate states but does not release the column's arena memory (the `Arenas` list is monotonic and is only freed when the column itself is destroyed). Each squaring iteration's `insertFrom` therefore allocated new state memory in the same arena, causing the temporary footprint to grow with the multiplier instead of staying bounded by one copy of the input column. Construct `column_temp` inside the squaring branch so the column (and its arena) is destroyed at the end of each iteration. Peak temporary footprint is now bounded by `size * state_size`. Caught by the clickhouse-gh[bot] inline review on ClickHouse#103536 .
LLVM Coverage Report
Changed lines: 100.00% (12/12) · Uncovered code |
Cherry pick #103536 to 25.8: Fix heap-use-after-free in `executeAggregateMultiply` self-merge
…eMultiply` self-merge
Cherry pick #103536 to 26.2: Fix heap-use-after-free in `executeAggregateMultiply` self-merge
…eMultiply` self-merge
Cherry pick #103536 to 26.3: Fix heap-use-after-free in `executeAggregateMultiply` self-merge
…eMultiply` self-merge
Cherry pick #103536 to 26.4: Fix heap-use-after-free in `executeAggregateMultiply` self-merge
…eMultiply` self-merge
…eMultiply` self-merge (#105212) Co-authored-by: robot-clickhouse <robot-clickhouse@users.noreply.github.com>
…eMultiply` self-merge (#105218) Co-authored-by: robot-clickhouse <robot-clickhouse@users.noreply.github.com>
Backport #103536 to 26.3: Fix heap-use-after-free in `executeAggregateMultiply` self-merge
Backport #103536 to 26.2: Fix heap-use-after-free in `executeAggregateMultiply` self-merge
Wrap the public `merge` method of `IAggregateFunction` in a non-virtual NVI entry point that asserts `place != rhs` (in debug or sanitizer builds) before forwarding to a new pure-virtual `mergeImpl`. The same hazard motivated PR ClickHouse#103536 (heap-use-after-free in `executeAggregateMultiply` self-merge, STID 0988-40af). For aggregate functions whose `merge` reallocates the destination's internal storage and then reads from the source (`quantilesExact`, `groupArray`, `groupUniqArray`, `windowFunnel`, `sequenceMatch`, …), passing the same pointer as both source and destination is undefined. The new `chassert(place != rhs, …)` aborts on this in DEBUG_OR_SANITIZER_BUILD, catching future occurrences of the entire bug class at the call site instead of letting them surface as memory corruption. Same wrapper added for the parallel `merge` overload (with `ThreadPool`) and for the `mergeBatch` / `mergeAndDestroyBatch` paths in `IAggregateFunctionHelper` (which devirtualise the call and so bypass the public wrapper). 71 concrete aggregate-function overrides renamed from `merge` to `mergeImpl`. Callers are unchanged — they still call `merge`, which now performs the check. Closes the follow-up requested by @alexey-milovidov on PR ClickHouse#103536.
…as-assert-on-103536 Add IAggregateFunction::merge self-aliasing chassert (follow-up to #103536)

Motivation
Master fuzzer caught a heap-use-after-free in
FunctionBinaryArithmetic::executeAggregateMultiply(STID 0988-40af) onStress test (experimental, serverfuzz, arm_asan_ubsan, s3), sha20f4b50bb190f87122f53eee1f99887ec9d8d771:CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=20f4b50bb190f87122f53eee1f99887ec9d8d771&name_0=MasterCI&name_1=Stress%20test%20%28experimental%2C%20serverfuzz%2C%20arm_asan_ubsan%2C%20s3%29
Root cause
The exponentiation-by-squaring loop in
executeAggregateMultiplydoubled the running aggregate state by callingfunction->merge(vec_from[i], vec_from[i], arena.get());passing the same state as both source and destination. For aggregate functions whose
mergeimplementation appends the source's internal storage into the destination's (QuantileExact,groupArray,groupUniqArray,windowFunnel,sequenceMatch, etc.), the destination'sPODArray::insert(begin, end)reallocates mid-merge, and the source iterators (referencing the same, now-freed buffer) are read by the trailingmemcpyinPODArray::insert_assume_reserved.The hazard was already documented in
src/Common/PODArray.h:482:Fix
Copy
vec_frominto an independent column before each squaring step and merge the copy intovec_from. Source and destination no longer alias.Reproduction (deterministic, ASan)
Without the fix this aborts every run with a heap-use-after-free; with the fix the regression test (
tests/queries/0_stateless/04117_aggregate_state_multiply_self_merge_uaf) passes 50/50 ASan iterations.Pre-PR validation
quantilesExactState(0.5)(number) FROM numbers(50) * N(N >= 2) UAF every time on ASan.func->merge(state, state)aliases src=dst;PODArray::insert(begin,end)reallocates andinsert_assume_reserved'smemcpythen reads from the freed source buffer.vec_fromto an independentcolumn_tempbefore each squaring step.04117_aggregate_state_multiply_self_merge_uafexercises the squaring path across multipliers (1,2,3,4,5,7,8,16,256),quantilesExactState(the aggregate the fuzzer caught),groupArrayState(length grows), and theagg_state_is_constbranch. No existing test weakened.executeAggregateMultiplyitself, covering every aggregate function. Sibling paths (executeAggregateAddition,executeDateTime64Subtraction) take two distinct sources and were verified to be safe.Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Fix heap-use-after-free in multiplying an
AggregateFunctionstate by an integer (e.g.quantilesExactState(...) * N). The exponentiation-by-squaring loop merged the state with itself, which is undefined when the aggregate function'smergereallocates its internal storage. Closes STID 0988-40af.Documentation entry for user-facing changes
Version info
26.5.1.65226.4.3.32,26.3.11.29,26.2.19.38,25.8.24.19