Add dynamic untracked memory limits for more precise memory tracking by azat · Pull Request #101251 · ClickHouse/ClickHouse · GitHub
Skip to content

Add dynamic untracked memory limits for more precise memory tracking#101251

Closed
azat wants to merge 4 commits into
ClickHouse:masterfrom
azat:precise-memtracking
Closed

Add dynamic untracked memory limits for more precise memory tracking#101251
azat wants to merge 4 commits into
ClickHouse:masterfrom
azat:precise-memtracking

Conversation

@azat

@azat azat commented Mar 30, 2026

Copy link
Copy Markdown
Member

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Add dynamic untracked memory limits for more precise memory tracking

Original comment from @serxa:

The current memory tracker has an error of up to untracked_memory_limit = 4MB per thread. Consider a ClickHouse server running with 8GB RAM and a standard max_server_memory_usage_to_ram_ratio = 0.9 server setting. If there are 1000 threads (and every thread has 800KB on average) OOM can happen before any query is stopped with MEMORY_LIMIT_EXCEEDED.

We need total untracked memory to be proportional to current memory usage regardless of the number of threads. To make it fit into 10% RAM we want an error to be less than e.g. 6%. To achieve it we need untracked_memory_limit to be updated dynamically and be proportional to current thread memory usage for threads with small memory footprint: untracked_memory_limit = clamp(already_tracked / 16, 4KB, 4MB)

Resubmit of: #64423 (by @serxa)

Note: marked as Performance Improvement for run perf tests and memory accounting can be also "performance imporvement"

@azat azat requested a review from serxa March 30, 2026 16:47
@clickhouse-gh

clickhouse-gh Bot commented Mar 30, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-improvement Pull request with some product improvements label Mar 30, 2026
@azat azat added the memory When memory usage is higher than expected label Mar 30, 2026
@azat azat force-pushed the precise-memtracking branch 2 times, most recently from 9d23adc to f1f1f1b Compare April 1, 2026 09:23
@alexey-milovidov

Copy link
Copy Markdown
Member

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

@alexey-milovidov

Copy link
Copy Markdown
Member

The MSan stress test failure (MemorySanitizer: use-of-uninitialized-value, STID 4179-5154 or 4148-3044) is a known pre-existing issue unrelated to this PR. Fix: #102158

@alexey-milovidov

Copy link
Copy Markdown
Member

The Can't adjust last granule error in CI is a known issue. The fix is in #101641

@azat

azat commented Apr 17, 2026

Copy link
Copy Markdown
Member Author

@serxa PTAL

@serxa serxa self-assigned this Apr 17, 2026
@azat

azat commented Apr 18, 2026

Copy link
Copy Markdown
Member Author

Here is also a way for untracked_memory introspection - #103065

@azat azat force-pushed the precise-memtracking branch from 8ad391b to b5433a6 Compare April 18, 2026 22:07
@azat

azat commented Apr 22, 2026

Copy link
Copy Markdown
Member Author

I've looked at performance regressions:

  • group_by_multiple_strings.5 - select key1_8, key2_8, sum(value) from test_group_by_strings group by key1_8, key2_8 settings max_threads = 32 - it is just enters MemoryTracker more often, because this query before uses less then 4MiB (peak was 399840), so before this patch it never enters MemoryTracker, now it enters few times, we can add metric I guess (since if we enter MemoryTracker, it should not be that a big deal)
  • functions_geo.1 - SELECT count() FROM numbers(1000000) WHERE NOT ignore(geohashDecode(toString(number % 1000000))) - same (peak on some thread was 131335)

So I think this is acceptable, I'm going to add profile event (hope it will not slows down even more)

Note: I've tried to add metrics for all allocations, and it slows down - #85545, hence only when MemoryTracker is called

@azat azat force-pushed the precise-memtracking branch from b5433a6 to aa50660 Compare April 22, 2026 16:10
@clickhouse-gh clickhouse-gh Bot added pr-performance Pull request with some performance improvements and removed pr-improvement Pull request with some product improvements labels Apr 22, 2026
Comment thread src/Common/CurrentMemoryTracker.cpp
Comment thread src/Common/ProfileEvents.cpp Outdated
@azat azat force-pushed the precise-memtracking branch 3 times, most recently from 7efdb6c to 10b62fe Compare April 22, 2026 21:51
Comment thread src/Common/CurrentMemoryTracker.cpp
serxa and others added 3 commits April 23, 2026 10:23
The current memory tracker has an error of up to `untracked_memory_limit
= 4MB` per thread. Consider a ClickHouse server running with 8GB RAM and
a standard `max_server_memory_usage_to_ram_ratio = 0.9` server setting.
If there are 1000 threads (and every thread has 800KB on average) OOM
can happen before any query is stopped with `MEMORY_LIMIT_EXCEEDED`.

We need total untracked memory to be proportional to current memory
usage regardless of the number of threads. To make it fit into 10% RAM
we want an error to be less than e.g. 6%. To achieve it we need
`untracked_memory_limit` to be updated dynamically and be proportional
to current thread memory usage for threads with small memory footprint:
`untracked_memory_limit = clamp(already_tracked / 16, 4KB, 4MB)`

Resubmit of: ClickHouse#64423
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
@azat azat force-pushed the precise-memtracking branch from 10b62fe to 1e19934 Compare April 23, 2026 08:24
if (current_thread->untracked_memory < -current_thread->untracked_memory_limit)
/// Use `max_untracked_memory` (not `untracked_memory_limit`) to create hysteresis and avoid track/untrack cycles
if (current_thread->untracked_memory < -current_thread->max_untracked_memory)
{

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using max_untracked_memory as the flush threshold for deallocations can permanently overcount memory after short-lived spikes.

Concrete trace:

  1. min_untracked_memory = 4KiB, max_untracked_memory = 4MiB.
  2. Allocate 1MiB -> allocation flushes (because current limit is small), so memory_tracker increases by 1MiB.
  3. Free 1MiB -> untracked_memory = -1MiB, but this branch does not flush because -1MiB > -max_untracked_memory.
  4. Next alloc/free cycles of 1MiB keep untracked_memory oscillating between 0 and -1MiB, so the previously tracked 1MiB is never released to parent trackers.

This means query/global trackers can stay overestimated by up to ~max_untracked_memory per thread for a long time, which can trigger premature MEMORY_LIMIT_EXCEEDED under concurrency. I think the deallocation flush condition should stay tied to the current dynamic limit (or another dynamic lower bound), not a fixed max_untracked_memory.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, valid point

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think initially different bounds for alloc/dealloc were introduced to avoid too many tracker invocations in corner cases, but now it seems to be unavoidable when I think about it

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also thought about this for awhile and decided to try this option, plus, I don't see too much benefits - https://pastila.nl/?009ef774/015773a63df85587f4c9889ec3f9186f#0Ruws2sMXI1d4WKqcrjt0A==GCM

And also maybe we can speedup MemoryTracker - #103464

@@ -0,0 +1,3 @@
profiles:
default:
min_untracked_memory: 4Mi

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All updated tests in this PR pin min_untracked_memory back to 4Mi, so the new default dynamic behavior (4KiB lower bound) is not covered.

Could we add at least one test that keeps defaults and verifies the new path (reduced untracked error / earlier memory-limit reaction) to protect against regressions?

@clickhouse-gh

clickhouse-gh Bot commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 83.80% 83.80% +0.00%
Functions 91.10% 91.10% +0.00%
Branches 76.20% 76.20% +0.00%

Changed lines: 100.00% (62/62) · Uncovered code

Full report · Diff report

@vadimskipin

Copy link
Copy Markdown
Collaborator

Take a look at this code: https://github.com/ClickHouse/vskipin-experiments/blob/master/src/util/memory-tracker.h
I believe it resolves the problem with untracked memory completely.

@azat azat marked this pull request as draft May 4, 2026 15:19
@azat

azat commented May 28, 2026

Copy link
Copy Markdown
Member Author

Here is an alternative - #106055

@azat

azat commented Jul 3, 2026

Copy link
Copy Markdown
Member Author

Alternative has been merged and it looks better

@azat azat closed this Jul 3, 2026
@azat azat deleted the precise-memtracking branch July 3, 2026 13:31
@serxa

serxa commented Jul 3, 2026

Copy link
Copy Markdown
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

memory When memory usage is higher than expected pr-performance Pull request with some performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants