Per-CPU memory budget for max_untracked_memory to avoid OOMs #104125
azat wants to merge 24 commits
Cloud Benchmarks reports showed some degradation in the benchmarks: https://c.house/eo5XwkUdhU
Introduces a per-CPU memory budget: a per-CPU slice that mirrors the existing per-thread `untracked_memory` accumulator on a per-CPU basis, so that `Σ untracked_memory` is bounded by O(ncpu) instead of O(nthreads). `total_memory_tracker` therefore becomes a true upper bound on real memory usage, which should prevent OOMs (real ones, from the kernel) caused by untracked memory. rseq is used for better performance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
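To make the O(ncpu) bound concrete, here is a minimal sketch of a per-CPU accumulator that flushes into a global counter once a slot reaches its limit. It is not the code from this PR: `PerCpuBudget`, `charge`, `trackedBytes` and `untrackedBytes` are hypothetical names, the CPU id is passed in explicitly (a real implementation would presumably take it from rseq or `sched_getcpu`), and plain atomics stand in for the rseq fast path.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch; none of these names come from the PR.
class PerCpuBudget
{
public:
    PerCpuBudget(int ncpu, int64_t limit_per_cpu)
        : slots(static_cast<size_t>(ncpu)), limit(limit_per_cpu)
    {
        assert(ncpu > 0 && limit_per_cpu > 0);
    }

    // Adds `delta` bytes (negative for frees) to the slot of `cpu`.
    // Once the slot reaches the limit in either direction, its whole
    // content is moved to the global counter.
    void charge(int cpu, int64_t delta)
    {
        auto & slot = slots[static_cast<size_t>(cpu)];
        int64_t now = slot.fetch_add(delta, std::memory_order_relaxed) + delta;
        if (now >= limit || now <= -limit)
        {
            int64_t taken = slot.exchange(0, std::memory_order_relaxed);
            tracked.fetch_add(taken, std::memory_order_relaxed);
        }
    }

    // Bytes already reported to the global counter.
    int64_t trackedBytes() const { return tracked.load(std::memory_order_relaxed); }

    // Bytes still sitting in per-CPU slots, not yet reported.
    int64_t untrackedBytes() const
    {
        int64_t sum = 0;
        for (const auto & slot : slots)
            sum += slot.load(std::memory_order_relaxed);
        return sum;
    }

private:
    std::vector<std::atomic<int64_t>> slots; // one accumulator per CPU
    int64_t limit;                           // per-CPU budget in bytes
    std::atomic<int64_t> tracked{0};         // stand-in for the global tracker
};
```

In this sketch each slot holds less than `limit` bytes in absolute value after a single-threaded `charge`, so the unreported total stays below `ncpu * limit` no matter how many threads exist. Under concurrency the slot can briefly overshoot by the size of in-flight deltas, which is one reason this is only an illustration of the bound.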
@azat Cloud Benchmarks report shows a big regression: https://c.house/eo5XwkUdhU
    {
        cpu = static_cast<int>(rseq_cpu_start());
        if (unlikely(static_cast<unsigned>(cpu) >= static_cast<unsigned>(slot_count)))
            return false;
If `rseq_ready` is true but `rseq_cpu_start` returns an invalid CPU id (-1/-2), this branch returns false immediately.
At that point `chargeAlloc` / `chargeFree` have already zeroed `state.pending_alloc` / `state.pending_free`, so this chunk is dropped from per-CPU accounting instead of falling back to the `sched_getcpu` + atomic path.
That weakens the budget in exactly the case where rseq is unavailable for a particular thread. Can we fall back to the non-rseq path here instead of returning early?
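The fallback being asked for could look roughly like the sketch below. It is an illustration only, not the PR's code: `chargeFast`, `chargeSlow`, `charge` and `total` are hypothetical helpers, `rseq_cpu` stands in for the value read from `rseq_cpu_start`, `fallback_cpu` stands in for the result of `sched_getcpu`, and an atomic add replaces the real restartable sequence.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

using Slots = std::vector<std::atomic<int64_t>>;

// Stand-in for the rseq fast path (hypothetical): refuses to charge
// when the CPU id is out of range, e.g. -1 or -2.
bool chargeFast(Slots & slots, int rseq_cpu, int64_t delta)
{
    if (static_cast<unsigned>(rseq_cpu) >= static_cast<unsigned>(slots.size()))
        return false;
    slots[static_cast<size_t>(rseq_cpu)].fetch_add(delta, std::memory_order_relaxed);
    return true;
}

// Stand-in for the non-rseq path (hypothetical): uses a CPU id obtained
// some other way and clamps a failed lookup to slot 0.
void chargeSlow(Slots & slots, int fallback_cpu, int64_t delta)
{
    size_t idx = fallback_cpu < 0 ? 0 : static_cast<size_t>(fallback_cpu) % slots.size();
    slots[idx].fetch_add(delta, std::memory_order_relaxed);
}

// The pending amount is never dropped: if the fast path declines,
// the same delta goes through the slow path.
void charge(Slots & slots, int rseq_cpu, int fallback_cpu, int64_t delta)
{
    if (!chargeFast(slots, rseq_cpu, delta))
        chargeSlow(slots, fallback_cpu, delta);
}

// Sum over all slots, to check that no bytes were lost.
int64_t total(const Slots & slots)
{
    int64_t sum = 0;
    for (const auto & slot : slots)
        sum += slot.load(std::memory_order_relaxed);
    return sum;
}
```

The point of the shape is that the caller hands over the delta once and it always lands in some slot, so zeroing the pending counters before the call no longer loses anything.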
LLVM Coverage Report
Changed lines: 96.10% (197/205)

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Per-CPU memory budget for max_untracked_memory to avoid OOMs
TBD