Add dynamic untracked memory limits for more precise memory tracking by serxa · Pull Request #64423 · ClickHouse/ClickHouse · GitHub
Skip to content

Add dynamic untracked memory limits for more precise memory tracking#64423

Merged
serxa merged 22 commits into
masterfrom
precise-memtracking
Jun 6, 2024
Merged

Add dynamic untracked memory limits for more precise memory tracking#64423
serxa merged 22 commits into
masterfrom
precise-memtracking

Conversation

@serxa

@serxa serxa commented May 26, 2024

Copy link
Copy Markdown
Member

The current memory tracker has an error of up to untracked_memory_limit = 4MB per thread. Consider a ClickHouse server running with 8GB RAM and a standard max_server_memory_usage_to_ram_ratio = 0.9 server setting. If there are 1000 threads (and every thread has 800KB on average) OOM can happen before any query is stopped with MEMORY_LIMIT_EXCEEDED.

We need total untracked memory to be proportional to current memory usage regardless of the number of threads. To make it fit into 10% RAM we want an error to be less than e.g. 6%. To achieve it we need untracked_memory_limit to be updated dynamically and be proportional to current thread memory usage for threads with small memory footprint:
untracked_memory_limit = clamp(already_tracked / 16, 4KB, 4MB).

Changelog category (leave one):

  • Not for changelog

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

This PR was reverted

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

CI Settings

NOTE: If your merge the PR with modified CI you MUST KNOW what you are doing
NOTE: Checked options will be applied if set before CI RunConfig/PrepareRunConfig step

  • Allow: Integration Tests
  • Allow: Stateless tests
  • Allow: Stateful tests
  • Allow: Unit tests
  • Allow: Performance tests
  • Allow: All with aarch64
  • Allow: All with ASAN
  • Allow: All with TSAN
  • Allow: All with Analyzer
  • Allow: All with Azure
  • Allow: Add your option here

  • Exclude: Fast test
  • Exclude: Integration Tests
  • Exclude: Stateless tests
  • Exclude: Stateful tests
  • Exclude: Performance tests
  • Exclude: All with ASAN
  • Exclude: All with TSAN
  • Exclude: All with MSAN
  • Exclude: All with UBSAN
  • Exclude: All with Coverage
  • Exclude: All with Aarch64

  • do not test (only style check)
  • disable merge-commit (no merge from master before tests)
  • disable CI cache (job reuse)
  • allow: batch 1 for multi-batch jobs
  • allow: batch 2
  • allow: batch 3
  • allow: batch 4, 5 and 6

@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-feature Pull request with new product feature label May 26, 2024
@robot-clickhouse-ci-1

robot-clickhouse-ci-1 commented May 26, 2024

Copy link
Copy Markdown
Contributor

@serxa serxa marked this pull request as ready for review May 31, 2024 08:47
@divanik divanik self-assigned this May 31, 2024
@serxa

serxa commented May 31, 2024

Copy link
Copy Markdown
Member Author

I'm not sure what's wrong with the Upgrade check. Does anyone have an idea? I've added a new setting to the history and it does not show up, but the check is still failing.

Changed settings are not reflected in settings changes history (see changed_settings.txt) - FAIL

   ┌─name─────────────────────────────────────┬─new_value─┬─old_value─┐
1. │ enable_vertical_final                    │ 0         │ 1         │
2. │ output_format_parquet_use_custom_encoder │ 1         │ 0         │
   └──────────────────────────────────────────┴───────────┴───────────┘

New settings are not reflected in settings changes history (see new_settings.txt) - FAIL

    ┌─name──────────────────────────────────────────┐
 1. │ azure_max_blocks_in_multipart_upload          │
 2. │ allow_experimental_join_condition             │
 3. │ prefer_external_sort_block_bytes              │
 4. │ cross_join_min_rows_to_compress               │
 5. │ cross_join_min_bytes_to_compress              │
 6. │ cast_string_to_dynamic_use_inference          │
 7. │ allow_experimental_dynamic_type               │
 8. │ allow_deprecated_error_prone_window_functions │
 9. │ input_format_force_null_for_omitted_fields    │
10. │ input_format_tsv_crlf_end_of_line             │
    └───────────────────────────────────────────────┘

@serxa

serxa commented Jun 3, 2024

Copy link
Copy Markdown
Member Author

Okay, merge master and CI tests rerun fixed the problem. Tests 01676_clickhouse_client_autocomplete and 02456_progress_tty seem to be flaky. I think this PR is now ready to be merged.

@serxa serxa requested a review from divanik June 3, 2024 13:30
Comment thread src/Core/SettingsChangesHistory.h Outdated
Comment thread src/Common/CurrentMemoryTracker.cpp
Comment thread tests/queries/0_stateless/01017_uniqCombined_memory_usage.sql Outdated
Comment thread tests/integration/test_failed_async_inserts/test.py Outdated
Comment thread tests/integration/test_settings_constraints_distributed/test.py Outdated
@serxa serxa enabled auto-merge June 6, 2024 12:57
@serxa serxa added this pull request to the merge queue Jun 6, 2024
Merged via the queue into master with commit 9ec1306 Jun 6, 2024
@serxa serxa deleted the precise-memtracking branch June 6, 2024 19:02
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 6, 2024
baibaichen added a commit to Kyligence/gluten that referenced this pull request Jun 7, 2024
@Algunenano

Copy link
Copy Markdown
Member

@Algunenano

Copy link
Copy Markdown
Member

baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jun 7, 2024
baibaichen added a commit to Kyligence/gluten that referenced this pull request Jun 7, 2024
azat added a commit to azat/ClickHouse that referenced this pull request Mar 31, 2026
The current memory tracker has an error of up to `untracked_memory_limit
= 4MB` per thread. Consider a ClickHouse server running with 8GB RAM and
a standard `max_server_memory_usage_to_ram_ratio = 0.9` server setting.
If there are 1000 threads (and every thread has 800KB on average) OOM
can happen before any query is stopped with `MEMORY_LIMIT_EXCEEDED`.

We need total untracked memory to be proportional to current memory
usage regardless of the number of threads. To make it fit into 10% RAM
we want an error to be less than e.g. 6%. To achieve it we need
`untracked_memory_limit` to be updated dynamically and be proportional
to current thread memory usage for threads with small memory footprint:
`untracked_memory_limit = clamp(already_tracked / 16, 4KB, 4MB)`

Resubmit of: ClickHouse#64423
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
azat added a commit to azat/ClickHouse that referenced this pull request Apr 18, 2026
The current memory tracker has an error of up to `untracked_memory_limit
= 4MB` per thread. Consider a ClickHouse server running with 8GB RAM and
a standard `max_server_memory_usage_to_ram_ratio = 0.9` server setting.
If there are 1000 threads (and every thread has 800KB on average) OOM
can happen before any query is stopped with `MEMORY_LIMIT_EXCEEDED`.

We need total untracked memory to be proportional to current memory
usage regardless of the number of threads. To make it fit into 10% RAM
we want an error to be less than e.g. 6%. To achieve it we need
`untracked_memory_limit` to be updated dynamically and be proportional
to current thread memory usage for threads with small memory footprint:
`untracked_memory_limit = clamp(already_tracked / 16, 4KB, 4MB)`

Resubmit of: ClickHouse#64423
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
azat added a commit to azat/ClickHouse that referenced this pull request Apr 22, 2026
The current memory tracker has an error of up to `untracked_memory_limit
= 4MB` per thread. Consider a ClickHouse server running with 8GB RAM and
a standard `max_server_memory_usage_to_ram_ratio = 0.9` server setting.
If there are 1000 threads (and every thread has 800KB on average) OOM
can happen before any query is stopped with `MEMORY_LIMIT_EXCEEDED`.

We need total untracked memory to be proportional to current memory
usage regardless of the number of threads. To make it fit into 10% RAM
we want an error to be less than e.g. 6%. To achieve it we need
`untracked_memory_limit` to be updated dynamically and be proportional
to current thread memory usage for threads with small memory footprint:
`untracked_memory_limit = clamp(already_tracked / 16, 4KB, 4MB)`

Resubmit of: ClickHouse#64423
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
azat added a commit to azat/ClickHouse that referenced this pull request Apr 22, 2026
The current memory tracker has an error of up to `untracked_memory_limit
= 4MB` per thread. Consider a ClickHouse server running with 8GB RAM and
a standard `max_server_memory_usage_to_ram_ratio = 0.9` server setting.
If there are 1000 threads (and every thread has 800KB on average) OOM
can happen before any query is stopped with `MEMORY_LIMIT_EXCEEDED`.

We need total untracked memory to be proportional to current memory
usage regardless of the number of threads. To make it fit into 10% RAM
we want an error to be less than e.g. 6%. To achieve it we need
`untracked_memory_limit` to be updated dynamically and be proportional
to current thread memory usage for threads with small memory footprint:
`untracked_memory_limit = clamp(already_tracked / 16, 4KB, 4MB)`

Resubmit of: ClickHouse#64423
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
azat added a commit to azat/ClickHouse that referenced this pull request Apr 23, 2026
The current memory tracker has an error of up to `untracked_memory_limit
= 4MB` per thread. Consider a ClickHouse server running with 8GB RAM and
a standard `max_server_memory_usage_to_ram_ratio = 0.9` server setting.
If there are 1000 threads (and every thread has 800KB on average) OOM
can happen before any query is stopped with `MEMORY_LIMIT_EXCEEDED`.

We need total untracked memory to be proportional to current memory
usage regardless of the number of threads. To make it fit into 10% RAM
we want an error to be less than e.g. 6%. To achieve it we need
`untracked_memory_limit` to be updated dynamically and be proportional
to current thread memory usage for threads with small memory footprint:
`untracked_memory_limit = clamp(already_tracked / 16, 4KB, 4MB)`

Resubmit of: ClickHouse#64423
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants