Change max_insert_threads default to auto (number of CPU cores)#109006
Change max_insert_threads default to auto (number of CPU cores)#109006alexey-milovidov wants to merge 6 commits into
Conversation
Previously `max_insert_threads` defaulted to `0`, which was treated as `1` (no parallel execution) for `INSERT SELECT`. Change its type to `MaxThreads` so the default `0` resolves to `auto` - the number of CPU cores available to the server (the same auto value as `max_threads`). The number of insert threads is still capped at query time by the available free memory via the existing `max_insert_threads_min_free_memory_per_thread` (4 GiB) limiter, and never exceeds `max_threads`. This parallelizes `INSERT SELECT` by default. To restore the previous single-threaded behavior, set `max_insert_threads` to `1` or set the `compatibility` setting to a version below `26.7`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… SELECT With the new default `max_insert_threads = auto`, `INSERT SELECT` runs in parallel by default, which changes these tests: - `03622_explain_indexes_distributed_index_analysis`: parallel insert produces a different number of parts, so the distributed index analysis reports a different number of granules per replica. - `03217_primary_index_memory_leak`: parallel insert multiplies the peak memory across insert threads, exceeding the test's `max_memory_usage = '100M'` and raising `MEMORY_LIMIT_EXCEEDED`. Both tests intentionally exercise single-threaded insert behavior, so pin `max_insert_threads = 1` to restore the deterministic single-threaded path; their references are unchanged. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=109006&sha=4eecd7967735a331e1968f519c648c6047cceb93&name_0=PR&name_1=Fast%20test Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`max_insert_threads` is now of type `MaxThreads`, so an explicit `max_insert_threads = 0` is stringified as the machine-dependent `auto(N)` in `system.query_log` (`SettingsImpl::dumpToMapColumn` writes `getValueString`, and `SettingFieldMaxThreads::toString` returns `auto(N)` for the auto state). `01275_parallel_mv` and `03403_mv_errors` select `Settings['max_insert_threads']` from `system.query_log` and their references expected the literal `0`, so the rows became non-deterministic `auto(N)` strings (which also reorder under `ORDER BY ALL`). Iterate over `max_insert_threads in [1, 5]` instead of `[0, 5]`. This keeps the non-parallel (`1`) and parallel (`5`) coverage the tests care about, while recording stable values. The previous default (`0`) was already treated as a single thread, so replacing it with an explicit `1` produces identical execution - only the recorded label changes from `0` to `1`, so the references are updated in place with no reordering. The auto default itself is covered by `04492_max_insert_threads_auto_default`. Addresses the review comment on `src/Core/Settings.cpp`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| When disabled, non-joined rows are processed by a single thread. | ||
| )", 0) \ | ||
| DECLARE(UInt64, max_insert_threads, 0, R"( | ||
| DECLARE(MaxThreads, max_insert_threads, 0, R"( |
There was a problem hiding this comment.
Changing max_insert_threads to MaxThreads also changes the value we send over the native TCP query-settings path. For changed settings, Connection::sendQuery uses SettingsWriteFormat::STRINGS_WITH_FLAGS, and BaseSettings::write serializes field.getValueString(), so a query or profile with max_insert_threads = 0 is sent as auto(N). During a rolling upgrade, older shards still define this setting as UInt64 and will try to parse that string as an integer, rejecting a setting value that was legal before the upgrade.
Please keep the wire value numeric for peers that can still have the old UInt64 setting, for example by avoiding the field-type change here or by adding a revision-aware compatibility path that sends 0 to older servers.
There was a problem hiding this comment.
Confirmed the mechanism and traced both ends, so we can scope this precisely before deciding.
When the auto value actually reaches the wire. Settings::write iterates allChanged() (BaseSettings::begin() → allChanged().begin()), so only changed settings are serialized. The new default (max_insert_threads = 0, unchanged) is therefore never sent — a query that just uses the new default transmits nothing for this setting, and the older shard falls back to its own default. The rolling-upgrade path for the feature itself is safe.
The value only goes on the wire when the setting is explicitly changed — e.g. a user or profile writes SET max_insert_threads = 0 / = 'auto'. In that case, for server_revision >= DBMS_MIN_REVISION_WITH_SETTINGS_SERIALIZED_AS_STRINGS, Connection::sendQuery uses SettingsWriteFormat::STRINGS_WITH_FLAGS, BaseSettings::write emits field.getValueString(), and SettingFieldMaxThreads::toString yields auto(N). On a pre-26.7 peer max_insert_threads is still UInt64, so BaseSettings::read takes the known-setting branch (setValueString → SettingFieldUInt64::parseFromString("auto(N)")) and throws — the lenient warningSettingNotFound path is only reached for settings the receiver doesn't know. So the concern is real, but its blast radius is exactly: an explicit max_insert_threads = 0/'auto' forwarded to a pre-26.7 server during the rolling-upgrade window. (An explicit positive value still stringifies numerically and is fine.)
Prior art. This is the same structural tradeoff every UInt64 → MaxThreads/SettingAutoWrapper migration carries; max_threads has behaved this way for years. The difference here is only that a version exists where max_insert_threads was UInt64, so auto(N) is unparseable to it — whereas max_threads was MaxThreads on both ends.
Decision for @alexey-milovidov (your call as author of this Backward Incompatible Change):
- Accept and document it — the affected case is narrow (explicit
=0/auto, transient upgrade window, and0was previously just an alias for single-threaded that users rarely set explicitly), matching how earlier auto migrations landed; or - Add a revision-aware numeric path so explicit
0/autois sent as0to peers below the 26.7 revision. That touches the settings wire path (getValueStringhas no peer-revision context today), so it's a larger, separately-tested change rather than a drive-by.
I did not revert the field type (that would remove the feature) or add a compatibility path unilaterally, since the choice between (1) and (2) is a design decision. Leaving this thread open for you.
The stateless-test settings randomizer injects an explicit `--max_insert_threads N` (randint(1, 3)) on every run. That marks `max_insert_threads` as changed, so the `SETTINGS compatibility = '26.6'` query in `04492_max_insert_threads_auto_default` can no longer reset the value to its pre-26.7 default of `1` (the `compatibility` mechanism only applies to unchanged settings). The fourth assertion then observed the randomized `2`/`3` instead of `1`, failing every stateless job that runs with randomization (both the parallel jobs and the flaky check). The CI harness confirmed the diagnosis directly: "Passes without randomization. Confirmed: the failure is caused by randomized settings." The test verifies the compile-time default and the `compatibility`-derived value of the very setting the randomizer overrides, so `no-random-settings` is strictly necessary here rather than avoidable. CI report: #109006 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
The sole hard red here — It is the same class @groeneai is allowlisting in #109042, but that PR's regex currently matches only the Everything on this PR's own changes is green; the remaining checks are still in progress. |
| When disabled, non-joined rows are processed by a single thread. | ||
| )", 0) \ | ||
| DECLARE(UInt64, max_insert_threads, 0, R"( | ||
| DECLARE(MaxThreads, max_insert_threads, 0, R"( |
There was a problem hiding this comment.
Changing max_insert_threads to MaxThreads also breaks the Server::Hello settings packet for mixed-version clients. TCPHandler sends changed session/profile settings with SettingsWriteFormat::STRINGS_WITH_FLAGS, so a server-side max_insert_threads = 0 now goes out as auto(N). A 26.6 client still reads this setting as UInt64 in Connection::receiveHello, so settings.read throws before the session starts. Because apply_settings_from_server is enabled by default, a rolling upgrade can lock out older clients as soon as a profile keeps the previously-valid 0 value.
Please preserve the on-wire value 0 for older peers/clients, or avoid changing the stored setting type here.
|
I have joined nothing to to this table but everything that im trying
leaning
…On Thu, Jul 2, 2026, 9:30 PM clickhouse-gh[bot] ***@***.***> wrote:
***@***.***[bot]* commented on this pull request.
------------------------------
In src/Core/Settings.cpp
<#109006 (comment)>
:
> @@ -269,21 +269,24 @@ Allow multiple threads to process non-joined rows from the right table in parall
This can speed up the non-joined phase when using the `parallel_hash` join algorithm with large tables.
When disabled, non-joined rows are processed by a single thread.
)", 0) \
- DECLARE(UInt64, max_insert_threads, 0, R"(
+ DECLARE(MaxThreads, max_insert_threads, 0, R"(
Changing max_insert_threads to MaxThreads also breaks the Server::Hello
settings packet for mixed-version clients. TCPHandler sends changed
session/profile settings with SettingsWriteFormat::STRINGS_WITH_FLAGS, so
a server-side max_insert_threads = 0 now goes out as auto(N). A 26.6
client still reads this setting as UInt64 in Connection::receiveHello, so
settings.read throws before the session starts. Because
apply_settings_from_server is enabled by default, a rolling upgrade can
lock out older clients as soon as a profile keeps the previously-valid 0
value.
Please preserve the on-wire value 0 for older peers/clients, or avoid
changing the stored setting type here.
—
Reply to this email directly, view it on GitHub
<#109006?email_source=notifications&email_token=CAE37U22XZDXFSGEFR4SAJD5C4LEHA5CNFSNUABKM5UWIORPF5TWS5BNNB2WEL2QOVWGYUTFOF2WK43UKJSXM2LFO4XTINRSGIZTGNZTHE2KM4TFMFZW63VKON2WE43DOJUWEZLEUVSXMZLOOSWGM33PORSXEX3DNRUWG2Y#pullrequestreview-4622337394>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/CAE37U2IGHR3OGX6LWCJRDL5C4LEHAVCNFSNUABEKJSXA33TNF2G64TZHM3DAMRUGYZTKOJ3JFZXG5LFHM2DOOBRHAZDIMZXGGQXMAQ>
.
Triage notifications, keep track of coding agent tasks and review pull
requests on the go with GitHub Mobile for iOS
<https://github.com/notifications/mobile/ios/CAE37UZ6QAWP7RGGX73QS3D5C4LEHA5CNFSNUABKM5UWIORPF5TWS5BNNB2WEL2QOVWGYUTFOF2WK43UKJSXM2LFO4XTINRSGIZTGNZTHE2KM4TFMFZW63VKON2WE43DOJUWEZLEUVSXMZLOOSVGM33PORSXEX3JN5ZQ>
and Android
<https://github.com/notifications/mobile/android/CAE37U3RSHSG7O7XCDCXPMD5C4LEHA5CNFSNUABKM5UWIORPF5TWS5BNNB2WEL2QOVWGYUTFOF2WK43UKJSXM2LFO4XTINRSGIZTGNZTHE2KM4TFMFZW63VKON2WE43DOJUWEZLEUVSXMZLOOSXGM33PORSXEX3BNZSHE33JMQ>.
Download it today!
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
LLVM Coverage Report
Changed lines: Changed C/C++ lines covered: 26/26 (100.00%) · Uncovered code |

Change the default value of the
max_insert_threadssetting from1(no parallel execution) toauto.Motivation
max_insert_threadsdefaulted to0, which was treated as1, soINSERT SELECTran single-threaded in open-source ClickHouse, while ClickHouse Cloud already set it to4for larger nodes. It is time for a more reasonable default.What changed
max_insert_threadsis now of typeMaxThreads, so its default0resolves toauto— the number of CPU cores available to the server (the same auto value asmax_threads), and is shown asauto(N)insystem.settings.max_insert_threads_min_free_memory_per_thread(4 GiB) limiter, and never exceedsmax_threads.0(and unset) now mean auto, consistent withmax_threads. Setmax_insert_threads = 1to runINSERT SELECTin a single thread (for example, to preserve the insertion order).SettingsChangesHistoryentry restores the previous default (1) whencompatibilityis set below26.7.Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
The default value of the
max_insert_threadssetting has been changed from1(no parallelism) toauto, which resolves to the number of CPU cores available to the server (reduced under memory pressure viamax_insert_threads_min_free_memory_per_thread). This parallelizesINSERT SELECTby default and can change the number of parts created by such queries and the order of inserted rows. To restore the previous behavior, setmax_insert_threadsto1, or setcompatibilityto a version below26.7.