Fix S3 settings configuration priorities#100975
Conversation
Fix the priority order for S3 settings to be: 1. Global `<s3>` config section (lowest priority) 2. `<storage_configuration>` disk settings 3. Table-level, user profile, and query-level settings (highest priority) Two bugs were present: In `S3ObjectStorage::applyNewSettings`, endpoint settings from the global `<s3>` section were applied after disk config settings, overriding them. Swapped the order so disk config takes priority over global `<s3>`. In `StorageS3` configuration (`fromAST`), endpoint settings from the global `<s3>` section were applied last without re-applying profile/query settings on top. Added `updateFromSettings` after endpoint application so that user/profile/query-level settings retain highest priority. Closes: #50029 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| } | ||
|
|
||
| /// Re-apply user/profile/query-level settings on top, so they take priority over the global <s3> config section. | ||
| s3_settings->request_settings.updateFromSettings(context->getSettingsRef(), /* if_changed */ true); |
There was a problem hiding this comment.
updateFromSettings is called here with the default validate_settings=true, so this path now validates upload settings unconditionally.
That changes behavior when s3_validate_request_settings=0: before this patch, validation was gated by that setting (loadFromConfigForObjectStorage(..., validate_settings) and the final conditional validateUploadSettings), but now invalid s3_* request settings can still throw here even when validation is explicitly disabled.
Please pass the setting explicitly, e.g.:
s3_settings->request_settings.updateFromSettings(
context->getSettingsRef(),
/* if_changed */ true,
context->getSettingsRef()[Setting::s3_validate_request_settings]);…tings` Without this, the re-applied `updateFromSettings` call would unconditionally validate upload settings, breaking the `s3_validate_request_settings = 0` path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verifies that query-level SETTINGS (e.g., `s3_max_single_part_upload_size`) properly override session-level defaults when using the `s3()` table function. This exercises the `fromAST` code path where user/query-level settings must take priority over global `<s3>` endpoint configuration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| modified_settings->request_settings.proxy_resolver = DB::ProxyConfigurationResolverProvider::getFromOldSettingsFormat( | ||
| ProxyConfiguration::protocolFromString(uri.uri.getScheme()), config_prefix, config); | ||
|
|
||
| /// Apply global <s3> endpoint settings first (lowest priority). |
There was a problem hiding this comment.
Thanks for fixing the priority order here. Please also add a regression test for the DiskS3 path (S3ObjectStorage::applyNewSettings) in addition to the new fromAST test.
This PR fixes two independent paths, but 04065_s3_settings_query_override covers only s3 table function parsing. Without a DiskS3/reload test, the storage_configuration vs global <s3> precedence can regress silently.
The test `04065_s3_settings_query_override` was passing on master because no global `<s3>` endpoint config existed with `max_single_part_upload_size` set. Without such config, the bug (endpoint settings overriding query-level settings) could not manifest. Add `tests/config/config.d/s3_settings_override.xml` with a matching endpoint that sets `max_single_part_upload_size` to 100Mi. Now on master the endpoint config overrides the query-level setting (10000) causing single-part upload, while with the fix the query-level setting is properly re-applied after endpoint config. Also remove the now-redundant session-level SET and update comments. https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100975&sha=7a6d5529ced899cb58397d66d36420a5d03fad43&name_0=PR&name_1=Bugfix%20validation%20%28functional%20tests%29 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add `04068_s3_disk_settings_override` test that covers the `S3ObjectStorage::applyNewSettings` code path. The test creates a table on an S3 disk configured with a small `max_single_part_upload_size`, while a global `<s3>` endpoint section sets a large value for the same endpoint. The test verifies that multipart upload is used, confirming that disk config takes priority over global `<s3>` settings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…st config These config files were created but never symlinked in `install.sh`, causing bugfix validation to fail (the test passed on master because the global `<s3>` endpoint config was not loaded). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…efix The `loadFromConfigForObjectStorage` passes `"s3_"` as the setting name prefix when constructing `S3RequestSettings`, so disk config keys must use the `s3_` prefix (e.g. `s3_max_single_part_upload_size`) to be recognized. Without the prefix, the setting was silently ignored and the global `<s3>` endpoint value took precedence, causing the test to output `0 0 0` instead of `1 1 1`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…070` to avoid number conflict Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…le test
The bugfix validation failed because both tests passed on the old binary
("Failed to reproduce the bug"):
- Test `04070_s3_disk_settings_override` passed because
`S3ObjectStorage::applyNewSettings` is only called during config reload,
not during initial disk creation. Adding `SYSTEM RELOAD CONFIG` before
the INSERT triggers `applyNewSettings`, which on the old code applies
the global endpoint's 100 Mi `max_single_part_upload_size` AFTER the
disk config's 10 000 value, overriding it. With the fix, the disk
config takes priority.
- Test `04065_s3_settings_query_override` passed because
`S3ObjectStorage::writeObject` independently re-applies query-level
settings via `updateFromSettings`, compensating for the `fromAST` bug.
Since this test cannot demonstrate the bug through the write path,
remove it. The `fromAST` fix remains correct as defense-in-depth for
code paths that read the settings without the `writeObject` override
(e.g. reads, copies).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| @@ -614,6 +614,12 @@ void S3StorageParsedArguments::fromAST(ASTs & args, ContextPtr context, bool wit | |||
| s3_settings->request_settings.updateIfChanged(endpoint_settings->request_settings); | |||
| } | |||
|
|
|||
| /// Re-apply user/profile/query-level settings on top, so they take priority over the global <s3> config section. | |||
| s3_settings->request_settings.updateFromSettings( | |||
There was a problem hiding this comment.
fromAST precedence path (updateFromSettings re-application), but the only remaining regression test (04070_s3_disk_settings_override) exercises DiskS3/applyNewSettings after SYSTEM RELOAD CONFIG. The earlier fromAST test was removed, so this path now has no direct regression coverage.
Please add a targeted test for the s3(...) parsing path (or another call path that consumes S3StorageParsedArguments request settings without writeObject re-applying settings), so future precedence regressions in Configuration::fromAST are caught.
The `.sql` test was failing in all flaky check configurations because `SYSTEM RELOAD CONFIG` sends "Address already in use" warnings to the client's stderr, and the test runner flags any stderr as a failure. All other tests using `SYSTEM RELOAD CONFIG` are `.sh` files where stderr can be properly filtered. Convert this test to the same pattern, using `|& grep -v -e 'Address already in use'` to suppress the warnings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s priority Add test `04093_s3_settings_query_override` to cover the `fromAST` code path in `Configuration.cpp`. This verifies that query-level `s3_max_single_part_upload_size` takes priority over the global `<s3>` endpoint configuration when using the `s3()` table function. Also add a matching endpoint entry in `s3_settings_override.xml` for the new test URL. Note: the write path (`writeObject`) independently re-applies query-level settings, so this test provides defense-in-depth coverage for the `fromAST` fix rather than directly demonstrating the bug through writes. The test catches regressions if the re-application in `writeObject` is ever removed or if other code paths (reads, copies) consume these settings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The tests `04070_s3_disk_settings_override` and
`04093_s3_settings_query_override` previously inserted 1 000 000 sequential
`UInt64` values to ensure the resulting upload exceeded the 10 000-byte
multipart threshold. Under the flaky check (debug build with random settings),
the test wall time exceeded the 180-second per-iteration limit:
Reason: Test runs too long (> 180s). Make it faster.
Switch to a `String` column populated with random 100-byte values and 500 rows.
Random data does not compress below the 10 000-byte threshold, so the multipart
upload path is still exercised, while the test now runs an order of magnitude
faster.
CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100975&sha=65632f7709ffbd237a6d3ca5fbfb9d05289e207c&name_0=PR&name_1=Stateless%20tests%20%28amd_debug%2C%20flaky%20check%29
The flaky check runs new tests 100 times with randomized settings. Combined
with multipart upload, the size verification done by `WriteBufferFromS3`
when `s3_check_objects_after_upload = 1` (enabled in the default test user
profile via `enable_blobs_check.xml`) was observed to fail intermittently
against the local S3 mock with errors like:
Code: 499. DB::Exception: Object 04093_s3_settings_query_override.csv
from bucket test has unexpected size 52012 after upload, expected size
52048, it's a bug in S3 or S3 API. (S3_ERROR)
The tests `04070_s3_disk_settings_override` and
`04093_s3_settings_query_override` only need to verify that the multipart
upload code path is taken (via `S3CreateMultipartUpload`,
`S3UploadPart`, `S3CompleteMultipartUpload` profile events) — they do not
exercise upload integrity. Disable `s3_check_objects_after_upload` for
their `INSERT` so the test is robust against unrelated mock flakiness.
Also switch `04093` to a deterministic 100-byte string (`repeat('x', 100)`)
so the upload size is predictable.
CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100975&sha=1c45b088ed983537a2862b0c7ec78ac217220e98&name_0=PR&name_1=Stateless%20tests%20%28amd_asan_ubsan%2C%20flaky%20check%29
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The flaky check ran the test for 235s and reported "Test runs too long". The test exercises `S3ObjectStorage::applyNewSettings` which is triggered only by `SYSTEM RELOAD CONFIG`. Combined with multipart S3 upload, this path exceeds the 180s flaky-check threshold in debug builds when random settings (heavy filesystem cache injection, large `reduce_blocking_parts_sleep_ms`, etc.) inflate per-step latency. The bug under test is about settings priority on config reload, which is independent of those random settings. Other tests that use `SYSTEM RELOAD CONFIG` (e.g. `02944_dynamically_change_filesystem_cache_size`, `02961_storage_config_volume_priority`) follow the same convention. CI report: #100975
The flaky check runs the same test many times concurrently. Each instance issues `SYSTEM RELOAD CONFIG`, which is global server state and is serialized inside the server. With ~80+ parallel instances queued ahead of a given run, a single iteration was observed waiting roughly 2 minutes 45 seconds for `ConfigReloader: Loading config` to start after the query was accepted, blowing the 180s per-test flaky-check budget even though the actual reload work was fast. The other similar stateless tests that issue `SYSTEM RELOAD CONFIG` (`02961_storage_config_volume_priority`, `02944_dynamically_change_filesystem_cache_size`, `02933_change_cache_setting_without_restart`, `03032_dynamically_resize_filesystem_cache_2`) all carry `no-parallel` for the same reason. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100975&sha=571b40d386841412337c7f05c4a313a5c00b07e1&name_0=PR&name_1=Stateless%20tests%20%28amd_debug%2C%20flaky%20check%29 PR: #100975 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LLVM Coverage Report
Changed lines: 100.00% (19/19) · Uncovered code |
|
Hi — this PR may need backporting to Affected code: Why: Both If this should be backported, consider adding |

Fix the priority order for S3 settings to be:
<s3>config section (lowest priority)<storage_configuration>disk settingsTwo bugs were present:
In
S3ObjectStorage::applyNewSettings, endpoint settings from the global<s3>section were applied after disk config settings, overriding them. Swapped the order so disk config takes priority over global<s3>.In
StorageS3configuration (fromAST), endpoint settings from the global<s3>section were applied last without re-applying profile/query settings on top. AddedupdateFromSettingsafter endpoint application so that user/profile/query-level settings retain highest priority.Closes: #50029
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Fix S3 settings priority so that
storage_configurationdisk settings override global<s3>section, and user/profile/query-level settings override both.Documentation entry for user-facing changes