Throw on unknown configuration option in server config#100332
Throw on unknown configuration option in server config#100332alexey-milovidov wants to merge 124 commits into
Conversation
Previously, wrong or misspelled configuration entries in the server config were silently ignored, making it hard to diagnose configuration issues (e.g., `<max_thred_pool_size>` would be silently ignored). Added `ServerSettings::checkUnknownSettings` that validates all top-level config keys against known ServerSettings entries and a comprehensive list of known complex config sections. Throws `UNKNOWN_ELEMENT_IN_CONFIG` if an unrecognized key is found. The check is called on both server startup and config reload, and respects the existing `skip_check_for_incorrect_settings` escape hatch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…gration tests Address review feedback and fix CI failures: - Add `instrumentation_trace_log` to known system log sections (present in default `programs/server/config.xml`, caused Fast Test server startup failure) - Add `allow_experimental_cluster_discovery` to known cluster sections - Add `lemmatizers`, `synonyms_extensions`, `catboost_lib_path`, `path_to_regions_hierarchy_file`, `path_to_regions_names_files` to known dictionaries/functions sections - Add `transaction_log` to known miscellaneous sections Fix integration tests that use custom top-level config keys: - `test_http_handlers_config`: uses arbitrary config key referenced via `config://` in HTTP handler static responses - `test_database_hms`: uses non-standard `experimental_features` section Both tests now set `skip_check_for_incorrect_settings` to bypass the check. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ndling - Add command-line option names (`pid-file`, `config-file`, `log-file`, `errorlog-file`) to the known config sections allowlist. These keys are injected into the config by `argsToConfig` which processes all argv entries, including Poco options that were already handled via bindings. - Dynamically extract top-level keys referenced via `config://` in `http_handlers` `response_content`/`response_expression` fields and whitelist them. This preserves backward compatibility for deployments using custom config-backed static HTTP responses via `StaticRequestHandler`. - Remove the `skip_check_for_incorrect_settings` workaround from the `test_static_handler` integration test config, since the dynamic extraction now properly handles it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ig allowlist
The `pid-file` Poco option has `.binding("pid")`, so Poco stores the
value under the key `pid` in the config — causing `UNKNOWN_ELEMENT_IN_CONFIG`
on server startup.
Also add hyphen-less variants (`pidfile`, `configfile`, `logfile`,
`errorlogfile`) since `argsToConfig` stores raw CLI key names without
normalization.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Poco's `SystemConfiguration` injects a `system` top-level key into the
layered configuration. When `checkUnknownSettings` enumerates all
top-level keys via `config.keys("")`, it sees this Poco-internal key
and rejects it with `UNKNOWN_ELEMENT_IN_CONFIG`, preventing server
startup (as seen in the Fast test CI failure).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a test that verifies the server rejects unknown top-level config keys with `UNKNOWN_ELEMENT_IN_CONFIG`. This ensures future allowlist edits don't silently reintroduce startup exceptions or allow invalid keys through. The existing `test_http_handlers_config/test_static_handler` test already validates that custom top-level keys referenced via `config://` pass validation without `skip_check_for_incorrect_settings`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… allowlist, fix integration test - Add `core_path` to `known_complex_sections` — this key is read by `BaseDaemon` for the core dump directory and is used in stress test configs. Missing it caused all stress tests and install tests to fail with `UNKNOWN_ELEMENT_IN_CONFIG`. - Add `experimental_features` to `known_complex_sections` and remove the `skip_check_for_incorrect_settings` workaround from `test_database_hms/config.xml` (per review feedback). - Fix `test_unknown_config_option` integration test: when the server fails to start, the error message goes to the error log file (via `--errorlog-file`), not to container stdout. Read the error log from the host-mounted logs directory so the assertion can find the `UNKNOWN_ELEMENT_IN_CONFIG` message. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
The failures of "Flaky check" in "functions_bad_arguments" will be fixed by #101994. |
- Add `daemon`, `umask`, `help`, `version` to the known config sections allowlist. These are injected by Poco `ServerApplication` and `argsToConfig` when the server is started with `--daemon --umask` etc. Without this, those startup flows throw `UNKNOWN_ELEMENT_IN_CONFIG`. - Tighten prefix-based matching for dynamic config sections (`graphite_rollup`, `http_handlers`, `users`) to require an exact match or prefix followed by `_` separator. Previously, bare `starts_with` would silently accept typos like `graphite_rollupTypo`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This key is used in the AST fuzzer config (`fuzz-server-settings.xml`) for shared catalog support. Without it, the server fails to start with `UNKNOWN_ELEMENT_IN_CONFIG` during fuzzer CI runs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
The MSan stress test failure (MemorySanitizer: use-of-uninitialized-value, STID 4179-5154 or 4148-3044) is a known pre-existing issue unrelated to this PR. Fix: #102158 |
|
Pushed
New negative integration cases (fail to start with
|
| /// `retention*` block (`startsWith(key, "retention")`), throwing on anything else. It | ||
| /// reads `<retention>` only as `.age`/`.precision`, so its grandchildren are not checked. | ||
| const bool ok = rule_child == "regexp" || rule_child == "function" | ||
| || rule_child == "rule_type" || rule_child.starts_with("retention"); |
There was a problem hiding this comment.
appendGraphitePattern rejects more than unknown child names: a rule must contain at least one function or retention*, rule_type must parse to a known value, and a default rule must keep rule_type as all. As written, a typo section like <some_typo><pattern><regexp>.*</regexp></pattern></some_typo> passes looks_like_graphite_rollup and is exempted from the new unknown-key check, even though the real GraphiteMergeTree parser would throw if the section were used. Please mirror those semantic checks (or delegate the shape validation to the existing parser) and add a negative test for an allowed-key-but-invalid rule.
There was a problem hiding this comment.
Fixed in 1b10585feab. Tightened looks_like_graphite_rollup (renamed to graphite_rule_valid, now taking an is_default flag) to mirror the rest of appendGraphitePattern, not just the child names:
- a
<pattern>/<default>rule must carry at least one<function>or<retention*>(otherwise the parser throwsNO_ELEMENTS_IN_CONFIG), so<some_typo><pattern><regexp>.*</regexp></pattern></some_typo>is now rejected; - every
<rule_type>value must parse toall/plain/tagged/tag_list(otherwiseruleTypethrowsBAD_ARGUMENTS); - the
<default>rule keepsrule_typeasall(otherwiseBAD_ARGUMENTS: "Default must have rule_type all").
The tightening only ever rejects sections the parser also rejects, so it never turns a valid rollup config into a startup failure (verified against the existing positive graphite_arbitrary_name.xml/graphite_columns_only.xml). Added three negative tests — graphite_rule_no_aggregation.xml, graphite_default_rule_type_not_all.xml, graphite_invalid_rule_type.xml — one per added check.
There was a problem hiding this comment.
This is still not fully aligned with appendGraphitePattern. The current code at src/Core/ServerSettings.cpp:2737 treats any <function> child as valid and only flips has_function_or_retention. The real parser resolves that value with getAggregateFunctionNameAndParametersArray and AggregateFunctionFactory, then also rejects aggregate functions that allocate memory in an arena.
So a typo section such as <some_typo><pattern><function>not_a_function</function></pattern></some_typo> is still exempted as a GraphiteMergeTree section even though appendGraphitePattern would throw if the section were used, masking an unknown top-level key. Please validate the function value or delegate the shape check to the actual parser, and add a negative case with an invalid aggregate function.
There was a problem hiding this comment.
This is a real false-negative, but fully mirroring appendGraphitePattern's function handling is not possible here: checkUnknownSettings is a static method with no ContextPtr, whereas the parser resolves the value with getAggregateFunctionNameAndParametersArray(..., context) and then constructs the function via AggregateFunctionFactory::instance().get(name, action, {Float64}, params_row, properties) before rejecting one that allocatesMemoryInArena.
Reproducing only the name check (e.g. isAggregateFunctionName on a split-off name) would still miss the arena case and, worse, risks a new false positive — rejecting a valid <function>quantile(0.5)</function> if the name/params are not split exactly as the parser does — which turns a recognition miss into a server-won't-start regression (unverifiable in this change without a full integration run).
Since this is best-effort recognition of an arbitrary-named section — the <skip_check_for_incorrect_settings> escape hatch remains, and a genuinely bogus function is still caught by appendGraphitePattern the moment a GraphiteMergeTree table references the section — leaving this narrow false-negative in place is the fail-open choice. Leaving this thread open as a design decision (accept the best-effort limitation as-is, or thread a ContextPtr into checkUnknownSettings so the graphite recognition can delegate to the real parser).
…ig_option`
The tests `test_protocols_missing_handler_prefix_not_exempted` and
`test_config_ref_missing_path_does_not_exempt_unknown_key` reference
`configs/config.d/protocols_missing_handler_prefix.xml` and
`configs/config.d/static_handler_config_ref_missing_path.xml`, but those files
were never committed. `ClickHouseInstance` copies `main_configs` with
`shutil.copy` in `create_dir` *before* `self.logs_dir` is assigned, so the
missing files raised `FileNotFoundError` at cluster start, and the fixtures'
`except` blocks then failed with `AttributeError: 'ClickHouseInstance' object
has no attribute 'logs_dir'` instead of reaching the
`UNKNOWN_ELEMENT_IN_CONFIG` assertions.
Add both configs:
- `protocols_missing_handler_prefix.xml`: a `<protocols>` HTTP endpoint whose
`<handlers>custom_missing.handlers</handlers>` names a prefix that is absent
(the `<custom_missing>` section has a mistyped `<handlerss>` child), so
`config.has("custom_missing.handlers")` is false and the top-level
`<custom_missing>` must be rejected as a genuine unknown key.
- `static_handler_config_ref_missing_path.xml`: a `static` handler that
references `config://missing_payload.suffix` whose path does not exist
(`<missing_payload>` has no `<suffix>` child), so the top-level
`<missing_payload>` must be rejected.
CI report: #100332
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dGraphitePattern` `looks_like_graphite_rollup` exempted arbitrary-named rollup sections from the unknown-config-key check by their structure, but it only validated the child *names* of each `<pattern>`/`<default>` rule. `appendGraphitePattern` enforces more: a rule must carry at least one `<function>` or `<retention*>` block (`NO_ELEMENTS_IN_CONFIG`), every `<rule_type>` value must parse to a known type (`all`/`plain`/`tagged`/`tag_list`, otherwise `BAD_ARGUMENTS`), and the `<default>` rule's `rule_type` must be `all`. A typo'd section such as `<some_typo><pattern><regexp>.*</regexp></pattern></some_typo>` therefore passed the exemption even though the real parser would reject it if the section were used — a false negative that let an unknown top-level key slip through. Mirror those semantic checks in the renamed `graphite_rule_valid` helper (now taking an `is_default` flag). The tightening only ever rejects sections the parser also rejects, so it never turns a valid rollup config into a startup failure. Add three negative tests, one per added check: - `graphite_rule_no_aggregation.xml`: a `<pattern>` with only `<regexp>`. - `graphite_default_rule_type_not_all.xml`: a `<default>` with `rule_type` not `all`. - `graphite_invalid_rule_type.xml`: a `<pattern>` with an unrecognized `rule_type`. CI report: #100332 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Pushed
The remaining red, Did not re-merge |
…on` in the config-key exemption The unknown-config-key check exempts arbitrary-named `GraphiteMergeTree` rollup sections by recognizing their shape via `graphite_rule_valid`, which is meant to mirror `appendGraphitePattern` exactly. It only checked that a rollup rule's child was named `retention*`, but `appendGraphitePattern` materializes each `retention*` block by reading both `age` and `precision` with `getUInt`. A block missing either field (or with a non-numeric value) makes the parser throw, so a typo'd section such as `<some_typo><pattern><retention><precision>60</precision></retention></pattern></some_typo>` was wrongly exempted from the unknown-key check even though it could never load. Mirror the retention validation: require both `age` and `precision` and that they parse as unsigned integers, otherwise the section is not a valid rollup and must be rejected as an unknown top-level key. Added a negative integration test (`graphite_retention_missing_age.xml` / `test_graphite_retention_missing_age_rejected`). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Pushed
Triage of the CI reds on the previous commit
|
| /// solely to resolve `incl` references — so exempting its tags would let a genuinely | ||
| /// unknown top-level key pass merely because the lookup table happens to define a tag | ||
| /// of the same name, masking exactly the typo/misplaced-section class this check catches. | ||
| if (!merge_files.contains(to_canonical(include_from_path))) |
There was a problem hiding this comment.
This gate is correct for a plain lookup-table incl, but it is too broad for an external source consumed by a top-level <include>. ConfigProcessor::doIncludesRecursive treats an element named include specially: it inserts the children of the referenced node into the parent. For example, with <include_from>/etc/metrika.xml</include_from> and a top-level <include incl="extra_root"/>, an external source containing <extra_root><my_custom_section>...</my_custom_section></extra_root> produces a real top-level <my_custom_section> in the loaded server config. Since /etc/metrika.xml is not in merge_files, this branch skips the source, my_custom_section is never marked referenced, and startup now throws UNKNOWN_ELEMENT_IN_CONFIG for a configuration that was valid before. Please handle top-level include substitutions from external include_from sources and exempt only the child names they actually import, while keeping the negative lookup-table case covered.
There was a problem hiding this comment.
Good catch — confirmed and fixed in d6b625a8ae1.
checkUnknownSettings treated every external <include_from> source as a pure incl lookup table and skipped it entirely, so a top-level <include incl="X"/> that imports <X>'s children into the root (via ConfigProcessor::doIncludesRecursive) had those imported top-level keys rejected with UNKNOWN_ELEMENT_IN_CONFIG — a regression for a previously valid substitution.
The fix collects top-level <include incl="X"/> reference names from the server config only (the main config and merged config.d/conf.d fragments, never the users config, whose <include> targets a separate tree), resolves <X> in each <include_from> source exactly as the processor does (getRootNode(include_from)->getNodeByPath(name)), and exempts only the imported children — nothing else, so an unrelated typo elsewhere stays rejected. The negative lookup-table case (a plain incl attribute on a non-include element) is unchanged and still rejected, covered by test_external_include_from_source_does_not_exempt_unknown_key.
Added the positive regression test test_top_level_include_from_external_source_accepted.
There was a problem hiding this comment.
This is still not fully fixed for recursive top-level includes. The current code at src/Core/ServerSettings.cpp:2672 only inserts the immediate children of the referenced include_from node. But ConfigProcessor::doIncludesRecursive recurses after expanding an <include>, so a valid source like <root><include incl="nested"/></root> can import <nested>'s children into the server root. The merged config then contains those child top-level keys, while referenced_top_level_keys only contains include, so the validator still rejects a previously valid config with UNKNOWN_ELEMENT_IN_CONFIG.
Please recurse through imported top-level <include> nodes, or reuse the already processed config/contributing include information, when collecting the keys imported by top-level <include incl="..."/>.
|
@alexey-milovidov Both already have open fixes; linking them: 1. Upgrade-check Fix: #108560 (open, you approved 2026-06-28; currently only 2. Fix: #107438 (open, approved by alesapin 2026-06-19). Same keeper-client teardown flake as #107438 catches Both are open and ready for merge; no new fix needed. |
…nal `include_from` sources `checkUnknownSettings` treated every external `<include_from>` source as a pure `incl` lookup table that contributes no top-level key, so it skipped such sources entirely. But `ConfigProcessor::doIncludesRecursive` expands a top-level `<include incl="X"/>` element by inserting the *children* of node `X` (resolved from the `<include_from>` source) into the root, so those child tags become real top-level keys of the merged config. A previously valid configuration using a top-level `<include>` against an external source (e.g. `/etc/metrika.xml`) therefore started failing with `UNKNOWN_ELEMENT_IN_CONFIG`. Collect top-level `<include incl="X"/>` reference names from the server config (the main config and merged `config.d`/`conf.d` fragments only, never the users config, whose `<include>` targets a separate tree), resolve `<X>` in each `<include_from>` source exactly as the processor does (`getRootNode(include_from)->getNodeByPath(name)`), and exempt only the imported children — nothing else, so an unrelated typo stays rejected. The negative lookup-table case (a plain `incl` attribute on a non-`include` element) is unchanged and still rejected. Add positive integration test `test_top_level_include_from_external_source_accepted`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Pushed Fix — top-level
|
…v/from_zk and the `/etc/metrika.xml` default `ServerSettings::checkUnknownSettings` only mirrored the `incl` attribute of a top-level `<include>` and only picked up explicit `<include_from>` sources. Two previously valid configurations were therefore wrongly rejected with `UNKNOWN_ELEMENT_IN_CONFIG` (AI-Review "Request changes" on `ec66067e`): - A top-level `<include from_env="X"/>` or `<include from_zk="X"/>` is expanded by `doIncludesRecursive` by importing the *children* of the referenced node as top-level keys, but only `incl` was handled. `from_env` is now resolved deterministically (the environment value is wrapped as `<from_env>...</from_env>` and its children are exempted, exactly as `ConfigProcessor` does). `from_zk` cannot be resolved without a ZooKeeper connection at config-validation time, so the check is skipped (fail open, with a warning) when such a top-level element is present, rather than falsely rejecting the imported keys. - When the server config declares no `<include_from>`, `ConfigProcessor` falls back to the default substitution source `/etc/metrika.xml`. A top-level `<include incl="X"/>` then imports `<X>`'s children from it. The validator now seeds the same default so those imported keys are exempted. Added integration regression tests `test_top_level_include_from_env_accepted`, `test_top_level_include_from_zk_accepted`, and `test_metrika_default_include_source_top_level_include_accepted`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Pushed 1. Top-level
|
…n config section
The `Fast test` job's `Start ClickHouse Server` step failed with
`Code: 137. DB::Exception: Unknown element
'user_defined_executable_function_drivers_config' found in config`
(`UNKNOWN_ELEMENT_IN_CONFIG`), so the server did not start.
That top-level key was added to the default `programs/server/config.xml`
in `b767a27c206` ("Load executable UDF drivers during server startup").
It is read directly from the config by
`Context::loadUserDefinedExecutableFunctionDrivers` via
`getMultipleValuesFromConfig`, not declared in `ServerSettings`, so the
new unknown-config-option check did not recognize it and rejected a
valid config.
Add it to `known_complex_sections`, next to its sibling
`user_defined_executable_functions_config`. The two accompanying
`ServerSettings` (`allow_experimental_executable_udf_drivers` and
`dynamic_user_defined_executable_functions_path`) are `DECLARE`d
settings and are already covered by `known_keys`.
Also add the key to the `existing_keys.xml` regression fixture, whose
purpose is to guarantee that top-level keys consumed outside
`ServerSettings` keep the server starting.
CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100332&sha=fbf709911c71a573640c84af8aaa6e071bfd370e&name_0=PR&name_1=Fast%20test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nknown-key check `ServerSettings::checkUnknownSettings` recognizes arbitrary-named `GraphiteMergeTree` rollup sections by their shape via the `looks_like_graphite_rollup` helper, which returned `false` for a section with zero children — so such a section was rejected with `UNKNOWN_ELEMENT_IN_CONFIG` and the server did not start. But `setGraphitePatternsFromConfig` only throws `NO_ELEMENTS_IN_CONFIG` when the section is *missing*: a present-but-empty section (e.g. `<retention_cfg/>`) is accepted, keeping the default column names and an empty pattern list. So the new check rejected a previously valid config — a regression. Accept the empty case, mirroring the parser exactly. An empty section takes no effect, so exempting it does not weaken typo detection in any meaningful way. Added the integration regression test `test_graphite_empty_section_accepted` (a present-but-empty `<retention_empty/>` section; the node must start). Addresses the AI-Review finding at `src/Core/ServerSettings.cpp:2794`. #100332 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Pushed Fixed — empty arbitrary-named Graphite section (
|
| bool has_function_or_retention = false; | ||
| for (const auto & rule_child : rule_children) | ||
| { | ||
| const bool ok = rule_child == "regexp" || rule_child == "function" |
There was a problem hiding this comment.
appendGraphitePattern also validates the regexp payload semantically: after tag_list rewriting it constructs OptimizedRegularExpression and throws CANNOT_COMPILE_REGEXP on malformed patterns. graphite_rule_valid currently accepts any <regexp> child by name alone, so a typo section like <some_typo><pattern><regexp>(</regexp><function>any</function></pattern></some_typo> is still exempted from the new unknown-key check even though the real GraphiteMergeTree parser would reject it.
Please compile the regex here (mirroring the tag_list path too) or delegate this rule check to appendGraphitePattern, and add a negative case for an invalid Graphite regexp.
There was a problem hiding this comment.
This is the same best-effort-recognition limitation as the <function> thread above, and it can't be cleanly delegated to the parser either.
In appendGraphitePattern the regexp is compiled at the very end, after the key loop, and for a tag_list rule the raw <regexp> value is first rewritten by buildTaggedRegex and only then passed to OptimizedRegularExpression. Mirroring that here would mean either:
- calling
appendGraphitePattern— but the same loop resolves<function>viagetAggregateFunctionNameAndParametersArray(..., context)andAggregateFunctionFactory, so it needs theContextPtrthisstaticmethod doesn't have (the blocker from the<function>thread); or - reproducing the compile path standalone — a second copy of
buildTaggedRegex(which itself indexestags[0]with no empty-guard) plus theOptimizedRegularExpressionconstruction, inside astaticconfig check.
The standalone copy's dangerous failure mode is a false positive: for a tag_list rule the raw value is not what the parser actually compiles (the buildTaggedRegex output is), so any divergence rejects a valid Graphite section and the server fails to start. Today such an arbitrarily-named section (with no table referencing it yet) lets the server boot and only fails at table creation with CANNOT_COMPILE_REGEXP; converting that into a startup rejection with UNKNOWN_ELEMENT_IN_CONFIG is a compatibility change I don't want to ship without an integration run proving no valid config is newly rejected.
So this stays best-effort recognition — the <skip_check_for_incorrect_settings> escape hatch remains, and a genuinely malformed regexp is still caught by appendGraphitePattern when a GraphiteMergeTree table references the section. Leaving it open for the same call as the <function> thread: accept the limitation, or thread a ContextPtr through so the recognition can delegate to the real parser (which would then cover the function resolution, the regexp compile, and the tag_list rewrite in one place).
|
Status on
|
The previous commit `76bacb45a5e` changed `looks_like_graphite_rollup` to return `true` for a childless section, so that a present-but-empty `GraphiteMergeTree` rollup section (e.g. `<retention_cfg/>`) keeps starting. But `Poco::AbstractConfiguration::keys` reports only child *elements*, so a scalar leaf key such as `<max_thred_pool_size>16</max_thred_pool_size>` (a typo'd setting — precisely what this check exists to catch) also has zero children and was therefore exempted from `UNKNOWN_ELEMENT_IN_CONFIG`. This regressed the core of the feature: every unknown top-level key that carries a scalar value silently passed, so all the negative `test_unknown_config_option` cases (`test_unknown_config_option_rejected`, `test_users_prefix_typo_rejected`, the `..._not_exempted` / reload cases, ...) failed uniformly across every integration-test shard. A `GraphiteMergeTree` section is always a container and never carries a scalar value of its own, so distinguish the two: exempt a childless section only when its (whitespace-trimmed) value is empty — a genuine empty container — and let a childless section with a non-empty value fall through to be rejected. This keeps `test_graphite_empty_section_accepted` (`<retention_empty/>`) passing while restoring rejection of valued leaf typos. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100332&sha=76bacb45a5ea5b683284560152c406c674e9706d Related: #100332 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Pushed Root cause
if (children.empty())
- return false;
+ return true;The intent was to accept a present-but-empty FixA if (children.empty())
return Poco::trim(config.getString(section, "")).empty();Verified against every failing negative case (all inject valued leaf keys: This is orthogonal to the two open |
|
Status on
|
Addresses the two remaining (non-dismissed) findings of the AI review on this
PR, both false negatives in `ServerSettings::checkUnknownSettings` that let a
genuinely unknown top-level key escape validation:
1. `GraphiteMergeTree` retention ordering. `graphite_rule_valid` validated that
each `retention*` block has an unsigned `age`/`precision`, but did not mirror
`appendGraphitePattern`'s final `::sort(..., compareRetentions)`, which throws
`BAD_ARGUMENTS` ("Age and precision should only grow up") unless every compared
pair's `age` and `precision` grow together. An arbitrary top-level section whose
retentions are not co-monotonic therefore passed `looks_like_graphite_rollup`
even though the real parser rejects the rule. Fix: collect the `(age, precision)`
pairs in config order and replicate that exact `::sort` (same comparator logic,
same input), treating a throw as "not a valid rollup rule". Because a valid
(co-monotonic) rule never triggers a throwing comparison regardless of the sort's
internal order, this only rejects sections the parser also rejects — it never
turns a valid rollup config into a startup failure. Unlike the deliberately
deferred `<function>`/`<regexp>` checks, this needs no `ContextPtr`.
2. `include_from` source separation. Top-level server `<include incl="X"/>` refs
were resolved against every discovered `include_from` source, including the
active users config's source. `ConfigProcessor` resolves a server `<include>`
only against the server config's own `<include_from>`, never the users config's,
so a users source that happens to define `<X>` could whitelist a top-level server
key the server config never imports. Fix: track the server-side `include_from`
sources (server-config files, the merged server config's `include_from`, and the
default `/etc/metrika.xml`) separately and resolve top-level include refs only
against them; a users-config source is still consulted for the merged-fragment
exemption but never for include-ref resolution.
Adds two focused negative integration tests in `test_unknown_config_option`:
`test_graphite_retention_bad_ordering_rejected` (two `retention*` blocks whose
`age` ascends while `precision` descends) and
`test_users_config_include_from_source_does_not_exempt_server_key` (server include
ref plus a users-config `include_from` source defining the referenced node).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| /// scalar value of its own, so a childless section with a non-empty value is not a rollup | ||
| /// section: exempt the empty container, but let the valued leaf fall through and be rejected. | ||
| if (children.empty()) | ||
| return Poco::trim(config.getString(section, "")).empty(); |
There was a problem hiding this comment.
GraphiteMergeTree compatibility case it is trying to preserve. Any unknown top-level tag with no child elements now reaches this branch, so a config like <my_typo/> or <my_unknown></my_unknown> is silently accepted as “Graphite-like” just because config.getString(section, "") trims to empty.
That breaks the PR’s core contract for one of the simplest typo shapes: empty unknown options are ignored again instead of throwing UNKNOWN_ELEMENT_IN_CONFIG. Unless we have some independent signal that the section is actually referenced by GraphiteMergeTree, this branch needs to reject the empty case rather than treating every childless empty tag as a valid rollup section.
LLVM Coverage ReportChanged lines: Changed C/C++ lines covered: 493/515 (95.73%) · Uncovered code |

Previously, wrong or misspelled configuration entries in the server config were silently ignored, making it hard to diagnose configuration issues (e.g.,
<max_thred_pool_size>would be silently ignored instead of alerting the user).Added
ServerSettings::checkUnknownSettingsthat validates all top-level config keys against knownServerSettingsentries and a comprehensive list of known complex config sections. ThrowsUNKNOWN_ELEMENT_IN_CONFIGif an unrecognized key is found.The check runs on both server startup and config reload. It respects the existing
<skip_check_for_incorrect_settings>escape hatch.Known keys come from three sources:
ServerSettingsscalar setting names (settings without paths)ServerSettingspath-based settings (e.g.,logger,openSSL,distributed_ddl)graphite_rollup_*,http_handlers_*)Sections referenced from elsewhere in the config (
inclsubstitutions,config://references in HTTP handlers, custom protocol handler sections, and top-level tags ofinclude_fromsource files kept underconfig.d/) are exempt from the check.Known limitation (ZooKeeper-backed substitutions): resolving a
from_zksubstitution would require a ZooKeeper connection at config-validation time, before the<zookeeper>section itself has been validated, so:<include_from>is specified with afrom_zksubstitution inside a users-config fragment and its source file is kept underconfig.d/, the source's top-level tags are not exempted. Such setups can either move the substitution source out ofconfig.d/or set<skip_check_for_incorrect_settings>.<include from_zk=.../>(which imports the referenced ZooKeeper node's children as top-level keys), the whole check is skipped for that config, with a warning, so a valid config is never rejected — but a genuine typo alongside such an include is not caught.Literal paths,
from_env(including a top-level<include from_env=.../>), the default/etc/metrika.xmlsubstitution source, andinclin the main server config are all covered.Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
The server now throws an exception on unknown configuration options in the server config, helping catch typos and misconfigured options early. The check can be disabled with
<skip_check_for_incorrect_settings>.Documentation entry for user-facing changes
🤖 Generated with Claude Code