Supports automatically fill on cluster value by JingYanchao · Pull Request #94138 · ClickHouse/ClickHouse · GitHub
Skip to content

Supports automatically fill on cluster value#94138

Open
JingYanchao wants to merge 43 commits into
ClickHouse:masterfrom
JingYanchao:johnjing/supports-rewrite-on-cluster
Open

Supports automatically fill on cluster value#94138
JingYanchao wants to merge 43 commits into
ClickHouse:masterfrom
JingYanchao:johnjing/supports-rewrite-on-cluster

Conversation

@JingYanchao

@JingYanchao JingYanchao commented Jan 14, 2026

Copy link
Copy Markdown
Contributor

Changelog category (leave one):

  • Experimental Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Add two new settings allow_experimental_automatic_fill_on_cluster_mode & cluster_for_automatic_fill_mode to support filling ON CLUSTER automatically. This can simplify maintenance when we always only have one cluster.


Note

Medium Risk
Touches core query execution to mutate the parsed AST and inject ON CLUSTER for eligible statements, which could change DDL routing/behavior when enabled. Risk is mitigated by being opt-in via new experimental settings and by skipping Replicated databases and non-initial queries.

Overview
Adds an experimental mode to automatically fill a missing ON CLUSTER clause from a configured default cluster.

Introduces new settings allow_experimental_automatic_fill_on_cluster_mode and cluster_for_automatic_fill_mode, updates executeQuery to inject the cluster into ASTQueryWithOnCluster when enabled (but skips queries targeting Replicated databases), and adds stateless tests to verify both the injection behavior and the Replicated-database exclusion.

Reviewed by Cursor Bugbot for commit 9829571. Bugbot is set up for automated code reviews on this repo. Configure here.

@CLAassistant

CLAassistant commented Jan 14, 2026

Copy link
Copy Markdown

Comment thread src/Core/Settings.cpp Outdated
Comment thread src/Interpreters/executeQuery.cpp
@alexey-milovidov

Copy link
Copy Markdown
Member

This needs more thought.

  1. The behavior will conflict with the Replicated database.
  2. The names are quite convoluted, we need something better, but at the same time, consistent with cluster_for_parallel_replicas.

@JingYanchao

Copy link
Copy Markdown
Contributor Author

2. convoluted

Thanks a lot, I will further optimize these two points later. Then I will add a test case

@JingYanchao JingYanchao force-pushed the johnjing/supports-rewrite-on-cluster branch from aefd434 to 0d9467d Compare January 19, 2026 07:46
@JingYanchao JingYanchao changed the title Supports automatically add on cluster value Supports automatically fill on cluster value Jan 20, 2026
@GrigoryPervakov GrigoryPervakov added the can be tested Allows running workflows for external contributors label Jan 21, 2026
@clickhouse-gh

clickhouse-gh Bot commented Jan 21, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [6548354]

Summary:

job_name test_name status info comment
Fast test FAIL
03789_automatic_fill_on_cluster_mode FAIL cidb
Fast test (arm_darwin) DROPPED
Build (amd_debug) DROPPED
Build (amd_asan_ubsan) DROPPED
Build (amd_tsan) DROPPED
Build (amd_msan) DROPPED
Build (amd_binary) DROPPED
Build (arm_debug) DROPPED
Build (arm_asan_ubsan) DROPPED
Build (arm_tsan) DROPPED

AI Review

Summary

This PR adds an experimental setting pair that rewrites eligible initial queries with a missing ON CLUSTER to a configured default cluster. The implementation now covers many earlier edge cases, and the current Praktika report shows no failed jobs, but there are still unresolved classification gaps where auto-fill can either leave eligible statements local or distribute access-storage operations that are not safe to run on every host.

Findings

⚠️ Majors

  • [src/Interpreters/executeQuery.cpp:1524] ASTMoveAccessEntityQuery is not classified before auto-fill. With replicated access storage plus a local storage, MOVE USER u TO local_directory can get ON CLUSTER injected, so each DDL worker runs moveAccessEntities; one worker can remove the shared replicated entity before the others resolve it, or only one host can get the local copy. Suggested fix: skip MOVE when replicated access storage is involved, or resolve source and destination storages and auto-fill only when both sides are safe to move independently on every host.
  • [src/Interpreters/executeQuery.cpp:1529] [dismissed by author -- https://github.com/Supports automatically fill on cluster value #94138#discussion_r3479141991] The access-storage predicate still checks whether any replicated access storage exists, not whether the explicit target storage is replicated. CREATE USER u IN local_directory / DROP USER u FROM local_directory therefore stay local even though explicit ON CLUSTER is valid and needed to keep the local storage consistent on every host. Suggested fix: make the access predicate target-storage aware and align removeOnClusterClauseIfNeeded with the same rule.
  • [src/Interpreters/executeQuery.cpp:1668] [dismissed by author -- https://github.com/Supports automatically fill on cluster value #94138#discussion_r3411074214] The Replicated-database table-level ALTER skip is still based only on the database engine. For operations that DatabaseReplicated::shouldReplicateQuery explicitly does not replicate, such as DROP PARTITION, DROP DETACHED PARTITION, FREEZE / UNFREEZE, and UNLOCK SNAPSHOT, supported ON CLUSTER operations can be silently kept local, especially for non-replicated tables inside a Replicated database. Suggested fix: make the skip operation-aware; keep unsupported ATTACH / FETCH local, but either route supported non-database-replicated alters or document and test the narrower contract.
Tests
  • ⚠️ Add integration coverage with both replicated and local access storage for explicit-storage access DDL and MOVE so the storage-selection contract is proven.
  • ⚠️ Add a focused Replicated-database case for supported, non-database-replicated table-level ALTERs, or an explicit test documenting that auto-fill intentionally leaves them local.
Final Verdict

Status: ⚠️ Request changes

Minimum required actions: fix the new MOVE access-entity gap, and either fix or explicitly settle the two remaining semantic gaps around target-aware access storage and operation-aware Replicated-database ALTER handling.

@clickhouse-gh clickhouse-gh Bot added the pr-feature Pull request with new product feature label Jan 21, 2026
@JingYanchao JingYanchao force-pushed the johnjing/supports-rewrite-on-cluster branch from 232bf45 to 89c6244 Compare January 22, 2026 07:08
@JingYanchao JingYanchao force-pushed the johnjing/supports-rewrite-on-cluster branch from c9cfebb to 1746eae Compare February 27, 2026 02:57
@JingYanchao JingYanchao force-pushed the johnjing/supports-rewrite-on-cluster branch from 1746eae to d77ffbd Compare March 17, 2026 02:42
Comment thread src/Interpreters/executeQuery.cpp Outdated
Comment thread src/Core/Settings.cpp Outdated
@RinChanNOWWW

RinChanNOWWW commented Mar 31, 2026

Copy link
Copy Markdown
Contributor
  • The behavior will conflict with the Replicated database.

Maybe we can use ignore_on_cluster_for_replicated_database added by #92872 to solve this? @alexey-milovidov

optimize code

optimize name

rename test file

fix check style bug

fix test case

fix test case

# Conflicts:
#	src/Interpreters/executeQuery.cpp
@JingYanchao JingYanchao force-pushed the johnjing/supports-rewrite-on-cluster branch from d77ffbd to c1629fd Compare April 8, 2026 12:55
Comment thread tests/queries/0_stateless/03789_automatic_fill_on_cluster_mode.sql Outdated
Comment thread src/Interpreters/executeQuery.cpp Outdated
Comment thread src/Core/Settings.cpp
@clickhouse-gh clickhouse-gh Bot added pr-experimental Experimental Feature and removed pr-feature Pull request with new product feature labels Apr 9, 2026
@JingYanchao

Copy link
Copy Markdown
Contributor Author

This needs more thought.

  1. The behavior will conflict with the Replicated database.
  2. The names are quite convoluted, we need something better, but at the same time, consistent with cluster_for_parallel_replicas.

This settings has now been marked as experimental, and a Context flag has been added to maybeRemoveOnCluster to resolve the conflict with ReplicatedDatabase. Could you please review it again? @alexey-milovidov

Comment thread src/Interpreters/executeQuery.cpp Outdated
Comment thread src/Interpreters/executeQuery.cpp
alexey-milovidov and others added 2 commits June 15, 2026 04:11
…cases

Address the remaining review threads about the auto-fill blocklist. Each case
either turned a valid local query into an exception, or skipped a statement that
ON CLUSTER actually supports:

- UDF / workload / resource DDL: these reject `ON CLUSTER` only when their
  storage `isReplicated()`. With the default, non-replicated storage `ON CLUSTER`
  is valid and distributes the statement, so auto-fill must fill it. Gate the
  skip on `getUserDefinedSQLObjectsStorage().isReplicated()` /
  `getWorkloadEntityStoragePtr()->isReplicated()` instead of skipping these
  families unconditionally.

- Unsupported `ALTER` commands: `executeDDLQueryOnCluster` rejects
  `ATTACH PARTITION` / `FETCH PARTITION` with `Unsupported type of ALTER query`.
  Filling the cluster for such an `ALTER` would make a valid local statement
  throw, so skip auto-fill when any command is not supported `ON CLUSTER`
  (reusing `isSupportedAlterTypeForOnClusterDDLQuery`).

- Database-level `ALTER` / `RENAME`: `ALTER DATABASE ... MODIFY COMMENT/SETTING`
  and `RENAME DATABASE` are not coordinated by the `Replicated` database engine
  (`InterpreterAlterQuery::executeToDatabase` / `InterpreterRenameQuery::executeToDatabase`
  apply them on the initiator only). Like `DROP DATABASE`, they need `ON CLUSTER`
  to reach every node, so a `Replicated` target database must no longer be a
  reason to skip auto-fill for them. Table-level statements still stay local.

Extend `03789_automatic_fill_on_cluster_mode` (functions now rewritten
ON CLUSTER; `ATTACH PARTITION` stays local) and
`03789_automatic_fill_on_cluster_replicated_database` (`ALTER DATABASE` /
`RENAME DATABASE` are rewritten ON CLUSTER).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Interpreters/executeQuery.cpp
Comment thread src/Interpreters/executeQuery.cpp
bool is_database_level_alter
= alter_query && alter_query->alter_object == ASTAlterQuery::AlterObjectType::DATABASE;
if (!is_database_level_alter)
target_is_replicated_database = is_replicated_database(query_with_table->getDatabase());

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Replicated-database skip is too broad for table-level ALTER commands that the database engine explicitly does not replicate.

Concrete trace: ALTER TABLE repl_db.t DROP PARTITION ... reaches this branch, is_replicated_database returns true, and auto-fill leaves cluster empty. But DatabaseReplicated::shouldReplicateQuery returns false for DROP PARTITION/DROP DETACHED PARTITION, FREEZE/UNFREEZE, and UNLOCK SNAPSHOT, while executeDDLQueryOnCluster only rejects ATTACH PARTITION and FETCH PARTITION. So a valid ON CLUSTER operation is silently kept local instead of being routed to the configured cluster.

Please base this skip on the specific operation being database-replicated, not just on the database engine name. The existing ATTACH/FETCH guard can still keep unsupported ON CLUSTER alters local.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not changed yet — this one looks like a genuine semantic call rather than a mechanical fix, so flagging it for @alexey-milovidov.

DatabaseReplicated::shouldReplicateQuery returns false for DROP PARTITION / DROP DETACHED PARTITION / FREEZE/UNFREEZE / UNLOCK SNAPSHOT precisely because, for a ReplicatedMergeTree table, those operations are replicated by the table's own machinery (its replication log), not by the database engine. So the current "keep local" behavior is not a silent loss: the partition drop still reaches the other replicas of the table. Routing such an ALTER ON CLUSTER would push it to every node of the configured cluster in addition to the table's own replication, which for a Replicated database whose replica set does not match the cluster topology can double-apply or target nodes that do not host the table.

There is also a practical obstacle to basing the skip on shouldReplicateQuery from executeQuery (before interpretation): it resolves the storage via getTable (throws for IF EXISTS / not-yet-existing targets) and depends on runtime state (is_replicated_database_internal, has_many_shards, KeeperMap checks) that is not necessarily settled at the auto-fill point.

So whether to route these non-database-replicated table ALTERs ON CLUSTER is a behavioral decision (and the conservative current behavior may be the correct one for ReplicatedMergeTree). Leaving the thread open for your call; the other two blockers from this review round are addressed in the two commits just pushed.

alexey-milovidov and others added 2 commits June 17, 2026 06:45
The previous code skipped auto-fill for the whole RENAME on the first
`Replicated`-database element, silently leaving ordinary elements local. For a
mixed statement such as
`RENAME TABLE ordinary_db.o TO ordinary_db.o2, repl_db.r TO repl_db.r2` this
leaves `cluster` empty: `InterpreterRenameQuery` then applies the ordinary
rename on the initiator and throws on the unsupported multi-table `Replicated`
rename, leaving the ordinary table renamed locally.

Mirror the multi-table `DROP` guard: classify every rename element and reject a
genuinely mixed statement (a `Replicated`-database element alongside an ordinary
one) with `BAD_ARGUMENTS`, since a single `cluster` field cannot express both
intents. Pure-ordinary renames are still auto-filled and pure-replicated renames
are still skipped. The rejection happens before any rename is applied, so no
ordinary target is changed locally. Added `Test 9` to
`03789_automatic_fill_on_cluster_replicated_database` covering this.

Addresses the AI review thread on `src/Interpreters/executeQuery.cpp`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed collections

Access-control queries (`CREATE USER`/`ROLE`/`QUOTA`/`ROW POLICY`/`SETTINGS
PROFILE`, `DROP` of access entities, `GRANT`) and named-collection queries
(`CREATE`/`ALTER`/`DROP NAMED COLLECTION`) accept `ON CLUSTER` only when their
storage is not replicated, exactly like the user-defined functions, workloads and
resources already handled here. With replicated access storage a local
`CREATE USER u` succeeds and replicates through `ReplicatedAccessStorage`, but
auto-fill turned it into `CREATE USER u ON CLUSTER ...`, which the interpreter
rejects unless `ignore_on_cluster_for_replicated_access_entities_queries` is
enabled (and which would just be stripped back when it is). Replicated named
collections have the same shape with
`ignore_on_cluster_for_replicated_named_collections_queries`.

Decide per storage using the same predicates as `removeOnClusterClauseIfNeeded`:
skip auto-fill when the access storage contains a `ReplicatedAccessStorage`, or
when the named-collection factory uses replicated storage. Non-replicated storage
keeps `ON CLUSTER` valid and eligible for auto-fill.

Addresses the AI review thread on `src/Interpreters/executeQuery.cpp`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Picked this up to address the latest AI review round (the three new threads on executeQuery.cpp). Pushed 8f31df6fbc8..b46df0fb90a.

Addressed two of the three blockers (resolved their threads):

  • Replicated access entities / named collections (b46df0fb90a). Access-control queries (CREATE USER/ROLE/QUOTA/ROW POLICY/SETTINGS PROFILE, DROP of access entities, GRANT) and named-collection queries (CREATE/ALTER/DROP NAMED COLLECTION) now feed the existing per-storage target_forbids_on_cluster decision, using the same predicates as removeOnClusterClauseIfNeeded: getAccessControl().containsStorage(ReplicatedAccessStorage::STORAGE_TYPE) and NamedCollectionFactory::instance().usesReplicatedStorage(). This stops auto-fill from turning a working local CREATE USER u (replicated access storage) into a CREATE USER u ON CLUSTER ... that the interpreter rejects. Non-replicated storage keeps ON CLUSTER eligible, exactly like UDFs/workloads/resources.

  • Mixed multi-element RENAME (8f31df6fbc8). The RENAME branch now classifies every element and rejects a genuinely mixed statement (a Replicated-database source/destination alongside an ordinary target) with BAD_ARGUMENTS, mirroring the multi-table DROP guard you already added. The rejection happens during auto-fill, before InterpreterRenameQuery runs, so no ordinary target is renamed locally. Added Test 9 to 03789_automatic_fill_on_cluster_replicated_database proving the ordinary table is untouched after the rejected mixed RENAME.

Left the third blocker open for your call (Replicated-database skip for non-database-replicated table ALTERs — DROP PARTITION/FREEZE/etc.). I do not think this is a mechanical fix: DatabaseReplicated::shouldReplicateQuery returns false for those precisely because a ReplicatedMergeTree table replicates them through its own log, so keeping them local is not a silent loss — they still reach the other replicas. Routing them ON CLUSTER would push them to every node of the configured cluster in addition to the table’s own replication, which can double-apply when the cluster topology does not match the database’s replica set. There is also a practical obstacle: shouldReplicateQuery resolves the storage via getTable (throws on IF EXISTS/missing targets) and depends on runtime state not necessarily settled at the auto-fill point. So whether the conservative current behavior is correct is a design decision; details in the thread reply.

Test gap (the AI’s third ⚠️ test request). The replicated access / named-collection regression test needs server config (Keeper-backed user_directories/named_collections_storage), so it does not fit a plain stateless .sql; I left it out rather than ship a test I cannot run here. It is probably best as an integration test, or a stateless test with a dedicated config. Happy to add it if you prefer a particular shape.

CI. The pre-existing reds on 749680d3 are unrelated to this change: 03580_join_runtime_filter and 02973_block_number_sparse_serialization_and_mutation failed as Timeout in the slow amd_msan, WasmEdge shard, and the Performance Comparison failures (group_by_fixed_keys, count_from_formats) are perf-comparison instability — none can be affected by a default-off experimental setting that only touches the DDL auto-fill path. A fresh run is in progress on b46df0fb90a.

Build note. I could not run a full build in this environment (disk full), so the two C++ changes were verified with clang -fsyntax-only against the build’s compile command (clean), and Test 9’s rejection is deterministic ({ serverError BAD_ARGUMENTS }). The runtime paths would benefit from a local build + test run on your side before merge. No approval added — not my PR.

Comment thread src/Interpreters/executeQuery.cpp
alexey-milovidov and others added 2 commits June 19, 2026 02:28
The automatic `ON CLUSTER` fill decision in `executeQueryImpl` read
`allow_experimental_automatic_fill_on_cluster_mode` and
`cluster_for_automatic_fill_mode` before
`InterpreterSetQuery::applySettingsFromQuery` had extracted query-level
settings from the AST. As a result the feature could not be switched per
query: with the profile setting off, a statement carrying
`SETTINGS allow_experimental_automatic_fill_on_cluster_mode = 1` stayed
local, and with the profile setting on, `SETTINGS
allow_experimental_automatic_fill_on_cluster_mode = 0` still injected
`ON CLUSTER`.

Apply the query's `SETTINGS` clause before the auto-fill decision (and
before the corresponding interpreter), so query-level settings take
effect. The `settings` reference points at the live context settings,
which `applySettingsFromQuery` mutates in place, so the decision below
now observes the per-query values.

Add `03789_automatic_fill_on_cluster_query_settings` covering both the
per-query enable and per-query disable cases.

Addresses the AI review (Request changes) on
ClickHouse#94138

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Picked this up to address the AI review (Request changes) about query-level SETTINGS ordering. Pushed b46df0fb90a..ed40d066300.

Setting-ordering fix (a9c63f9). The automatic ON CLUSTER fill decision in executeQueryImpl read allow_experimental_automatic_fill_on_cluster_mode and cluster_for_automatic_fill_mode before InterpreterSetQuery::applySettingsFromQuery had extracted the query-level SETTINGS from the AST, so the feature could not be switched per query. Moved applySettingsFromQuery up to right after the transaction check (and removed it from its later position so it still runs exactly once), so the auto-fill decision — and the corresponding interpreter — observe the per-query values. applySettingsFromQuery only ran after the decision before, and the settings reference points at the live context settings it mutates in place, so the values are now in effect for the block below.

New test 03789_automatic_fill_on_cluster_query_settings. Covers both required cases: profile off + SETTINGS allow_experimental_automatic_fill_on_cluster_mode = 1, cluster_for_automatic_fill_mode = 'test_shard_localhost' injects ON CLUSTER; profile on + SETTINGS allow_experimental_automatic_fill_on_cluster_mode = 0 stays local.

Verification against a locally built server (embedded Keeper + test_shard_localhost, distributed DDL working):

  • A/B confirmed both directions (CREATE TABLE ... ON CLUSTER test_shard_localhost ... when enabled per query; plain local CREATE TABLE when disabled per query).
  • New test passes 5/5 (not flaky); the two existing auto-fill tests still pass; a settings/log_queries regression batch — incl. 03290_mix_engine_and_query_settings (the engine/query settings separation this relies on), 01593_insert_settings, 01231_log_queries_min_type, 01546_log_queries_min_query_duration_ms — all pass.

Also merged latest master (the branch was ~1291 commits behind). The one remaining unresolved review thread is addressed and resolved.

alexey-milovidov and others added 2 commits June 26, 2026 04:21
After merging `master`, a new `26.7` version block was opened. The
settings `allow_experimental_automatic_fill_on_cluster_mode` and
`cluster_for_automatic_fill_mode` were registered in the `26.6` block,
but `02995_new_settings_history` requires settings absent from the
`26.6.1` baseline to appear in a version block greater than `26.6`.
Move both entries to the `26.7` block.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Merged the current master (the branch was a week behind, ~2746 commits) to refresh CI on an up-to-date base. The merge was clean with no conflicts, and the PR's net diff vs master is unchanged (10 files, +664/-3).

One follow-up was needed: master opened a new 26.7 version block in SettingsChangesHistory.cpp while this PR's two settings were still registered under 26.6. Since 02995_new_settings_history requires settings absent from the 26.6.1 baseline to appear in a block greater than 26.6, I moved allow_experimental_automatic_fill_on_cluster_mode and cluster_for_automatic_fill_mode to the 26.7 block (commit fb448604f76).

Verified src/Interpreters/executeQuery.cpp and src/Core/SettingsChangesHistory.cpp still compile cleanly against the merged headers (no API drift). Fresh CI is running.

The two reds on the previous run were both confirmed master-wide flakes unrelated to this PR (off-by-default experimental feature): 03518_alter_logical_race timeout and the Cannot start clickhouse-server / Unexpected exception in refresh scheduling (STID 2508-34af) stress failure — both fail daily across many unrelated PRs on master.

Comment thread src/Interpreters/executeQuery.cpp
|| out_ast->as<ASTCreateSettingsProfileQuery>() || out_ast->as<ASTDropAccessEntityQuery>()
|| out_ast->as<ASTGrantQuery>())
target_forbids_on_cluster
= context->getAccessControl().containsStorage(ReplicatedAccessStorage::STORAGE_TYPE);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This predicate is too coarse for access-control queries that explicitly target a non-replicated storage. In a server that has both replicated access storage and a local storage, CREATE USER u IN local_directory (or DROP USER u FROM local_directory) enters this branch, sees that some replicated access storage exists, and leaves cluster empty. The query then runs only on the initiator even though explicit CREATE USER u IN local_directory ON CLUSTER <cluster> is valid and would create the user in the local storage on every host.

That breaks the setting's contract for an eligible statement and can leave access configuration divergent across the configured cluster. Please base the skip on the target storage when storage_name is specified (for example, getStorageByName(...)->isReplicated()), and keep the broad conservative check only for statements whose target storage is implicit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This predicate is intentionally identical to the canonical removeOnClusterClauseIfNeeded (isAccessControlQuery + getAccessControl().containsStorage(ReplicatedAccessStorage::STORAGE_TYPE)), which is the single source of truth for which ON CLUSTER access queries are considered managed by replicated storage. Making auto-fill finer-grained than that — resolving an explicit storage_name and checking only that storages isReplicated— would diverge from the canonical function: for consistencyremoveOnClusterClauseIfNeededwould have to gain the same per-storage resolution, otherwise an explicitCREATE USER u IN local_directory ON CLUSTER ...would still be stripped back to local (whenignore_on_cluster_for_replicated_access_entities_queries` is enabled) even though auto-fill had added it.

Note the failure mode here is conservative: the query stays local and the user can still add ON CLUSTER explicitly — unlike the temporary-table finding, which actively mis-routed a valid local statement through distributed DDL. It also cannot be exercised by a plain stateless .sql: it requires a server configured with both a replicated and a local access storage.

So whether to make the access-storage predicate target-aware (and align removeOnClusterClauseIfNeeded with it for consistency) is a behavior/design decision rather than a mechanical parity fix. Flagging for @alexey-milovidov / @JingYanchao to decide; leaving this thread open.

…ary table

`ALTER TABLE t` and `OPTIMIZE TABLE t` with an omitted database resolve to a
session-local temporary table even without the `TEMPORARY` keyword: the
interpreters resolve the unqualified name via `Context::ResolveAll`, and the
parser does not set `isTemporary` for them. The automatic `ON CLUSTER` fill in
`executeQueryImpl` only checked the parser flag in the `ASTQueryWithTableAndOutput`
branch, so with the feature enabled such a statement got a cluster filled and was
routed through distributed DDL, leaving the session table untouched.

Reuse the existing `resolves_to_temporary` helper (hoisted out of the `DROP`
branch) for the `ALTER`/`OPTIMIZE` branch too, guarding on a present table so a
database-level statement does not construct an empty `StorageID`. Extend
`03789_automatic_fill_on_cluster_mode` with a regression covering `ALTER TABLE`
and `OPTIMIZE TABLE` on a session temporary table.

Addresses the AI review finding on `executeQuery.cpp` (temporary-table guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

After the master-merge earlier today, the AI review re-ran on the merged head and flipped to Request changes with two findings on executeQuery.cpp. Pushed fb448604f76..e6d709e6f9e.

Fixed: temporary-table guard for ALTER/OPTIMIZE (e6d709e6f9e). ALTER TABLE t and OPTIMIZE TABLE t with an omitted database resolve to a session-local temporary table even without the TEMPORARY keyword (the parser does not set isTemporary for them), so with auto-fill enabled the ASTQueryWithTableAndOutput branch filled a cluster and routed the operation through distributed DDL, leaving the session table untouched. The branch now applies the same resolves_to_temporary helper the DROP/TRUNCATE branch already used (hoisted to be shared): when the database is omitted and the unqualified name resolves via Context::ResolveExternal, the statement is treated as temporary and stays local. The check is guarded on a present table name so a database-level ALTER does not construct an empty StorageID. Regression added to 03789_automatic_fill_on_cluster_mode (Test 10): ALTER TABLE ... ADD COLUMN and OPTIMIZE TABLE on a session temporary MergeTree table, asserting the column is added to the session table and the statement carries no ON CLUSTER. Thread resolved.

Left open for your call: the access-storage predicate (executeQuery.cpp:1517). The bot wants auto-fill to resolve an explicit storage_name (e.g. CREATE USER u IN local_directory) and skip only when that storage is replicated, instead of skipping whenever any replicated access storage exists. The current predicate is intentionally byte-identical to the canonical removeOnClusterClauseIfNeeded — the single source of truth for which ON CLUSTER access queries are managed by replicated storage. Going finer-grained would diverge from it: for consistency removeOnClusterClauseIfNeeded would have to gain the same per-storage resolution, otherwise an explicit CREATE USER ... IN local_directory ON CLUSTER ... would still be stripped back to local (with ignore_on_cluster_for_replicated_access_entities_queries on) even though auto-fill added it. The failure mode is also conservative (the query stays local; the user can still add ON CLUSTER), unlike the temporary-table case which actively mis-routed a valid statement, and it cannot be covered by a plain stateless .sql (needs a server with both a replicated and a local access storage). So this is a behavior/design decision, not a mechanical parity fix — left the thread open for you / @JingYanchao.

Verification. The shared build directory here is in a divergent transitional state (a 4k-file rebuild), so I did not do a full local build; the change was verified with clang -fsyntax-only against the build's compile command (clean), the Test 10 SQL mechanics were checked against a running server (temporary MergeTree supports ALTER ADD COLUMN and OPTIMIZE, the default backfills), and the fix is a direct reuse of the already-CI-tested DROP/TRUNCATE temporary-table path (Test 8). The fresh CI run on e6d709e6f9e exercises Test 10 against the real PR binary.

CI on the prior head (fb448604f76) was green apart from two unrelated master-wide flakes: test_keeper_disks/test.py::test_snapshots_with_disks (assert 2 == 1 on snapshot file count) and Upgrade check (amd_release) (a MutatePlainMergeTreeTask CANNOT_PARSE_TEXT casting 'x' to UInt64 on a cached_s3 part) — neither can be affected by a default-off experimental setting that only touches the DDL auto-fill path.

No approval added — not my PR.

Comment thread src/Interpreters/executeQuery.cpp Outdated
@clickhouse-gh

clickhouse-gh Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.40% 85.40% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.60% 77.60% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 115/127 (90.55%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 1 line(s) · Uncovered code

Full report · Diff report

…porary

The temporary-table resolution heuristic for the automatic `ON CLUSTER`
fill was applied to every `ASTQueryWithTableAndOutput`, including
`ASTCreateQuery`. A persistent `CREATE TABLE t` / `CREATE VIEW t` creates
a new object in the current database; it does not target a session
temporary table just because one named `t` already shadows it (a
temporary and a persistent table of the same name can coexist).
`InterpreterCreateQuery` treats a non-`TEMPORARY` create as persistent
and fills the current database, so it is an eligible statement that
should be distributed `ON CLUSTER`. With the heuristic applied, a
shadowing temporary table made auto-fill leave `cluster` empty and the
persistent object was created only on the initiator.

Exclude `ASTCreateQuery` from the resolution heuristic. A
`CREATE TEMPORARY TABLE` is still kept local because `isTemporary` is
already checked above. The remaining statements that reach this branch
(`ALTER`, `OPTIMIZE`, `UPDATE`, `DELETE`, `CREATE`/`DROP INDEX`) operate
on an existing table by name, so the heuristic remains correct for them.

Add Test 11 to `03789_automatic_fill_on_cluster_mode`: a persistent
`CREATE TABLE` and `CREATE VIEW` shadowed by same-named temporary tables
are rewritten `ON CLUSTER` and the persistent objects are created.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Addressed the AI review Request changes finding on executeQuery.cpp. Pushed 0cd5eb3e862..65483540c47.

Fixed: persistent CREATE shadowed by a temporary table (65483540c47). The temporary-table resolution heuristic added for the implicit ALTER/OPTIMIZE/DROP case was applied to every ASTQueryWithTableAndOutput, including ASTCreateQuery. A persistent CREATE TABLE t / CREATE VIEW t creates a new object in the current database (a temporary and a persistent table of the same name can coexist), so it does not target a session temporary table just because one named t shadows it. With auto-fill enabled, the guard misclassified such a create as temporary, left cluster empty, and created the object only on the initiator instead of distributing an eligible CREATE ON CLUSTER. The fix excludes ASTCreateQuery from the heuristic; CREATE TEMPORARY TABLE is still kept local via the isTemporary check above, and the remaining statements that reach this branch (ALTER, OPTIMIZE, UPDATE, DELETE, CREATE/DROP INDEX) operate on an existing table by name, so the heuristic stays correct for them.

Added Test 11 to 03789_automatic_fill_on_cluster_mode: a persistent CREATE TABLE and CREATE VIEW shadowed by same-named temporary tables are rewritten ON CLUSTER, and the persistent objects are created.

Verification. -fsyntax-only of executeQuery.cpp against the real compile commands is clean. The temporary/persistent coexistence and DROP ordering used by the test were checked on a server, and the query_log LIKE patterns were confirmed to match the formatted ON CLUSTER queries (and not to self-match the verification SELECTs). The ON CLUSTER injection path itself is the same one already exercised by Test 2/Test 3; Test 11 validates it end-to-end in CI.

The remaining open thread (access-storage predicate at executeQuery.cpp:1529) is left as a design decision as discussed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This access-entity classification does not cover ASTMoveAccessEntityQuery, so MOVE USER / MOVE ROLE / similar access-entity moves remain eligible for auto-fill even when replicated access storage is configured.

On a server with both replicated access storage and a local storage, MOVE USER u TO local_directory enters this block with an empty cluster, gets ON CLUSTER injected, and then InterpreterMoveAccessEntityQuery takes the distributed DDL branch. Each worker runs moveAccessEntities, which removes the entity from the source storage before inserting it into the destination; when the source is replicated storage, one worker can delete the shared ZooKeeper entity before the others resolve it, or only one host can end up with the local copy. That turns an ordinary local access-maintenance statement into a distributed operation that can fail or leave access storage divergent.

Please classify ASTMoveAccessEntityQuery before assigning cluster: either skip it whenever replicated access storage is present, or resolve the source and destination storages and auto-fill only when the move is safe to run independently on every host.

The last verification of Test 11 is written as

    SELECT '...both persistent objects exist in the current database', count() = 2 FROM system.tables ...

so its second column is the boolean `count() = 2`, which is `1` when the two
persistent objects exist. The reference, however, recorded `2` (as if the query
selected the raw `count()`), so the test failed with

    -Test 11 verification: both persistent objects exist in the current database	2
    +Test 11 verification: both persistent objects exist in the current database	1

The feature is correct: with `allow_experimental_automatic_fill_on_cluster_mode`
both the persistent `CREATE TABLE` shadowed by a temporary table and the
persistent `CREATE VIEW` are rewritten `ON CLUSTER` and both persistent objects
are created (verified against a local server with `test_shard_localhost` and
embedded Keeper: raw `count()` is `2`, so `count() = 2` yields `1`). Only the
reference was wrong.

Fix the reference to `1`, matching the actual output and the test's own
convention for combined label + boolean rows (Test 10, e.g. `count() = 2` -> `1`).

CI report: ClickHouse#94138

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-experimental Experimental Feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants