iframe-proxy

groeneai · 2026-06-25T12:59:43Z

Changelog category (leave one):

Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixed a row-policy escape: RENAME TABLE, EXCHANGE TABLES and RENAME DATABASE left row policies bound to the old name, so after the operation the table was readable with no filtering under its new name. Row policies now follow the table/database on rename.

Description

Row policies are access entities keyed by (database, table_name) (RowPolicyName). When a table was renamed the policy stayed bound to the old name and was never reattached to the new name, so after the rename ALL rows became visible under the new name, fully escaping the policy. EXCHANGE TABLES and RENAME DATABASE (for ON db.* and per-table policies) had the same effect.

InterpreterRenameQuery::executeToTables already carefully re-keys every other kind of table binding from the old StorageID to the new one (referential/loading/MV dependencies, source-view dependencies, named collections), but never touched row policies. This PR re-keys row policies the same way, through AccessControl, so the policy follows the data:

plain RENAME TABLE a TO b moves policies a -> b;
EXCHANGE TABLES a AND b swaps both directions. The policy follows the data, which is the only behaviour that preserves protection across the swap (otherwise exchanging a protected table with an unprotected one would move the protected data to a name with no policy);
cross-database RENAME db1.a TO db2.b updates both database and table;
RENAME DATABASE moves both database-wide (ON db.*) and per-table policies;
on a failed rename the original bindings are restored in the existing catch block, consistent with how dependencies are already restored.

The re-key is applied collision-free by routing each affected policy through a unique temporary table name first, then to its destination, so an EXCHANGE of two tables that each have a policy with the same short name does not hit a transient name collision in the access storage. Updates go through AccessControl::update by UUID, so they persist on disk and replicate like any other access-entity change.

Reproduction (100% on a clean checkout, with access_management enabled):

CREATE TABLE data (id UInt64, secret String, dept String) ENGINE = MergeTree ORDER BY id;
INSERT INTO data VALUES (1, 'public', 'eng'), (2, 'classified', 'fin'), (3, 'public2', 'eng');
CREATE USER restricted;
GRANT SELECT, DROP TABLE ON default.data TO restricted;
GRANT SELECT, INSERT, CREATE TABLE ON default.data2 TO restricted;
CREATE ROW POLICY rp ON default.data FOR SELECT USING dept = 'eng' TO restricted;
-- as restricted:
SELECT * FROM data;            -- only the 2 eng rows: policy applied
RENAME TABLE data TO data2;
SELECT * FROM data2;           -- before this PR: ALL 3 rows, policy escaped

Regression test 04401_row_policy_follows_table_rename covers RENAME, EXCHANGE (including same-short-name and asymmetric cases), cross-database RENAME, RENAME DATABASE, and the rollback-on-failure path.

Notes for reviewers (cc access-control maintainers)

Re-key vs block. This implements the re-key approach (policy follows the data). The alternative would be to block RENAME/EXCHANGE while active policies exist. Re-keying preserves protection without breaking legitimate workflows (e.g. tenants renaming their own tables), so it seems clearly preferable, but please confirm this is the behaviour you want.
Severity / changelog category. Filed as Bug Fix. This is an RBAC escape, so it may warrant Critical Bug Fix (crash, data loss, RBAC) instead. Happy to switch the category if you agree.
Replicated databases. On a Replicated database the rename is enqueued and each replica re-runs the interpreter, so the re-key runs once per replica. Because it targets policies by UUID with absolute destination names, repeated application is idempotent; the redundant access-storage writes mirror how the existing dependency re-keying already behaves per-replica.

…ABASE Row policies are keyed by (database, table). On RENAME TABLE, EXCHANGE TABLES and RENAME DATABASE the policy was left bound to the old name and never reattached, so after the operation the data was readable with no filtering under the new name, fully escaping the policy. InterpreterRenameQuery::executeToTables already re-keys table dependencies, source-view dependencies and named collections from the old StorageID to the new one, but never touched row policies. This change re-keys row policies the same way, through AccessControl, so the policy follows the data: - plain RENAME TABLE a TO b moves policies a -> b; - EXCHANGE TABLES a AND b swaps both directions (the policy follows the data, which is the only behaviour that preserves protection across the swap); - cross-database RENAME db1.a TO db2.b updates both database and table; - RENAME DATABASE moves both database-wide (ON db.*) and per-table policies; - on a failed rename the original bindings are restored in the catch block, consistent with how dependencies are already restored. The re-key is applied collision-free by routing each affected policy through a unique temporary table name first, then to its destination, so an EXCHANGE of two tables that each have a policy with the same short name does not hit a transient name collision in the access storage. Updates go through AccessControl::update by UUID, so they persist on disk and replicate like any other access-entity change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

groeneai · 2026-06-25T13:00:12Z

Pre-PR validation gate (click to expand)

#	Question	Answer
a	Deterministic repro?	Yes. On a clean checkout with `access_management`: create table + policy, `RENAME TABLE data TO data2` as the restricted user, then `SELECT` shows all rows (policy escaped). 100% reproducible; same for `EXCHANGE TABLES` and `RENAME DATABASE`.
b	Root cause explained?	Yes. Row policies are access entities keyed by `(database, table_name)` (`RowPolicyName`). `InterpreterRenameQuery` re-keys table dependencies / source-view deps / named collections from old to new `StorageID`, but never row policies, so after a rename the policy stays orphaned on the old name and the table is unfiltered under its new name.
c	Fix matches root cause?	Yes. The fix re-keys row policies from old to new `(database, table)` through `AccessControl`, in the same place and the same way the other bindings are already re-keyed (incl. rollback in the existing `catch`). It targets the exact missing re-key, not a symptom.
d	Test intent preserved / new tests added?	New regression test `04401_row_policy_follows_table_rename` added, covering RENAME, EXCHANGE (symmetric, same-short-name, asymmetric), cross-DB RENAME, RENAME DATABASE, and rollback-on-failure. No existing test weakened.
e	Both directions demonstrated?	Yes. The escape reproduces on the pristine pre-fix binary; with the fix the policy stays applied in every case. Regression test passes 30/30 with full CI randomization and 8/8 across 4 parallel workers.
f	Fix is general across code paths?	Yes. All four rename code paths are covered: plain RENAME, EXCHANGE (both directions), cross-DB RENAME (`executeToTables`), and RENAME DATABASE incl. `ON db.*` (`executeToDatabase`). The same re-key helper handles all of them; failure rollback restores the original binding.
g	Fix generalizes across inputs?	Yes. Policies are matched/moved by `(database, table)` regardless of filter expression or roles; database-wide (`ON db.*`, `table_name == ANY_TABLE_MARK`) and per-table policies both handled; EXCHANGE with identical policy short names on both tables is collision-safe (two-phase move via a unique temp name).
h	Backward compatible?	Yes. No setting default, on-disk/wire format, or metadata format change. Behaviour change is a security fix (the policy now correctly follows the table); no `SettingsChangesHistory` change needed. (Flagged to maintainers in the PR whether to classify as Critical Bug Fix.)
i	Invariants and contracts preserved?	Yes. Re-keying goes through `AccessControl::update` by UUID (same path `ALTER ROW POLICY` uses), so it persists and replicates. Access-storage name-collision invariant is upheld via the temp-name two-phase move. No new lock-ordering: the access-storage mutex is independent of the per-table DDLGuard.

Session id: cron:clickhouse-worker-slot-13:20260625-112800

groeneai · 2026-06-25T13:00:33Z

cc @pufit @vitlibar — could you review this? Row policies are keyed by (database, table) and InterpreterRenameQuery never re-keyed them, so RENAME TABLE / EXCHANGE TABLES / RENAME DATABASE left the policy orphaned on the old name and the table became readable unfiltered under the new name (an RBAC escape). This re-keys row policies through AccessControl alongside the existing dependency re-keying, with collision-safe handling for EXCHANGE and rollback on failure. Two design questions for you are noted in the PR description (re-key vs block, and whether this should be Critical Bug Fix).

clickhouse-gh · 2026-06-25T13:05:39Z

Workflow [PR], commit [0f8ce6e]

Summary: ✅

Performance Comparison: Performance dashboard

AI Review

Summary

This PR rekeys row policies so explicit RENAME TABLE, EXCHANGE TABLES, and RENAME DATABASE do not leave protected data unfiltered under a new name. The direct rename/exchange cases and prior issues around cross-database db.* policies and replacement DDL are now addressed, but the fix still has unresolved atomicity gaps around access-entity updates, including a replicated-access divergence, so I cannot approve it yet.

PR Metadata

💡 The selected category should be Critical Bug Fix (crash, data loss, RBAC) rather than Bug Fix (user-visible misbehavior in an official stable release) because the PR fixes an RBAC escape. The changelog entry itself is specific and user-readable.

Findings

❌ Blockers

[src/Interpreters/InterpreterRenameQuery.cpp:489] Replicated database table rename and ReplicatedAccessStorage row-policy rename are not coordinated. The first replica to execute the DDL updates the shared row-policy entity to the new table name, so lagging replicas can still have the old table name but no old_db.old_table policy. This temporarily exposes rows unfiltered until their DDL worker catches up. Fix by coordinating the access update with replicated database DDL, keeping both names protected through the transition, or rejecting affected policies in replicated access storage / Replicated databases.
[src/Interpreters/InterpreterRenameQuery.cpp:489] [dismissed by author -- https://github.com/ClickHouse/ClickHouse/pull/108478#discussion_r3475937537] The preflight snapshot is still not protected from concurrent access DDL. CREATE ROW POLICY can target non-existing table names and does not take the table DDLGuard, so a policy can be created on the source after collection or a conflicting policy can be created at the destination/transient name before applyRowPolicyRekeys; the table/database rename has already committed by then, so the policy can be left on the wrong name or the apply can throw after the commit. I still consider this real because the current code has no lock or version check spanning collection, preflight, metadata rename, and access update.

Tests

⚠️ Missing a focused integration test for Replicated database with <user_directories><replicated> where one replica lags the DDL rename; the test should prove the old-name table stays filtered on the lagging replica, or that the operation is rejected. The new stateless test explicitly opts out of replicated databases, so it cannot catch this path.
⚠️ The concurrent access-DDL window also needs coverage or a design reason it is impossible: create a policy on the source/destination/transient name between preflight and applyRowPolicyRekeys and verify no committed rename leaves data unfiltered.

Final Verdict

Status: ❌ Block

Minimum required actions: close the replicated-access / Replicated database policy exposure, make the concurrent access-DDL window atomic or reject affected renames before commit, and update the changelog category to Critical Bug Fix (crash, data loss, RBAC).

clickhouse-gh · 2026-06-25T13:09:26Z

                NamedCollectionFactory::instance().renameDependencies(to_table_id, from_table_id);
+
+            /// Re-key row policies last: it has its own rollback, and nothing after it can throw.
+            applyRowPolicyRekeys(access_control, row_policy_rekeys);


applyRowPolicyRekeys can still throw here after database->renameTable has already committed. For example, row policies loaded from users.xml are read-only, and a policy with the same short name can already exist on the destination table name even when the table name itself is free. In those cases the outer catch restores dependencies but cannot undo the table rename; the row policy is restored/left on the old name and the renamed table is left readable without the intended filter. executeToDatabase has the same post-commit failure mode after db->renameDatabase. Please preflight all re-key failures before the table/database rename, or block the rename before the commit point when the affected policies cannot be moved.

The read-only and final-destination cases are preflighted now, but the post-commit exception is still reachable through the phase-1 temporary name. applyRowPolicyRekeys first updates the policy to short_name ON old_db.\.tmp_rename_row_policy__0`afterrenameTablehas returned; if that row policy name already exists,AccessControl::updatethrows there. The UUID is visible insystem.row_policies.id, so this is deterministic, not just a concurrent race. The outer catch` still cannot undo the committed table/database rename, so the policy remains on the old name and the renamed table can be exposed without the intended filter. Please preflight or reserve the temporary names as well, or avoid a post-commit rename step that can still hit an access-entity name collision.

The current code still has the same post-commit exposure class. The snapshot/preflight at src/Interpreters/InterpreterRenameQuery.cpp:410-418 is not protected against concurrent access DDL, while applyRowPolicyRekeys still performs ordinary access-storage writes after renameTable has committed at src/Interpreters/InterpreterRenameQuery.cpp:478.

Two concrete interleavings remain possible because CREATE ROW POLICY is independent of the table DDLGuard and can target non-existing table names:

another session creates a new policy on the source table after collectRowPolicyRekeys but before the rename commits; it is not in row_policy_rekeys, so it stays bound to the old name after the table moves;

another session creates a conflicting policy on the destination or transient name after preflightRowPolicyRekeys; the post-commit AccessControl::update throws, restore puts the moving policy back on the old name, and the already-renamed table is left without that filter.

Storage-write exceptions in AccessControl::update have the same shape: preflight cannot prove the later persisted updates will succeed. This needs serialization/reservation that covers access-entity changes until the row-policy re-key is complete, or a fail-closed design that does not leave protected data at a name where the original policy is not applied when a post-commit access update fails.

The row-policy re-key in InterpreterRenameQuery previously ran after database->renameTable / db->renameDatabase had already committed. If the re-key threw at that point (a policy in a read-only storage such as one loaded from users.xml, or a destination short-name collision), the surrounding catch could restore the dependency bookkeeping but could not undo the committed table/database rename, leaving the renamed object readable without its row filter. That reintroduced the very row-policy escape this change closes, on the error path. Move the re-key feasibility check ahead of the first mutation: collect the planned re-keys and preflight them (reject the rename when a policy is in a read-only storage or its destination name is already taken by a policy that is not itself moving). A rejected rename now changes nothing. The post-rename apply step keeps its own rollback for any residual error. Add regression coverage for both unmovable-policy paths: a read-only users.xml policy (via clickhouse-local) and a destination row-policy name collision. Both must fail the rename cleanly, leaving the table in place and the policy still filtering. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

groeneai · 2026-06-25T14:31:36Z

Addressed the post-commit failure path in 569f980.

The re-key now runs in two stages. preflightRowPolicyRekeys runs BEFORE the first mutation (removeDependencies / renameTable in executeToTables, renameDatabase in executeToDatabase) and rejects the rename when a policy cannot be moved: it is in a read-only storage (e.g. loaded from users.xml), or its destination short name is already taken by a policy that is not itself moving (an EXCHANGE of two same-short-name policies is not a collision because both are in the re-key set). A rejected rename changes nothing. The apply step still runs after the rename with its existing rollback for any residual error.

Regression tests cover both unmovable-policy paths: a read-only users.xml policy (via clickhouse-local) and a destination row-policy name collision. Each fails the rename cleanly, leaving the table in place and the policy still filtering.

Follow-up to the preflight commit, addressing two review findings. 1. Cross-database db.* policy. collectRowPolicyRekeys matched only an exact table_name, so a database-wide `CREATE ROW POLICY ... ON db1.*` was not carried when a table moved out of db1 (RENAME db1.t TO db2.t, or cross-db EXCHANGE). EnabledRowPolicies::getFilter then looks up db2.t and db2.*, never the old db1.*, so the moved data was readable unfiltered or under an unrelated destination policy. A db.* policy governs every table in db1 and cannot follow a single table out, so reject such cross-database moves/exchanges (NOT_IMPLEMENTED) instead of silently dropping the filter. Same-database renames are unaffected: the db.* policy keeps covering the table via the ANY_TABLE_MARK fallback. 2. Phase-1 transient name. applyRowPolicyRekeys parks each policy under `.tmp_rename_row_policy_<uuid>_<i>` before moving it to the destination, and that step runs after renameTable/renameDatabase has committed. The name is derived from the policy UUID (visible in system.row_policies.id), so a pre-existing policy can occupy it deterministically; AccessControl::update would then throw post-commit and the outer catch cannot undo the committed rename, leaving the table unfiltered. Extend preflightRowPolicyRekeys to also reject when the transient name is taken by a non-moving policy, so the whole re-key is proven applicable before the rename commits. The temp-name format is factored into tempRekeyTableName so preflight and apply cannot drift. Regression tests added to 04401 for both: cross-database RENAME with a db.* policy is rejected (table not moved, source still filtered), and a transient-name collision is rejected pre-commit (table stays put, policy still filters). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

groeneai · 2026-06-25T16:15:41Z

Both findings fixed in 779feb3.

Cross-database db.* drop: a database-wide ON db.* policy governs every table in db and cannot follow a single table out of it, so RENAME db1.t TO db2.t (and cross-db EXCHANGE) are now rejected with NOT_IMPLEMENTED when a source db.* policy applies, instead of silently dropping the filter. Same-database renames are unaffected (the db.* policy still covers the table via the ANY_TABLE_MARK fallback).
Phase-1 temp-name post-commit throw: preflightRowPolicyRekeys now also checks the .tmp_rename_row_policy_<uuid>_<i> parking name for a non-moving-policy collision, so the entire re-key is proven applicable before the table/database rename commits. The temp-name format is factored into tempRekeyTableName so preflight and apply cannot drift.

Regression tests added to 04401: cross-db db.* move is rejected (table not moved, source still filtered); a transient-name collision is rejected pre-commit (table stays put, policy still filters). 30/30 pass with full randomization; the existing access/rename suite still passes.

groeneai · 2026-06-25T16:16:15Z

Pre-PR validation gate for the follow-up commit 779feb3 (two review findings).

Pre-PR validation gate (click to expand)

#	Question	Answer
a	Deterministic repro?	Yes. Cross-DB: `CREATE ROW POLICY p ON db1.* ...; RENAME TABLE db1.t TO db2.t2` (and cross-db `EXCHANGE`) leaked the moved data on pre-fix code; now rejected `NOT_IMPLEMENTED` on demand. Temp-name: create `tp ON db.t`, read its UUID from `system.row_policies.id`, create a second `tp ON db.\`.tmp_rename_row_policy__0``, then` RENAME db.t TO db.b`throws post-commit pre-fix; now rejected`ACCESS_ENTITY_ALREADY_EXISTS` pre-commit. Both reproduced and verified on a Build-ID-matched binary (9bbf43a3).
b	Root cause explained?	(1) `collectRowPolicyRekeys` matched only exact `table_name`, so a database-wide `ON db1.` (table_name = ANY_TABLE_MARK) was never collected; after the move, `EnabledRowPolicies::getFilter` looks up `db2.t` then `db2.`, never `db1.*`, so the moved data is unfiltered. (2) `applyRowPolicyRekeys` parks each policy under `.tmp_rename_row_policy_<uuid>_<i>` AFTER `renameTable` commits; the UUID is user-visible, so a pre-existing policy occupying that name makes `AccessControl::update` throw post-commit, and the outer catch cannot undo the committed rename -> table left unfiltered.
c	Fix matches root cause?	Yes. (1) A `db.` policy governs every table in `db1` and cannot follow one table out, so cross-database moves/exchanges are rejected when a source `db.` applies (`hasDatabaseWideRowPolicy`), addressing the exact unmovable-policy mechanism. (2) `preflightRowPolicyRekeys` now also checks the phase-1 transient name, proving the whole re-key applies BEFORE the rename commits -> no post-commit throw. No symptom-masking.
d	Test intent preserved / new tests added?	Yes. Two regression cases added to `04401`: cross-db `db.*` move rejected (table not moved, source still filtered) and transient-name collision rejected pre-commit (table stays, policy still filters). Existing cases unchanged.
e	Both directions demonstrated?	Yes. Both vectors leak / throw-post-commit on pre-fix code; both are cleanly rejected with the fix and the table+policy stay intact. `04401`: 30/30 with full randomization.
f	Fix is general across code paths?	Yes. The temp-name preflight lives in the shared `preflightRowPolicyRekeys`, so both `executeToTables` (RENAME/EXCHANGE) and `executeToDatabase` (RENAME DATABASE) inherit it. The cross-db `db.` check is in `executeToTables`; `executeToDatabase` does not need it because `collectRowPolicyRekeysForDatabase` already moves `db.` to `to_db.*`.
g	Fix generalizes across inputs?	Yes. Covered: plain RENAME, EXCHANGE (both directions), same-short-name EXCHANGE (still succeeds via the moving-id guard), cross-DB RENAME and EXCHANGE, RENAME DATABASE with `db.` and per-table policies, read-only (users.xml) source, destination collision, transient-name collision, and the no-policy no-op. `db.` is identified by `RowPolicy::isForDatabase()` (table_name == ANY_TABLE_MARK), independent of short name.
h	Backward compatible?	Yes. No setting, on-disk, wire, or metadata format change. The cross-db `db.*` rejection is a new, fail-closed error on a previously-silent escape (a security fix, not a behavior regression for valid use); same-database renames and all single-database policy moves are unaffected. No `SettingsChangesHistory` entry needed.
i	Invariants and contracts preserved?	Yes. The "every re-key proven applicable before the first mutation" invariant is strengthened (transient-name case added). The rename's existing dependency rollback is untouched; `applyRowPolicyRekeys` keeps its own rollback for residual errors. Policy updates still go through `AccessControl::update` by UUID (the ALTER ROW POLICY path), so persistence/replication are unchanged. No new throw after the commit point.

Session id: cron:clickhouse-worker-slot-3:20260625-151900

groeneai · 2026-06-25T16:56:44Z

Confirmed, and this is now an atomicity question rather than a missing guard. The remaining exposure is structural to InterpreterRenameQuery, not specific to the row-policy re-key: the dependency, source-view and named-collection re-keys at lines 449-473 also run after renameTable commits, and the catch restores only the in-memory dependency bindings, not the committed rename, so a post-commit throw there leaves the same committed-rename / inconsistent-binding shape. Access-entity DDL is not under the table DDLGuard (CREATE ROW POLICY can target a non-existent name), so preflight cannot be made atomic with the commit by adding more checks. The original escape (a policy silently not following the table) is closed; what is left is hardening this race.

I do not want to pick the locking model unilaterally on a security PR, so flagging the fork for @pufit @vitlibar to decide:

Serialize: have the rename hold a reservation in AccessControl over the affected policy names for the commit window, and take the snapshot under it. Closes all three interleavings (concurrent create on the source, destination/transient-name collision, storage-write failure). Most invasive; needs a new AccessControl/IAccessStorage entry point.
Fail-closed ordering: re-key policies before the rename commit and roll them back if the rename then fails. Not a clean close on its own: it opens a transient window where the old name is unfiltered, and it still misses a policy created on the source after the snapshot.
Block: reject RENAME/EXCHANGE/RENAME DATABASE while an affected row policy exists (the fallback from the original report). Trivially fail-closed (data never lands at an unfiltered name), much smaller, but refuses the rename whenever a policy is present.

My read: option 1 is the only complete close but is an access-subsystem change; option 3 is the smallest fail-closed option. Is fully closing this race in scope for this patch, or acceptable as a follow-up given it matches the existing post-commit re-key semantics for the other bindings? Will implement whichever you prefer.

The regression test grew across review rounds to eight cases (RENAME, EXCHANGE, cross-DB rename, RENAME DATABASE, and the destination/read-only/cross-DB/transient-name rejection paths), two of which start a separate clickhouse-local server. A single run is cheap, but under the flaky-check job's repeated parallel runs on sanitizer builds the per-run wall clock exceeds the 180s soft cap, so the flaky check fails it with TOO_LONG. The long tag exempts the test from that 180s flaky-check cap and reduces its flaky-check repeat count from ~50 to ~5 (long_test_runs_ratio), while still running it once on every regular and per-test-coverage lane (only the LLVM aggregation and fast-test lanes pass --no-long). No assertion is weakened: each run still executes all eight security cases. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

groeneai · 2026-06-25T17:49:49Z

The flaky check failed this PR's own test 04401_row_policy_follows_table_rename with TOO_LONG (> 180s). The test grew to eight cases across review rounds (two start a separate clickhouse-local server), so under the flaky check's repeated parallel runs on sanitizer builds a single run crosses the 180s soft cap.

Tagged the test long in bd4af75. That exempts the 180s flaky-check cap and cuts its flaky-check repeat count from ~50 to ~5, while the test still runs once on every regular and per-test-coverage lane (only the LLVM aggregation and fast-test lanes pass --no-long). No assertion is weakened: each run executes all eight security cases. Verified locally: flaky check now schedules 5 runs (was 50), all pass, each ~15s.

The outstanding atomicity/TOCTOU design question from the 16:19 UTC review is unaffected and still awaits a maintainer's pick on the locking-vs-fail-closed fork.

The row-policy re-key added for RENAME/EXCHANGE also fired for the synthetic exchange that CREATE OR REPLACE TABLE / REPLACE TABLE build internally (InterpreterCreateQuery::doCreateOrReplaceTable), where from = a freshly built temporary table and to = the existing target. The unconditional destination re-key moved the target table's row policy onto that temporary name, which is dropped right after the swap, leaving the replaced table readable with no filter. That reintroduced the row-policy escape this code closes, for replacement DDL. Mark the synthetic rename with an internal create_or_replace flag and skip the row-policy re-key for it: only the target's storage is replaced, the target keeps its name, so the policies bound to that name stay put and filter the new data. Explicit RENAME/EXCHANGE still re-key as before. The flag is rebuilt in-process on every replica via doCreateOrReplaceTable, so it is preserved for Replicated databases and ON CLUSTER. Covered by two new cases in 04401_row_policy_follows_table_rename: an existing protected table replaced via CREATE OR REPLACE TABLE and via REPLACE TABLE, each asserting the policy stays bound to the target name and keeps filtering. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

groeneai · 2026-06-25T18:54:02Z

Fixed in 41e138b. CREATE OR REPLACE TABLE / REPLACE TABLE reach the rename via the synthetic exchange built by InterpreterCreateQuery::doCreateOrReplaceTable (from = the temporary replacement table, to = the existing target), and the unconditional to_rekeys collection moved the target table's row policy onto that temporary name, which is dropped right after the swap, leaving the replaced table unfiltered.

I added an internal create_or_replace flag on the synthetic ASTRenameQuery (set for both the rename and exchange branches in doCreateOrReplaceTable) and skip the row-policy re-key for it in executeToTables: only the target's storage is replaced, the target keeps its name, so the policies bound to that name stay put and filter the new data. Explicit RENAME/EXCHANGE re-key exactly as before. The flag is rebuilt in-process on every replica via doCreateOrReplaceTable, so it is preserved for Replicated databases and ON CLUSTER.

Verified on a debug build: your CREATE OR REPLACE TABLE reproducer now returns only the filtered row (was unfiltered), REPLACE TABLE likewise, and plain RENAME/EXCHANGE still move policies with the data. Regression coverage added to 04401_row_policy_follows_table_rename for an existing protected table replaced via both CREATE OR REPLACE TABLE and REPLACE TABLE, each asserting the policy stays on the target name and keeps filtering.

The CREATE OR REPLACE TABLE / REPLACE TABLE cases run as the restricted user and specify ENGINE = MergeTree. The default test config enables table_engines_require_grant, so they need an explicit TABLE ENGINE grant; without it the statement is rejected with ACCESS_DENIED. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

groeneai · 2026-06-25T21:29:51Z

Fixed the deterministic 04401 failure on the CREATE OR REPLACE / REPLACE TABLE cases (commit 3f8b2dd).

Root cause: those two cases run as the restricted test user and specify ENGINE = MergeTree. The default test config sets table_engines_require_grant, so the user needs GRANT TABLE ENGINE ON MergeTree (it only had CREATE/INSERT/SELECT/DROP). Without it the statement is rejected with Code: 497 ... TABLE ENGINE ON MergeTree (ACCESS_DENIED). The earlier local run passed under a minimal config that did not enforce engine grants.

Fix: add GRANT TABLE ENGINE ON MergeTree TO ${USER}. Scope is minimal: only these two statements specify an engine as the restricted user (every other CREATE TABLE runs as admin). The grant governs engine specification only and is orthogonal to row-policy filtering, so the security assertions are unchanged: after the replace, the restricted user still sees only the eng row, and the policy stays bound to the target name.

Verified against the default test config (table_engines_require_grant=true): without the grant the case fails with the exact TABLE ENGINE ON MergeTree ACCESS_DENIED; with it the replace succeeds and the policy still filters. Full test passes 10/10 with randomization.

groeneai · 2026-06-26T05:31:14Z

CI finish ledger — `3f8b2dd`

Every failure below has an owner: a fixing PR (ours or external), or CH Inc sync (exempt).

Check / test	Reason	Owner / fixing PR
Stateless tests (amd_tsan, s3 storage, parallel, 2/2) / ThreadSanitizer data race (STID 4071-3348) + Server died	trunk TSAN race in `QueryStatus`/`MemoryReservation` release at query finish, unrelated to this row-policy change (7 master + ~15 unrelated PRs/30d)	#108391 (merged 2026-06-25 23:49Z); the failed build predated it, so merged master @ `0f8ce6e` to pick it up
Stress test (arm_debug) / Logical error "Unexpected exception in refresh scheduling" (STID 2508-34af) + Cannot start + Check failed	chronic RefreshTask::doScheduling shutdown race (29 master + 20+ unrelated PRs/30d), unrelated to row-policy rename	#105588 (ours, open)
Mergeable Check / PR	aggregate of the above	resolves once the merged-master CI run is green

Session id: cron:our-pr-ci-monitor:20260626-050000

clickhouse-gh · 2026-06-26T05:36:11Z

+            /// Re-key row policies last. Preflight above already rejected the unrecoverable cases
+            /// (read-only storage, destination or transient-name collision); this still has its own
+            /// rollback for any residual error, and nothing after it can throw.
+            applyRowPolicyRekeys(access_control, row_policy_rekeys);


This is not safe for Replicated databases when the affected row policies live in ReplicatedAccessStorage. The table rename is applied by each replica's replicated-DDL worker, but this AccessControl update writes the row-policy name to the shared access storage as soon as the first replica reaches this line. A lagging replica can then receive rp ON new_name while its local table is still old_name; queries there lookup old_db.old_table and then old_db.*, so the still-old table is temporarily unfiltered until that replica processes the DDL.

Please either coordinate the row-policy move with the replicated database DDL so all replicas stay protected, keep both old and new names protected until every replica has processed the rename, or reject this operation for affected policies in replicated access storage / Replicated databases. A focused integration test with <user_directories><replicated> and a lagging database replica would cover the invariant.

Confirmed. ReplicatedAccessStorage propagates the re-key on its own ZooKeeper path (global, immediate), while the table rename propagates through the Replicated database DDL queue (per-replica, async); the two channels are not synchronized, so a lagging replica briefly sees rp ON new_name with its local table still at old_name.

Scoping note: on master the policy does not follow the table at all, so a renamed table in a Replicated database stays unfiltered permanently. This PR reduces that to a transient window on a lagging replica only, so it does not introduce the exposure; fully closing the window is the multi-node face of the atomicity question already open at #108478 (comment).

On the fix: a fail-closed reject has to be pre-enqueue (DatabaseReplicated::checkQueryValid, next to the existing DETACH reject). A throw at this line runs inside replicated DDL execution after the entry is committed to ZooKeeper, so it cannot cleanly reject on a lagging replica. Of the three options, coordinate and dual-protect are architectural because ReplicatedAccessStorage policies are globally shared (there is no per-replica re-key and no all-replicas-caught-up barrier today); reject is the only option needing no new infrastructure but is backward-incompatible. Folding this into the same pending design decision rather than shipping a backward-incompatible reject unilaterally on a security PR.

clickhouse-gh · 2026-06-26T08:54:14Z

LLVM Coverage Report

Metric	Baseline	Current	Δ
Lines	85.40%	85.40%	+0.00%
Functions	92.60%	92.60%	+0.00%
Branches	77.60%	77.60%	+0.00%

Changed lines: Changed C/C++ lines covered by tests: 112/152 (73.68%) | Lost baseline coverage: none · Uncovered code

Full report · Diff report

clickhouse-gh · 2026-06-28T06:48:14Z

📊 Cloud Performance Report

✅ AI verdict: no_change — no significant changes across 38 queries analysed

This PR only changes RENAME TABLE / EXCHANGE / CREATE OR REPLACE DDL handling so that row policies follow the table (plus a new stateless test); the public and private diffs are identical and touch no query-execution code. The three flagged ClickBench improvements (Q4 -15.4%, Q15 -12.2%, Q28 -6.5%) exercise read-path SELECTs that never run any of the modified code, so these consistent-looking deltas are run-to-run variance, not a real effect of this change. All three have been downgraded to not-sure. No performance impact is expected from this PR.

clickbench

⚠️ 3 inconclusive

Flagged queries (3 of 43)

_{q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on.}

tpch_adapted_1_official

🟢 No significant changes

Debug info

StressHouse run: 52302b65-c788-4a3c-a749-2247e507435f
MIRAI run: b5b40abf-4bf9-4ebf-9d41-ddeda543b17f
PR check IDs:
- clickbench_287582_1782628168
- clickbench_287596_1782628168
- clickbench_287602_1782628168
- tpch_adapted_1_official_287609_1782628168
- tpch_adapted_1_official_287620_1782628168
- tpch_adapted_1_official_287634_1782628168

nikitamikhaylov added the can be tested Allows running workflows for external contributors label Jun 25, 2026

clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Jun 25, 2026

clickhouse-gh Bot reviewed Jun 25, 2026

View reviewed changes

Comment thread src/Interpreters/InterpreterRenameQuery.cpp Outdated

clickhouse-gh Bot reviewed Jun 25, 2026

View reviewed changes

Comment thread src/Interpreters/InterpreterRenameQuery.cpp Outdated

Merge commit '474a3c12c67' into ci-mon-merge-108478

0f8ce6e

clickhouse-gh Bot reviewed Jun 26, 2026

View reviewed changes

alexey-milovidov requested review from Algunenano, pufit, thevar1able and vitlibar July 1, 2026 05:52

Sunbelt Computer Software

PL/B Language Development and Support

Uh oh!

Conversation

groeneai commented Jun 25, 2026

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Description

Notes for reviewers (cc access-control maintainers)

Uh oh!

groeneai commented Jun 25, 2026

Uh oh!

groeneai commented Jun 25, 2026

Uh oh!

clickhouse-gh Bot commented Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI Review

Summary

PR Metadata

Findings

Tests

Final Verdict

Uh oh!

clickhouse-gh Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

groeneai commented Jun 25, 2026

Uh oh!

Uh oh!

groeneai commented Jun 25, 2026

Uh oh!

groeneai commented Jun 25, 2026

Uh oh!

groeneai commented Jun 25, 2026

Uh oh!

groeneai commented Jun 25, 2026

Uh oh!

Uh oh!

groeneai commented Jun 25, 2026

Uh oh!

groeneai commented Jun 25, 2026

Uh oh!

groeneai commented Jun 26, 2026

CI finish ledger — 3f8b2dd

Uh oh!

clickhouse-gh Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

groeneai Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot commented Jun 26, 2026

LLVM Coverage Report

Uh oh!

clickhouse-gh Bot commented Jun 28, 2026

clickbench

tpch_adapted_1_official

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

clickhouse-gh Bot commented Jun 25, 2026 •

edited

Loading

CI finish ledger — `3f8b2dd`