Feature: Enable overlay databases for server. by AlyHKafoury · Pull Request #86768 · ClickHouse/ClickHouse · GitHub
Skip to content

Feature: Enable overlay databases for server.#86768

Open
AlyHKafoury wants to merge 89 commits into
ClickHouse:masterfrom
AlyHKafoury:allow-creating-overlay-databases
Open

Feature: Enable overlay databases for server.#86768
AlyHKafoury wants to merge 89 commits into
ClickHouse:masterfrom
AlyHKafoury:allow-creating-overlay-databases

Conversation

@AlyHKafoury

@AlyHKafoury AlyHKafoury commented Sep 5, 2025

Copy link
Copy Markdown
Contributor

Enables server-side Overlay databases. An Overlay database is a read-only facade that exposes the union of the tables of several underlying databases, resolving each table name through the listed sources in order (the first source that has the table wins). DDL on the facade is rejected — it has no storage of its own — while SELECT and pass-through INSERT resolve to the underlying source table. Reading or writing through the facade requires the corresponding grant on both the facade database and the underlying source, and the facade's row policies are combined with the source's. Previously an Overlay database existed only as the implicit default database of clickhouse-local.

Closes: #52764

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Allow creating Overlay databases.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@azat azat added the can be tested Allows running workflows for external contributors label Sep 5, 2025
@azat azat self-assigned this Sep 5, 2025
@clickhouse-gh

clickhouse-gh Bot commented Sep 5, 2025

Copy link
Copy Markdown
Contributor

@AlyHKafoury AlyHKafoury marked this pull request as ready for review September 7, 2025 13:04
@clickhouse-gh clickhouse-gh Bot added the pr-feature Pull request with new product feature label Sep 7, 2025
@AlyHKafoury

Copy link
Copy Markdown
Contributor Author

@azat I believe this is ready for initial review, I don't know why the rest of the CI is not running

@ClickHouse ClickHouse deleted a comment Sep 8, 2025
@alexey-milovidov

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Comment thread docs/en/engines/database-engines/overlay.md Outdated
@AlyHKafoury

Copy link
Copy Markdown
Contributor Author

@alexey-milovidov this test is failing with random error messages that doesn't happen in other runs or locally can you give me any pointers on why this might be failing ? https://github.com/ClickHouse/ClickHouse/actions/runs/17553376388/job/49853979804?pr=86768

@AlyHKafoury

Copy link
Copy Markdown
Contributor Author

@azat @alexey-milovidov I believe this is ready for review

@alexey-milovidov

Copy link
Copy Markdown
Member

@alexey-milovidov

Copy link
Copy Markdown
Member

Pushed three commits addressing the remaining review items:

  • 6949b05 — use TABLE_IS_PERMANENTLY_READ_ONLY for every read-only rejection through the facade. CREATE/ATTACH/RENAME/DROP/DETACH TABLE and OPTIMIZE TABLE previously threw the generic BAD_ARGUMENTS (in DatabaseOverlay, InterpreterDropQuery, InterpreterOptimizeQuery), while ALTER/DELETE/UPDATE/SYSTEM/TRUNCATE already used TABLE_IS_PERMANENTLY_READ_ONLY; they are now consistent. BAD_ARGUMENTS is kept only for genuine argument validation (self-reference, reference cycle, missing source at CREATE). (addresses the TABLE_IS_PERMANENTLY_READ_ONLY request)
  • b941adb — close the database-scoped / whole-server SYSTEM bypass: SYSTEM DROP REPLICA ... FROM DATABASE db_overlay is now rejected, and whole-server handlers skip read-only Overlay facades while iterating databases.
  • 398030a — rewrite Overlay source database names during backup/restore renaming, so RESTORE DATABASE db_a AS db_a2, DATABASE ov AS ov2 makes ov2 point at db_a2; databases are now created in SECONDARY_CREATE mode during RESTORE (like restored tables) so the facade can be created before its renamed sources exist.

All five 03611_overlay_* stateless tests pass locally against a release build (including the new 03611_overlay_database_restore_rename).

alexey-milovidov and others added 2 commits June 16, 2026 22:44
The previous commit set `interpreter.setIsRestoreFromBackup(true)` for every
database created by `RestorerFromBackup::createDatabase`. That was meant only to
let an `Overlay` facade be restored before its source databases exist (lazy
resolution skips the creation-time source-existence check), but applying it to
all databases also changed the loading-strictness level to `SECONDARY_CREATE`
and flipped the `distributed_backup_restore` flag for `Replicated` databases,
breaking their `RESTORE ON CLUSTER` (consistent `NO_ZOOKEEPER` failures across
`test_backup_restore_on_cluster`, `test_backup_restore_on_cluster_with_checksum_data_file_name`
and `test_refreshable_mv`).

Restrict the restore-from-backup mode to the `Overlay` engine, resolving the
engine name through `DatabaseFactory::resolveCanonicalEngineName` so it matches
regardless of casing (the parser lowercases `Overlay` to `overlay` because it
collides with the `OVERLAY` string function). Every other database engine
reverts to the previous master behavior.

CI: ClickHouse#86768

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A server-side `Overlay` facade is itself a local database, so it stays in
`DatabaseCatalog`'s `databases_without_remote` set. Bulk enumerators that must
avoid implicit calls to a remote service — `system.tables`, `system.columns`
and the asynchronous metrics — fetch `getDatabases({.with_remote_databases = false})`,
which returns that set, and then iterate each database's tables. Because
`DatabaseOverlay::getTablesIterator` walks every underlying source
unconditionally, an `Overlay('mysql_db', 'local_db')` would let those enumerators
reach `DatabaseMySQL::getTablesIterator` and query `INFORMATION_SCHEMA` even when
`show_remote_databases_in_system_tables` is disabled — exactly the implicit
remote call the setting suppresses.

Reject a remote source (`MySQL`/`PostgreSQL`/`DataLake`, i.e. any database whose
`isRemoteDatabase` returns true) at `Overlay` CREATE time. This keeps the
contract simple: an `Overlay` over a remote database is never persisted, so the
lazy `ATTACH` path on startup never has to deal with one.

Add `04338_overlay_database_rejects_remote_source` covering the rejection
(regardless of the source's position) and that an `Overlay` over only local
databases is still accepted.

CI: ClickHouse#86768

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Databases/DatabaseOverlay.cpp
alexey-milovidov and others added 3 commits June 16, 2026 23:28
Supersedes the previous commit's CREATE-time rejection of remote sources.
Instead of forbidding a remote source, `DatabaseOverlay::isRemoteDatabase`
now returns the `readonly` flag: a server-side (read-only) facade is reported
as remote, while the `clickhouse-local` (non-read-only) `Overlay` stays
non-remote.

This excludes the server-side facade from the default catalog enumeration
`getDatabases({.with_remote_databases = false})` used by `system.tables`,
`system.columns` and the asynchronous metrics, so an `Overlay` over a remote
source (`MySQL`/`PostgreSQL`/`DataLake`) no longer lets those consumers reach
the remote service and issue implicit calls (e.g. query `INFORMATION_SCHEMA`)
when `show_remote_databases_in_system_tables` is disabled. Explicit
`SHOW TABLES`/`SHOW COLUMNS`/`SHOW INDEXES` on the facade still list its tables
(they override the setting for remote databases), `system.databases` still
lists it, and direct queries through the facade are unaffected.

Answering from the constant `readonly` flag — rather than by resolving the
sources — sidesteps two problems a source-resolving implementation would have:
it does not depend on the order in which databases are loaded at startup, and
it touches no other database, so it is safe to call from
`DatabaseCatalog::attachDatabase`, which evaluates `isRemoteDatabase` while
holding `databases_mutex` (resolving sources there would re-enter the same
non-recursive mutex and deadlock).

Replace test `04338` with one that asserts the new contract: a read-only
`Overlay` (including one over a remote source) is hidden from `system.tables`
by default but visible with `show_remote_databases_in_system_tables = 1`,
`SHOW TABLES` and direct queries still work, and the `clickhouse-local`
`Overlay` keeps showing its tables in `system.tables`.

CI: ClickHouse#86768

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Databases/DatabaseOverlay.cpp Outdated
alexey-milovidov and others added 4 commits June 29, 2026 12:24
A read-only `Overlay` facade flattened a nested read-only `Overlay` source into its
leaf databases, dropping the middle facade from every runtime check. With
`top = Overlay('mid')` and `mid = Overlay('src')`, `SELECT ... FROM top.t` resolved
the storage to `src.t`, so the access and row-policy code -- which only sees the
written id (`top.t`) and the resolved storage id (`src.t`) -- never required the
grants or applied the row policies defined on `mid.t`. A user granted on `top.*` and
`src.*` but not `mid.*` could read through `top`, and the same shape applied to
synchronous and asynchronous `INSERT`.

Reject a read-only `Overlay` source instead of flattening it, both at CREATE time
(an immediate, friendly error) and at lazy resolution (the safety net that also
covers reference cycles formed after creation by dropping and re-creating a source).
This makes the nested-facade bypass impossible rather than relying on the analyzers
to re-derive and re-check the full resolution chain.

Reported by the AI review on ClickHouse#86768

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Its only user was the removed `resolveDatabasesImpl` declaration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

I pushed a few changes on top of this PR:

Fixed the access/row-policy bypass for nested Overlay databases (the AI-review blocker, ac450bbc4d6/32bcb6015c9). A read-only Overlay previously flattened a nested read-only Overlay source into its leaf databases, dropping the middle facade from every runtime check — so with top = Overlay('mid') and mid = Overlay('src'), reading top.t only required grants on top.*/src.* and skipped the grants and row policies on mid.*. I took the reject option (rather than carrying the full chain): a read-only Overlay now refuses another read-only Overlay as a source, both at CREATE time and at lazy resolution in resolveDatabases (the safety net that also subsumes the previous reference-cycle detection). With nested facades impossible, the existing facade+source checks for SELECT, row policies and synchronous/asynchronous INSERT are exhaustive by construction.

  • Test 03611_overlay_database_nested_rejected.sql asserts the rejection and verifies a single-level facade still works under both analyzers; the cycle case in 03611_overlay_database_server_side.sql is now rejected at CREATE time.
  • Fixed the changelog entry (dropped the issue-closing text) and moved Closes: to its own line, per the AI metadata note; added a short description.
  • Merged current master.

CI: the only red was Integration tests (... db disk ...) / test_storage_delta_disks::test_single_log_file — all 8 sub-cases failed because node3 timed out starting (container running, no exception). That node's config does not use Overlay, the test is unrelated (Delta Lake), and the test is green on master (the only master blip, on 2026-06-29, was a different Spark-session failure). It is an infra flake from the 2026-06-22 run; the fresh push re-runs it against current master.

Left for your call (your open threads, not mechanical fixes): creating a table through the facade in the first underlying database (the author noted a UUID-collision limitation), and the empty()-returns-true rationale. I did not change those — they are design decisions.

Comment thread src/Databases/DatabaseOverlay.cpp
alexey-milovidov and others added 2 commits June 30, 2026 07:37
The test creates databases with fixed global names (`ov_nested_src`,
`ov_nested_mid`, `ov_nested_top`), so it cannot run concurrently with
copies of itself. The `Stateless tests (flaky check)` jobs run a test
many times in parallel, and a concurrent run dropped a source database
out from under another run between `CREATE DATABASE ov_nested_src` and
the facade `CREATE DATABASE ... ENGINE = Overlay('ov_nested_src')`,
producing `Overlay database requires existing underlying database
'ov_nested_src'` (and a later `Table ov_nested_mid.t does not exist`).

The regular stateless runs passed; only the parallel flaky check failed.
The sibling `03611_overlay_database_server_side.sql` and
`03611_overlay_database_restore_rename.sql` tests are already
`no-parallel` for the same reason.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ce check

`DatabaseOverlay::createTable`/`attachTable` already throw
`TABLE_IS_PERMANENTLY_READ_ONLY` for a read-only facade, but for a table
name that already resolves through the facade to a source table the
generic existence check in `InterpreterCreateQuery::doCreateTable`
short-circuits first: `database->isTableExist` returns true, so
`CREATE TABLE IF NOT EXISTS db_overlay.t` returned success and a plain
`CREATE TABLE db_overlay.t` reported `TABLE_ALREADY_EXISTS`, neither of
them ever reaching `DatabaseOverlay::createTable`. That violated the
documented read-only facade contract and let a user with only a
facade-scoped `CREATE TABLE` grant probe the existence of source tables.

Reject creation/attach on a read-only `Overlay` up front, before the
existence check, for both `CREATE` and `ATTACH` and regardless of
`IF NOT EXISTS`, mirroring the guards already used in
`InterpreterDropQuery`/`InterpreterAlterQuery`/etc.

Test `03611_overlay_database_create_existing_source_rejected` covers the
existing-source-table name (plain, `IF NOT EXISTS`, and `ATTACH`) as well
as a brand-new name, and verifies the source table is untouched.

CI: ClickHouse#86768

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Pushed ac450bbc4d6..6fc6b8e0021 (two commits):

AI-review blocker — read-only CREATE/ATTACH through the facade (6fc6b8e0021)

The clickhouse-gh "Request changes" finding was correct: although DatabaseOverlay::createTable/attachTable reject a read-only facade, for a table name that already resolves through the facade the generic existence check in InterpreterCreateQuery::doCreateTable (database->isTableExist) short-circuited first — so CREATE TABLE IF NOT EXISTS db_overlay.t returned success and a plain CREATE TABLE db_overlay.t reported TABLE_ALREADY_EXISTS, neither reaching the facade guard. That broke the documented read-only contract and let a facade-scoped CREATE TABLE grant probe source-table existence.

Now rejected up front, before the existence check, for both CREATE and ATTACH and regardless of IF NOT EXISTS, with TABLE_IS_PERMANENTLY_READ_ONLY — mirroring the guards already used in InterpreterDropQuery/InterpreterAlterQuery/etc. New test 03611_overlay_database_create_existing_source_rejected covers the existing-source-table name (plain / IF NOT EXISTS / ATTACH) and a brand-new name, and verifies the source table is untouched. The clickhouse-gh thread is resolved.

CI fix — 03611_overlay_database_nested_rejected flaky-check failures (8b67d5ceac1)

The only red stateless failures were this test, and only in the flaky check jobs (which run a test many times in parallel); the regular stateless runs were green. The test creates databases with fixed global names (ov_nested_src etc.), so a concurrent copy dropped a source database out from under another run between CREATE DATABASE ov_nested_src and the facade CREATE (Overlay database requires existing underlying database 'ov_nested_src'). Marked no-parallel, like the sibling 03611_overlay_database_server_side.sql / _restore_rename.sql.

Unrelated

Upgrade check (amd_release) failed on a benign background-mutation error (MutatePlainMergeTreeTask … FUNCTION_THROW_IF_VALUE_IS_NON_ZERO) that has nothing to do with Overlay databases — it is the class being allow-listed by the open #108651.

Verification: the changed InterpreterCreateQuery.cpp compiles cleanly under -Werror -Weverything; the new test reference is deterministic (the rejected DDL produces no output, the two SELECT count() return 2). Left untouched: your open design threads (creating a table in the first underlying database; the empty()-returns-true rationale).

@alexey-milovidov

Copy link
Copy Markdown
Member

Both red checks on the current head (6fc6b8e0021) are unrelated to this PR — they post-date the last push, so noting them here:

  • AST fuzzer (amd_debug)Logical error: Block structure mismatch in A stream: different columns (STID 0993-27f0), thrown in UnionStep::updateOutputHeadercheckHeaders during query-plan optimization of a fuzzed query. This is the known flake tracked by Logical error: Block structure mismatch in A stream: different columns: (STID: 0993-27f0) #108142 (127 failures across 103 distinct PRs in the last 14 days, last seen today 09:01Z), and it is auto-linked in the CI summary above. This PR touches no UnionStep/Block/header/optimization code, so it cannot be the cause.

  • CH Inc sync ("Failed. Needs manual intervention.") — not a conflict/build/test failure on the sync branch. The private IncrementalSync batch job aborted with github.GithubException.BadCredentialsException: 401 Bad credentials while iterating an unrelated PR (#106222 add-pcodec-codec), i.e. the robot's GitHub token went invalid mid-run. As a result the sync PR (clickhouse-private#59464) is simply stale (still at the 2026-06-13 head ea09edbff52) rather than broken — mergeable is fine. It just needs the IncrementalSync job to re-run once the credentials refresh; there is nothing to fix on this PR's side.

Everything else is green or still in progress: all builds (incl. arm_tidy), Style check, Fast test, Code Review, and Docs check are SUCCESS, and the AI review verdict on this head is ✅ Approve.

@alexey-milovidov

Copy link
Copy Markdown
Member

Merged fresh master into the branch (6fc6b8e0021..ed342205b4a, clean — no conflicts, Cargo.lock unchanged, PR diff intact at 45 files / +1773/-92).

Rationale: on the previous head the whole public CI was green (AI Review ✅ Approve, @azat approved, and the chronic AST fuzzer (amd_debug) flake #108142 had cleared) — the only remaining red was CH Inc sync. That was a stale status: the private sync PR clickhouse-private#59464 was stuck at the old 2026-06-13 head ea09edbff52 because the IncrementalSync batch job (job 84248802280) crashed on 2026-06-30 with a credentials error before it could advance this PR. Its CI failures there are from that stale snapshot, which predates the 06-30 security-hardening fixes (reject nested read-only Overlay sources; reject CREATE/ATTACH TABLE before the existence check), so they were not meaningful for the current code. The IncrementalSync infra has since recovered (all runs on 07-02 are green), but it does not re-process an unchanged public head — so a new commit is needed to trigger a fresh re-sync onto the current code. This re-merge provides that.

A fresh public CI run and a fresh private re-sync are now in flight on ed342205b4a.

The PR is otherwise merge-ready; it remains blocked solely on the standing 2025-09-08 CHANGES_REQUESTED review being cleared. The five open threads are the maintainer's own design discussions (create-table-in-first-underlying-DB, empty() semantics, two outdated overlay.md notes) — not mechanical fixes.

Comment thread tests/queries/0_stateless/03611_overlay_database_server_side.sql
Address the `clickhouse-gh` AI review "Request changes": the new server-side
`Overlay` coverage never created the same table name in two sources or repeated
a source in `Overlay(...)`, so the "first listed source wins" and
"duplicate source names are collapsed, keeping the first occurrence" contract
was documented and implemented but not pinned by a test. An earlier regression
on this branch (sources resolved in alphabetical order instead of the listed
order) would have left the existing suite green.

New `03611_overlay_database_lookup_precedence` creates the same table `t` in two
sources and verifies that:
  * `db_overlay_precedence.t` reads from the source listed first (the second
    source's `t` is shadowed, not unioned);
  * reversing the source order flips which source wins;
  * `Overlay('db_src_first', 'db_src_second', 'db_src_first')` collapses the
    duplicate, keeping the first occurrence, as shown by `SHOW CREATE DATABASE`
    and by `t` still resolving to `db_src_first`.

Marked `no-parallel` like the sibling `03611_overlay_database_*` tests, since it
creates databases with fixed global names.
@alexey-milovidov

Copy link
Copy Markdown
Member

Pushed ed342205b4a..934ca005a1c (one commit) addressing the clickhouse-gh AI-review "Request changes".

AI-review blocker — lookup-precedence contract had no test (934ca005a1c)

The finding was correct: the server-side coverage in 03611_overlay_database_server_side.sql never created the same table name in two sources or repeated a source in Overlay(...), so the "first listed source wins" and "duplicate source names are collapsed, keeping the first occurrence" rules were documented (docs/en/engines/database-engines/overlay.md) and implemented (registerDatabaseOverlay dedup + DatabaseOverlay::tryGetTable first-match) but never pinned by a test. The earlier alphabetical-order regression on this branch would have left the suite green.

New 03611_overlay_database_lookup_precedence creates the same table t in two sources and verifies:

  • db_overlay_precedence.t reads from the source listed first (the second source's t is shadowed, not unioned);
  • reversing the source order flips which source wins;
  • Overlay('db_src_first', 'db_src_second', 'db_src_first') collapses the duplicate while preserving the first occurrence — checked via SHOW CREATE DATABASE (renders the deduplicated list) and by t still resolving to db_src_first.

Marked no-parallel like the sibling 03611_overlay_database_* tests (fixed global database names). The clickhouse-gh thread is resolved.

CI (on the previous head ed342205)

Left untouched (your open design threads, not mechanical fixes): creating a table through the facade in the first underlying database (the UUID-collision limitation), and the empty()-returns-true rationale.


## Access control {#access-control}

Accessing a table through an `Overlay` database requires a grant on **both** the `Overlay`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This access-control contract is still wider than the implementation. The new dual-grant checks only cover SELECT and INSERT; metadata lookups through the facade still require only facade-side SHOW_* privileges.

For example, SHOW CREATE TABLE ov.t resolves ov.t, checks SHOW_COLUMNS on that facade name, and then asks DatabaseOverlay::getCreateTableQueryImpl for the definition, which walks the source databases and returns the underlying table metadata (src/Interpreters/Context.cpp:7596-7600, src/Interpreters/InterpreterShowCreateQuery.cpp:65-74, src/Databases/DatabaseOverlay.cpp:265-281). DESCRIBE ov.t, SHOW COLUMNS FROM ov.t, and EXISTS TABLE ov.t have the same shape (src/Interpreters/InterpreterDescribeQuery.cpp:181-190, src/Interpreters/InterpreterShowColumnsQuery.cpp:176-183 together with src/Storages/System/StorageSystemColumns.cpp:185-195, and src/Interpreters/InterpreterExistsQuery.cpp:54-75): they resolve or inspect the source table without ever re-checking the source database.

So a user with SHOW_COLUMNS / SHOW_TABLES on the Overlay but no privileges on the source can still enumerate source-table existence and schema through the facade. Please either enforce the source-side SHOW_* checks on these metadata paths or narrow the documented contract and tests to SELECT / INSERT only.

@clickhouse-gh

clickhouse-gh Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.50% 85.50% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.70% 77.70% +0.00%

Changed lines: Changed C/C++ lines covered: 532/645 (82.48%) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow creating Overlay databases

3 participants