Support Remote and RemoteSecure storage engines#106189
Support Remote and RemoteSecure storage engines#106189alexey-milovidov wants to merge 31 commits into
Conversation
Implements #93138. Previously, a table over a remote server built from an ad-hoc set of addresses could only be created via `CREATE TABLE ... AS remote(...)`, i.e. by reusing the `remote` and `remoteSecure` table functions. This adds the persistent counterparts as storage engines, so that CREATE TABLE t ENGINE = Remote('addresses', db, table[, user[, password], sharding_key]) CREATE TABLE t ENGINE = RemoteSecure(...) work as well. This restores the symmetry between the table functions and the storage engines: `Distributed` uses a configured cluster, while `Remote`/`RemoteSecure` build a cluster on the fly from the addresses, exactly like the `remote`/`remoteSecure` table functions. The complex argument parsing previously living inside `TableFunctionRemote::parseArguments` is extracted into a reusable free function `parseRemoteFunctionArguments`, which is shared between the table functions and the new engines. The engines construct a `StorageDistributed` over the resulting owned cluster, so they accept exactly the same signatures as the table functions, and the table structure may be omitted (it is then inferred from the remote table). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`registerStorageRemote` in `StorageDistributed.cpp` (part of the `dbms`
library) called `parseRemoteFunctionArguments`, which was defined in
`TableFunctionRemote.cpp` in `clickhouse_table_functions`. That library is
not linked into every target that uses `dbms` (e.g. `unit_tests_dbms`),
so the build failed with an undefined symbol:
ld.lld-21: error: undefined symbol: DB::parseRemoteFunctionArguments(...)
Move `parseRemoteFunctionArguments` and the `ParsedRemoteFunctionArguments`
struct into a new file `Storages/Distributed/parseRemoteFunctionArguments.{h,cpp}`,
which is part of `dbms`, so the symbol is available wherever `dbms` is.
`TableFunctionRemote` keeps calling it through the new header (table
functions already depend on storages).
Also add the missing local forward declaration of `registerStorageRemote`
required by `-Wmissing-prototypes`.
CI: #106189
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When a `Remote`/`RemoteSecure` table is created without an explicit
structure, the structure is inferred from the remote table. Previously the
inference was left to the `StorageDistributed` constructor, which uses the
storage's global context. For a local target such as
`Remote('127.0.0.1', db, table)` that bypassed the user-context
`SHOW_COLUMNS` access check in `getStructureOfRemoteTableInShard`, so a user
allowed to create a `Remote` table could infer the schema of a local table
they are not allowed to describe.
Infer the columns in `registerStorageRemote` under `args.getLocalContext()`
(the user's context) and pass them to `StorageDistributed`, so the access
check is enforced.
CI: #106189
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The new `Remote` and `RemoteSecure` storage engines were not present in the `engine_to_type` map used by the ArrowFlight SQL server, so `test_arrowflight_interface/test_sql_server.py::test_get_table_types` failed with `Some engine(s) in system.table_engines are not mapped`. Map them to `REMOTE TABLE`, like `Distributed`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…gines `Remote` and `RemoteSecure` accept a positional `password`, but were not registered in `FunctionSecretArgumentsFinder::findTableEngineSecretArguments`, so `SHOW CREATE TABLE` / `system.tables.engine_full` printed the password instead of `[HIDDEN]`. Reuse `findRemoteFunctionSecretArguments`, which handles both the positional and the named-collection forms, since the engines accept exactly the same arguments as the `remote`/`remoteSecure` table functions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Setting `is_remote_function_` only adjusts the distributed client info; it does
not reproduce the local-shard access checks that `TableFunctionRemote::executeImpl`
performs. A user allowed to create `ENGINE = Remote('127.0.0.1:9000', secret_db,
secret_table, 'default')` could otherwise force the remote path (e.g. with
`prefer_localhost_replica = 0`), sending the query back to this server under the
stored credentials without checking the caller's access on the local target.
Mirror the table-function check in `registerStorageRemote`: when the built
cluster contains a local shard, require the creating user to hold both `SELECT`
and `INSERT` on the local target table (a persistent table can be used for both),
under the user's context.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When a persistent `Remote`/`RemoteSecure` table is created from a named collection (`ENGINE = Remote(my_collection, ...)`), the shared parser never passed the table's `StorageID` to `tryGetNamedCollectionWithOverrides`, so the table was not registered as a dependent of the collection. As a result, `DROP NAMED COLLECTION` was not blocked while the table existed and the dependency was not rebuilt on reload, which could leave a stored `Remote` table broken after reload or next use. Thread an optional `dependent_table_id` through `parseRemoteFunctionArguments` and pass `&args.table_id` from `registerStorageRemote`. Table-function callers leave it null, since their lifetime is bound to the query. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
The failures of They were introduced by #104690 ("Add #104690 was merged in violation of the ClickHouse team rules: its own CI already showed these two tests failing (10 times between May 12 and June 1) before it was merged. Please update your branch to pick up the revert; the tests should pass again. |
For the persistent `Remote`/`RemoteSecure` engines, the local-shard access check applied `SELECT`/`INSERT` to `parsed.remote_table_id`. When the target is a table function (e.g. `numbers(...)`, `merge(...)`), that field is left at the parser default `system.one`, so the check validated the wrong object: it both spuriously required privileges on `system.one` and, with explicit columns, never analyzed the table function under the creating user's context. A query against such a persisted table could then be routed back to this server under the engine credentials, reading local data the creator was never checked for. Now, when the cluster contains a local shard and the target is a table function, the function is analyzed under the user's context even when the columns are given explicitly (`getStructureOfRemoteTableInShard` runs it through `getActualTableStructureWithAccess`, which performs the access check). The `SELECT`/`INSERT` check on `remote_table_id` is applied only for a plain table target. Added `04318_remote_storage_engine_access`, covering the persistent-engine security and lifecycle invariants: the local-shard `SELECT`/`INSERT` check with explicit columns, schema inference under the creating user's context, and the named-collection dependency that blocks `DROP NAMED COLLECTION` while the table exists. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Updated in c3e41a8 (pushed onto the existing branch, no rebase):
Built and verified locally: both |
|
The
CREATE TABLE t_remote_engine (dummy UInt8) ENGINE = Remote('127.0.0.{1,2}', system, one);
SELECT count() FROM t_remote_engine;with This is the standard I re-ran the failed job (workflow run 27483260206) — it passes when the runner skips the shard tests. @groeneai, could you investigate / track the |
|
@alexey-milovidov Confirmed: same root cause as the wider arm_darwin class. macOS binds only 127.0.0.1 on lo0 and does not auto-route the rest of 127.0.0.0/8, so 127.0.0.2:9000 is unroutable. The Scope over the last 7 days (Fast test arm_darwin, 127.0.0.2:9000 / ALL_CONNECTION_TRIES_FAILED): 15 distinct tests, 90 PRs, 165 failures, 0 on master. Fix is in flight: PR #107435 adds a macOS Fast-test pre-hook aliasing 127.0.0.2..127.0.0.16 on lo0 (the server already listens on 0.0.0.0), so these tests run instead of skip. It also un-skips 5 parallel-replica On the tracking issue: a dedicated tracker is not strictly needed since #107435 fixes the whole class at once. If you would prefer one, I can open an issue enumerating the remaining darwin.skip / shard-tag tail to un-skip incrementally after the aliases land. 04327/04336 are already skipped on master via #107376. |
`04308_remote_storage_engine` exercises the multi-shard case
`Remote('127.0.0.{1,2}', system, one)`, which connects to the second
loopback shard `127.0.0.2:9000`. On the `arm_darwin` Fast test runner,
macOS binds only `127.0.0.1` on `lo0` and does not auto-route the rest
of `127.0.0.0/8`, so `127.0.0.2` is unroutable and the query fails with
`ALL_CONNECTION_TRIES_FAILED`.
This is the same macOS shard-test infrastructure issue already handled
for the sibling tests `04327_rewrite_aggregate_function_with_if_distributed`
and `04336_parallel_blocks_marshalling_low_cardinality_native_format`,
which were added to `ci/defs/darwin.skip` in #107376. `04308` is a new
test introduced by this pull request, so it was not covered there. Add
it to `ci/defs/darwin.skip` for consistency until the runner-side fix
(#107435) aliases the additional loopback addresses and these tests can
run on macOS again.
The test itself is unchanged and keeps full coverage on Linux.
Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=106189&sha=dbc0fc28753abffe3096e1064be57a6c440263b6&name_0=PR&name_1=Fast%20test%20%28arm_darwin%29
Related: #107376
Related: #107435
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rom metadata The `Remote`/`RemoteSecure` storage-engine creator analyzes a local-shard table-function target under the user's context to validate the creator's access to the function's underlying tables, even when the columns are given explicitly. This is a `CREATE`-time check, but the creator also runs when the table is loaded from already-validated metadata on server startup (`FORCE_ATTACH`) or `RESTORE` (`FORCE_RESTORE`). In that case re-running it is unnecessary and harmful: analyzing the target can throw if its underlying tables have changed since creation (e.g. the table matched by `merge(...)` was dropped), making a valid persisted table impossible to load. Guard the table-function inference and the local-shard `SELECT`/`INSERT` checks with `isLoadingFromExistingMetadata`, so they run for fresh definitions (`CREATE` and user `ATTACH`) but are skipped when loading from existing metadata. The structure is still inferred unconditionally when it was omitted, because then it is the table's only source of columns. Add an integration test that creates a `Remote` table over a local `view(...)` target, drops the table the view reads from, restarts the server, and checks that the `Remote` table still attaches. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Every integration test directory must contain an `__init__.py`; its absence
made the `Style check` fail with:
./tests/integration/test_remote_storage_engine_attach/__init__.py should exist for every integration test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A backup `RESTORE` of a `Remote`/`RemoteSecure` table that resolves to a
local shard must run the same local-shard `SELECT`/`INSERT` access check
that a direct `CREATE` does. Previously `args.is_restore_from_backup` was
folded into the `loading_from_metadata` guard together with
`isLoadingFromExistingMetadata`, so a normal `RESTORE` (which arrives with
`SECONDARY_CREATE` and `is_restore_from_backup`) skipped the check. A user
who could restore a table could therefore smuggle in
`Remote('127.0.0.1', protected_db, protected_table, 'default')` and reach a
local target they cannot access directly, even though a direct `CREATE`
would be rejected.
Split the guard: `loading_from_existing_metadata` (server startup only)
gates the plain local-shard `SELECT`/`INSERT` check, while
`skip_table_function_analysis` (startup or restore) still skips
re-analyzing a table-function target whose underlying tables may be absent
in the restore environment. Server startup keeps skipping both checks; a
backup `RESTORE` now runs the plain check again.
Addresses the AI-review blocker on the PR.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A persistent `Remote`/`RemoteSecure` table whose target is a table function (e.g. `numbers(...)`, `view(...)`) can be read, but cannot be written to: `remote_storage` is an empty `StorageID` whenever `remote_table_function_ptr` is set, so `DistributedSink` would build an `INSERT` into an empty table id instead of a real target. Such targets are read-only by nature. Make `StorageDistributed::write` throw `NOT_IMPLEMENTED` when `remote_table_function_ptr` is set, instead of producing a malformed query. This covers both the `remote`/`cluster` table functions and the `Remote`/`RemoteSecure` storage engines with a table-function target; no existing test inserts into such a target, so reading those targets is unaffected. Addresses the AI-review blocker on the PR. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Addressed the two remaining AI-review blockers (pushed onto the branch, no rebase):
Added the two requested regressions:
Built and verified locally: |
…t queue A persistent `Remote`/`RemoteSecure` table builds an ad-hoc cluster stored in `owned_cluster`, exactly like the `remote()`/`cluster()` table functions. But `StorageDistributed::write` forced synchronous insertion whenever `owned_cluster` was set, so a persistent `Remote` table could never use the asynchronous `Distributed` queue even with `distributed_foreground_insert = 0`. As a result `fsync_after_insert`, `background_insert_*` and the validated `bytes_to_delay_insert` / `bytes_to_throw_insert` surface were silently ineffective for it. `owned_cluster` was only ever a proxy for "this is a `remote()` table function", which has no persistent storage to spool to. That condition is already captured precisely by `relative_data_path.empty()`: every table-function storage is temporary and has an empty data path, while a persistent `Remote` engine has a real data path. Key the forced-sync fallback on `relative_data_path.empty()` instead of `owned_cluster`, so the persistent engine inserts asynchronously like a regular `Distributed` table, while the table functions keep inserting synchronously (they still match `relative_data_path.empty()`). The asynchronous queue works for an owned cluster: `parseAddresses` resolves both the full and compact directory-name formats against `getCluster()` (which returns `owned_cluster`), and `createPool` honors `address.secure`, so `RemoteSecure` inserts stay secure. Addresses the AI-review blocker on the PR. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolve the conflict in `src/TableFunctions/TableFunctionRemote.cpp`: this branch
moved the argument parsing out of `TableFunctionRemote::parseArguments` into the
shared `parseRemoteFunctionArguments`, while master's #b09d93f9920 ("Fix null AST
in remote() named-collection database override") changed the now-extracted code.
Keep this branch's call site and port master's fix into
`parseRemoteFunctionArguments`, so the `database =`/`db =` named-collection branch
treats a non-table-function argument as a database name instead of assigning a null
`remote_table_function_ptr`. Without the port the merge would have silently
reverted that master bug fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@groeneai, investigate and fix This is surprising to me, didn't see it before. |
|
@alexey-milovidov signal 4 is SIGILL from the Full stack (Stress test (amd_asan_ubsan), STID 1781-3414): Root cause: This is pre-existing and unrelated to this PR: it touches none of ClientInfo/SocketAddress/TCPHandler, the line dates to Feb 2025, and the same STID fires on the unrelated #97302. Fix: in |
|
Fix PR: #108410 (validates the wire address in |
|
Merged current
The only remaining CI red on the previous commit was the chronic, master-wide stress hung-check ( The AI-review |
|
CI re-ran on the same commit (
So the PR is still green apart from an unrelated flake. The sole substantive blocker remains the AI-review |
|
CI has fully settled on the current commit ( The sole remaining item is the AI-review Ready to merge modulo that decision. |
…ESTORE
A backup `RESTORE` of a `Remote`/`RemoteSecure` table with a table-function
target that resolves to a local shard (e.g.
`Remote('127.0.0.1', merge(db, '^protected$'), 'default')`) previously skipped
the table-function analysis unconditionally, because `args.is_restore_from_backup`
was folded into `skip_table_function_analysis`. A restricted user who may restore
could therefore restore such a table with explicit columns and reach a local
target they cannot access directly, even though a direct `CREATE` runs the
analysis under the user's context (via `getStructureOfRemoteTable` ->
`getActualTableStructureWithAccess`) and would be rejected. The plain-target
`SELECT`/`INSERT` check was already enforced on restore; this closes the same
hole for table-function targets.
Run the analysis on restore as well, but fail closed only on an access-control
error (`ACCESS_DENIED`): the target's underlying tables may legitimately be
absent in the restore environment (e.g. the table matched by `merge(...)` was
dropped since the backup was taken), and a valid persisted table must still be
restorable in that case (`04341_remote_storage_engine_restore`). An access
failure is the exact case a direct `CREATE` rejects and the only one that could
leak a local target; any other failure means the target cannot be analyzed (and
therefore cannot be read either), so the restore proceeds with the columns
carried in the backup metadata.
Add `04489_remote_storage_engine_restore_tf_access` covering both: a restricted
user's restore of `Remote(..., merge('^protected$'), ...)` is rejected without
target access and allowed with it, while a restore over a genuinely absent
target still succeeds.
Addresses the AI-review blocker on the PR.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Addressed the remaining AI-review blocker (pushed onto the branch, no rebase):
The analysis now runs on restore too, failing closed only on Regression added: Verified locally against a Release build ( The inline AI-review thread on |
|
To be explicit about the earlier "that trade-off is your design call" framing: this commit is offered as a concrete proposal, not an override — if you prefer to keep the previous unconditional skip on restore, The key point is that it does not reintroduce the un-restorable-valid-table regression that motivated the skip: the restore still tolerates a genuinely absent table-function target (any non- |
| /// When the structure was carried in the definition, the analysis runs purely for its access | ||
| /// side effect, so on restore a non-access failure (absent target) may be tolerated. When the | ||
| /// structure was omitted, the analysis is the only source of columns and must always succeed. | ||
| const bool tolerate_absent_target = !columns.empty() && args.is_restore_from_backup; |
There was a problem hiding this comment.
When this RESTORE path hits any local table-function-target failure other than ACCESS_DENIED, it silently trusts the persisted columns and keeps the definition. That is not fail-closed for a fresh user-driven restore: if Remote('127.0.0.1', merge(db, '^protected$'), 'default') is restored while db.protected is temporarily absent, the restoring user never proves access to that local target, yet the stored engine credentials will start reading it as soon as a matching table is recreated later. Please make the default RESTORE path reject local table-function targets whose access cannot be established, or gate this bypass behind an explicitly trusted/internal restore mode instead of swallowing every non-access exception here.
|
Recommendation: keep The two branches fail closed differently, and that is the crux:
That is why 04341 (valid table, target legitimately gone, must restore) and the bot's "absent-now, present-later" vector are indistinguishable at restore time from the definition alone: the only discriminator is a future event plus the restoring user's trust level, neither knowable when the analysis runs. On the bot's vector: it is real but lower severity than the one you closed. It needs (1) the restoring user to already hold a backup with privileged engine credentials baked in, which only someone who already had access to the target could have produced and handed over, (2) the target absent at restore, and (3) the target reappearing later. That is the same trust surface as "Remote persists credentials and reads under them," not a new escalation primitive. The commit closes the immediate, no-future-event case (present target, ACCESS_DENIED fatal), which is the higher-value fix. If you want the clickhouse-gh thread cleared rather than documented as an accepted structural limit, the only principled closure is the bot's second option: gate the non-ACCESS_DENIED swallow behind a trusted/internal restore path, so a user-driven restore fails closed on an unresolved local table-function target while a system/admin restore keeps 04341's tolerance (04341 restores as the default user, so it stays green; 04489's absent-target case would move to restricted-user-rejected). The bot's first option (reject all unresolved targets on the default path) would break 04341. That gate is a separable change; I would not fold it into this PR unless you want it here, since the current commit is a strict improvement on its own. Net: keep it. Either document the table-function residual as a known restore limitation, or follow up with the trusted-restore gate if the block must clear. |
LLVM Coverage Report
Changed lines: Changed C/C++ lines covered: 292/297 (98.32%) · Uncovered code |
|
The sole remaining CI red on
This is a chronic, master-wide flake, not on the Its CI auto-tracking issue #105306 was closed on 2026-05-19, but the test has since regressed (9 failures in the last 30 days), so the earlier fix no longer holds. @groeneai, please investigate |
|
@alexey-milovidov thanks, agreed this is unrelated to the Remote/RemoteSecure work here. I investigated Reproducer (compact part, Important finding: #104822 in its current form does not fully close the Fix direction: capture the merge's metadata and mutations snapshots as a consistent pair (e.g. under |
|
Fix pushed. Root cause of the residual: a merge applies the pending rename to its source parts on-fly, but the merged part kept the source mutation version, so the rename mutation ran again on the already-renamed compact part, re-read the column by its old name (gone), and wrote it empty. The fix stamps a merged part with the metadata-mutation version it applies on-fly, so the rename is not re-scheduled. 0 data loss over 2366 iterations of the reproducer (was ~8% on master); details on #104822. |

Implements #93138.
Previously, a table over a remote server built from an ad-hoc set of addresses could only be created via
CREATE TABLE ... AS remote(...), reusing theremote/remoteSecuretable functions. This PR adds the persistent counterparts as storage engines:This restores the symmetry between the table functions and the storage engines:
Distributeduses a configured cluster, whileRemote/RemoteSecurebuild a cluster on the fly from the supplied addresses, exactly like theremote/remoteSecuretable functions. The table structure may be omitted, in which case it is inferred from the remote table.The complex argument parsing previously living inside
TableFunctionRemote::parseArgumentsis extracted into a reusable free functionparseRemoteFunctionArguments, shared between the table functions and the new engines, so they accept exactly the same signatures with no duplicated logic. The engines construct aStorageDistributedover the resulting owned cluster.Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Added the
RemoteandRemoteSecuretable engines, the persistent counterparts of theremoteandremoteSecuretable functions.CREATE TABLE ... ENGINE = Remote('addresses', db, table, ...)now works in addition toCREATE TABLE ... AS remote(...).Documentation entry for user-facing changes
🤖 Generated with Claude Code