Hide MySQL and PostgreSQL databases from system.tables by default by pamarcos · Pull Request #104416 · ClickHouse/ClickHouse · GitHub
Hide MySQL and PostgreSQL databases from system.tables by default #104416

Merged: pamarcos merged 29 commits into ClickHouse:master from pamarcos:rename-show-external-databases-setting on May 28, 2026
pamarcos (Member) commented May 8, 2026

Avoid implicit remote-table enumeration for remote database engines in system.tables, system.columns, and system.completions.

This prevents slow or unavailable MySQL/PostgreSQL upstreams from blocking startup dictionaries, backup metadata queries, and interactive system-table reads. system.databases still shows database names unconditionally, and SHOW TABLES FROM <db> still opts in for the requested remote database.

The old show_data_lake_catalogs_in_system_tables setting is kept as an alias. The new setting is show_remote_databases_in_system_tables.
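
The alias relationship can be pictured with a small C++ sketch. `SettingsSketch` and its `resolve` helper are hypothetical names used only for illustration and are not ClickHouse's settings implementation; only the two setting names and the default of 0 come from this PR.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical, simplified settings store (not the real ClickHouse Settings class).
class SettingsSketch
{
public:
    // Writes through either name land on the same canonical entry.
    void set(const std::string & name, bool value) { values[resolve(name)] = value; }

    // An unset value reads as false, mirroring the default of 0 described in this PR.
    bool get(const std::string & name) const
    {
        auto it = values.find(resolve(name));
        return it != values.end() && it->second;
    }

private:
    // The old setting name is kept as an alias of the new one.
    static std::string resolve(const std::string & name)
    {
        if (name == "show_data_lake_catalogs_in_system_tables")
            return "show_remote_databases_in_system_tables";
        return name;
    }

    std::unordered_map<std::string, bool> values;
};
```

Under this model, a configuration that still sets the old name keeps working and is indistinguishable from one that sets the new name.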

Closes https://github.com/ClickHouse/clickhouse-private/issues/53621

Changelog category (leave one):

  • Backward Incompatible Change

Changelog entry:

The setting show_data_lake_catalogs_in_system_tables has been renamed to show_remote_databases_in_system_tables and broadened: when its value is 0 (the default), MySQL and PostgreSQL databases are also hidden from system.tables, system.columns, and system.completions, in addition to data lake catalogs. The old setting name is kept as an alias.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Note

Medium Risk
Changes default visibility of MySQL/PostgreSQL/DataLakeCatalog databases in system.tables, system.columns, and system.completions, which can affect tooling and queries that previously relied on implicit enumeration. Behavior is gated by a renamed setting with an alias, reducing but not eliminating upgrade risk.

Overview
Prevents implicit enumeration of remote database engines by excluding data lake catalogs, MySQL, and PostgreSQL databases from system.tables, system.columns, and system.completions unless the new show_remote_databases_in_system_tables setting is enabled.

Renames and broadens show_data_lake_catalogs_in_system_tables to show_remote_databases_in_system_tables (old name kept as an alias), and introduces IDatabase::isRemoteDatabase() with implementations for DataLake/MySQL/PostgreSQL plus DatabaseCatalog filtering support.

Ensures explicit SHOW TABLES/SHOW COLUMNS/SHOW INDEX against a remote database still works by forcing the setting on for those queries, and updates backups and tests to use the new setting and verify the new default behavior.
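
The classification and filtering described in this overview can be sketched in self-contained C++. Only the method name `isRemoteDatabase` comes from the PR; `IDatabaseSketch`, `AtomicSketch`, `PostgreSQLSketch`, and `databasesToEnumerate` are illustrative stand-ins, and the real `IDatabase` and `DatabaseCatalog` interfaces are considerably larger.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for IDatabase: engines are local unless they say otherwise.
struct IDatabaseSketch
{
    explicit IDatabaseSketch(std::string name_) : name(std::move(name_)) {}
    virtual ~IDatabaseSketch() = default;

    // Default is false, so engines that are unaware of the classification stay visible.
    virtual bool isRemoteDatabase() const { return false; }

    std::string name;
};

// A local engine keeps the default.
struct AtomicSketch : IDatabaseSketch
{
    using IDatabaseSketch::IDatabaseSketch;
};

// A remote engine overrides the method; listing its tables would need a network call.
struct PostgreSQLSketch : IDatabaseSketch
{
    using IDatabaseSketch::IDatabaseSketch;
    bool isRemoteDatabase() const override { return true; }
};

using Databases = std::vector<std::shared_ptr<IDatabaseSketch>>;

// Returns the names of databases whose tables a system table would enumerate.
// show_remote_databases models show_remote_databases_in_system_tables.
std::vector<std::string> databasesToEnumerate(const Databases & all, bool show_remote_databases)
{
    std::vector<std::string> result;
    for (const auto & db : all)
        if (show_remote_databases || !db->isRemoteDatabase())
            result.push_back(db->name);
    return result;
}
```

In this model, an explicit `SHOW TABLES FROM <db>` against a remote database corresponds to calling `databasesToEnumerate` with `show_remote_databases` forced to `true` for that one query, while a plain read of system.tables passes the setting's current value.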

Reviewed by Cursor Bugbot for commit 980b830. Bugbot is set up for automated code reviews on this repo.

Version info

  • Merged into: 26.6.1.212
  • Backported to: 26.5.2.16, 26.4.4.24, 26.3.13.22

pamarcos added 2 commits May 8, 2026 18:39
…bases_in_system_tables

Broaden the setting to also hide MySQL and PostgreSQL databases from
system.tables / system.columns / system.completions when disabled, since
they too require remote calls to enumerate tables. The old name is
preserved as an alias.

Internal renames: GetDatabasesOptions::with_datalake_catalogs ->
with_external_databases; DatabaseCatalog::isDatalakeCatalog ->
isExternalDatabase; hasDatalakeCatalogs -> hasExternalDatabases;
databases_without_datalake_catalogs -> databases_without_external.

Each assertion in the test now prints a short marker describing what
it checks, so a failure in the reference diff is self-explanatory.
@pamarcos pamarcos requested a review from Copilot May 8, 2026 18:54
Comment thread src/Interpreters/DatabaseCatalog.h Outdated

Copilot AI (Contributor) left a comment

Pull request overview

This PR changes how ClickHouse enumerates databases for system.tables, system.columns, and system.completions to avoid expensive per-table round trips when external database engines (PostgreSQL/MySQL and data lake catalogs) are present but slow/unavailable. It introduces a new setting name with an alias to preserve backwards compatibility while broadening the default filtering behavior.

Changes:

  • Renames show_data_lake_catalogs_in_system_tables to show_external_databases_in_system_tables (keeps the old name as an alias) and extends filtering to PostgreSQL/MySQL database engines.
  • Updates DatabaseCatalog and multiple system-table implementations to use GetDatabasesOptions{.with_external_databases = ...} consistently.
  • Adds a stateless regression test ensuring external DBs are hidden from system.tables/system.columns by default and that the alias is visible in system.settings.

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/queries/0_stateless/04210_show_external_databases_in_system_tables.sh | New stateless test for default hiding + setting alias behavior. |
| tests/queries/0_stateless/04210_show_external_databases_in_system_tables.reference | Expected output for the new stateless test. |
| tests/queries/0_stateless/03913_datalake_restful_catalog_bad_format.sh | Updates test to use the new setting name. |
| src/Storages/System/StorageSystemTables.cpp | Switches filtering to the new show_external_databases_in_system_tables setting. |
| src/Storages/System/StorageSystemReplicationQueue.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemReplicas.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemProjections.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemPartsBase.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemPartMovesBetweenShards.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemObjectStorageQueueSettings.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemMutations.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemKafkaConsumers.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemIcebergHistory.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemGraphite.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemDistributionQueue.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemDataSkippingIndices.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemDatabases.cpp | Ensures system.databases continues to include external DBs unconditionally. |
| src/Storages/System/StorageSystemDatabaseReplicas.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/System/StorageSystemCompletions.cpp | Uses the new setting to filter external DBs from completions by default. |
| src/Storages/System/StorageSystemColumns.cpp | Switches filtering to the new show_external_databases_in_system_tables setting. |
| src/Storages/System/StorageSystemClusters.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/StorageMerge.cpp | Updates database enumeration option name (with_external_databases). |
| src/Storages/RocksDB/StorageSystemRocksDB.cpp | Updates database enumeration option name (with_external_databases). |
| src/Server/ReplicasStatusHandler.cpp | Updates database enumeration option name (with_external_databases). |
| src/Interpreters/ServerAsynchronousMetrics.cpp | Updates database enumeration option name (with_external_databases). |
| src/Interpreters/loadMetadata.cpp | Updates database enumeration option name (with_external_databases). |
| src/Interpreters/InterpreterSystemQuery.cpp | Updates database enumeration option name (with_external_databases). |
| src/Interpreters/InterpreterShowTablesQuery.cpp | Forces the new setting on for SHOW TABLES when targeting an external DB. |
| src/Interpreters/InterpreterCreateQuery.cpp | Updates database counting to use the renamed option field. |
| src/Interpreters/InterpreterCheckQuery.cpp | Updates database enumeration option name (with_external_databases). |
| src/Interpreters/DatabaseCatalog.h | Renames options/flags to "external databases" terminology and updates API. |
| src/Interpreters/DatabaseCatalog.cpp | Implements external-DB classification and caches filtered DB maps. |
| src/Core/SettingsChangesHistory.cpp | Records the renamed/broadened setting behavior in settings history. |
| src/Core/Settings.cpp | Adds show_external_databases_in_system_tables with alias to the old name. |
| src/Client/BuzzHouse/Generator/SessionSettings.cpp | Updates fuzzing setting list to the new setting name. |
| src/Backups/BackupEntriesCollector.cpp | Updates database enumeration option name (with_external_databases). |

Comment thread src/Interpreters/DatabaseCatalog.h Outdated
Comment thread src/Interpreters/DatabaseCatalog.cpp Outdated
Comment thread tests/queries/0_stateless/04210_show_external_databases_in_system_tables.sh Outdated
pamarcos added 3 commits May 8, 2026 19:03
The test only exercised the PostgreSQL engine path. MySQL takes the
same code path in DatabaseCatalog::isExternalDatabase, but a regression
that special-cases PostgreSQL would have slipped through. Add a MySQL
case using ATTACH (CREATE DATABASE ... ENGINE = MySQL connects eagerly,
ATTACH does not) with a short connect_timeout so the tolerated failure
is fast.

- DatabaseCatalog.h: fix grammar in the renamed comment ("are implement"
  -> "are implemented", "protect ourself" -> "protect ourselves").
- DatabaseCatalog.cpp:2244: update outdated TableNameHints comment that
  still said "Skip datalake catalogs" to reflect the broader scope.
- 04210: add an assertion that system.completions also does not list
  external databases by default.

The new stateless test runs in about 5 seconds and remains stable across repeated runs, so it should be eligible for fasttest to improve CI coverage.

Ref: ClickHouse#104416
@pamarcos pamarcos marked this pull request as ready for review May 8, 2026 19:12
Comment thread src/Interpreters/DatabaseCatalog.cpp Outdated
pamarcos added 2 commits May 8, 2026 19:31
Per review feedback from @nikitamikhaylov on PR ClickHouse#104416: replace the
free helper with engine-name string compare in DatabaseCatalog.cpp by
a virtual method on IDatabase. DataLake / MySQL / PostgreSQL override
to return true; the default is false, so other engines (Atomic,
Replicated, MaterializedPostgreSQL, ...) keep the right behavior
without needing to be aware of the new classification.

The naming is intentionally distinct from the existing isExternal()
("engine does not support ClickHouse internal tables") and
isDatalakeCatalog() (narrower). See the doc comment in IDatabase.h.

Fast test does not build libpq or libmysql, so CREATE DATABASE ENGINE =
PostgreSQL/MySQL silently does nothing in that environment, and the
test's reference no longer matches the actual output.

This is the same reason 01114_mysql_database_engine_segfault.sql and
03790_materialized_postgresql_nullptr_dereference.sql carry the tag.

Reverts removal in 527e58e. CI fast-test failure:
https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104416&sha=527e58ed25888cbc3061c7c1de91e18777c227ae&name_0=PR
@kssenii kssenii self-assigned this May 9, 2026
Comment thread src/Core/Settings.cpp
Comment thread src/Databases/PostgreSQL/DatabasePostgreSQL.h Outdated
Comment thread src/Databases/IDatabase.h Outdated
Comment thread src/Databases/IDatabase.h Outdated
Comment thread src/Interpreters/DatabaseCatalog.h Outdated
pamarcos and others added 2 commits May 11, 2026 13:27
Co-authored-by: Kseniia Sumarokova <sumarokovakseniia@gmail.com>

Rename the database-level helper to avoid confusion with IDatabase::isExternal
and include MaterializedPostgreSQL in remote database filtering.
@pamarcos pamarcos requested a review from kssenii May 11, 2026 14:39
Comment thread src/Interpreters/DatabaseCatalog.cpp Outdated
pamarcos added 4 commits May 13, 2026 10:09
Rename the new setting and `DatabaseCatalog` filtering option to use remote database terminology consistently, while keeping `show_data_lake_catalogs_in_system_tables` as the compatibility alias.

Convert the stateless coverage from `.sh` to `.sql` now that the SQL test runner can provide unique database names.

PR: ClickHouse#104416

Use remote database terminology consistently in comments changed by the PR and avoid the old external database wording near `isRemoteDatabase`.

PR: ClickHouse#104416

Explain `isExternal` in terms of database engines that do not own ClickHouse table metadata instead of using vague wording.

PR: ClickHouse#104416
@pamarcos pamarcos added this pull request to the merge queue May 28, 2026
Merged via the queue into ClickHouse:master with commit faa0e55 May 28, 2026
163 of 165 checks passed
@pamarcos pamarcos deleted the rename-show-external-databases-setting branch May 28, 2026 09:45
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 28, 2026
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label May 28, 2026
robot-ch-test-poll added a commit that referenced this pull request May 28, 2026
Cherry pick #104416 to 26.5: Hide MySQL and PostgreSQL databases from system.tables by default
robot-clickhouse added a commit that referenced this pull request May 28, 2026
robot-ch-test-poll2 added a commit that referenced this pull request May 28, 2026
Cherry pick #104416 to 26.3: Hide MySQL and PostgreSQL databases from system.tables by default
robot-clickhouse added a commit that referenced this pull request May 28, 2026
robot-ch-test-poll2 added a commit that referenced this pull request May 28, 2026
Cherry pick #104416 to 26.4: Hide MySQL and PostgreSQL databases from system.tables by default
robot-clickhouse added a commit that referenced this pull request May 28, 2026
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label May 28, 2026
pamarcos added a commit that referenced this pull request May 29, 2026
Backport #104416 to 26.5: Hide MySQL and PostgreSQL databases from system.tables by default
pamarcos added a commit that referenced this pull request May 29, 2026
Remove accidentally backported `allow_experimental_geo_types_in_iceberg` so
`02995_new_settings_history` only sees the setting from PR #104416.

Update `test_database_glue/test_system_tables` to match the intended
`system.databases` behavior: remote database names are local metadata and stay
visible while table and column enumeration stays hidden by default.

PR: #106046
pamarcos added a commit that referenced this pull request Jun 1, 2026
Backport #104416 to 26.4: Hide MySQL and PostgreSQL databases from system.tables by default
pamarcos added a commit that referenced this pull request Jun 1, 2026
The 26.3 backport should only add `show_remote_databases_in_system_tables` from PR #104416. Dropping the unrelated setting keeps `02995_new_settings_history` aligned with the branch.

PR: #106045
pamarcos added a commit that referenced this pull request Jun 1, 2026
Backport #104416 to 26.3: Hide MySQL and PostgreSQL databases from system.tables by default
pamarcos added a commit that referenced this pull request Jul 2, 2026
Revert commit e79fd5f and return to the layout from commit 9c7c5ab and
commit 1fc129c: `show_remote_databases_in_system_tables` is recorded as
`{true, true}` in both the `26.2` and the `26.7` settings-history blocks.
The `26.2` entry covers the planned backports to 26.2+ release branches, and
the `26.7` entry satisfies the Upgrade check, which requires an entry in a
block newer than the 26.6 baseline release. Keeping both values `true` means
`compatibility` mode never rolls the restored default back to `false`,
matching the pre-#104416 behavior where `MySQL` and `PostgreSQL` databases
were always visible.
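
Why a `{true, true}` entry keeps `compatibility` from changing the value can be illustrated with a minimal C++ sketch. It assumes the settings history is an ordered list of (release, setting, previous value, new value) records and that `compatibility` undoes every record newer than the requested release. That model, along with the names `ChangeSketch` and `valueUnderCompatibility`, is an assumption made for illustration and is not the actual SettingsChangesHistory code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record of one settings-history entry.
struct ChangeSketch
{
    int version_major;      // release block the change is recorded in, e.g. 26
    int version_minor;      // e.g. 7 for the 26.7 block
    std::string setting;
    bool previous_value;    // value to restore when the change is rolled back
    bool new_value;         // value introduced by that release
};

// history is assumed to be ordered from oldest to newest release.
bool valueUnderCompatibility(
    const std::vector<ChangeSketch> & history,
    const std::string & setting,
    bool current_default,
    int compat_major,
    int compat_minor)
{
    bool value = current_default;
    // Walk from newest to oldest, undoing changes recorded after the compatibility release.
    for (auto it = history.rbegin(); it != history.rend(); ++it)
    {
        const bool newer = it->version_major > compat_major
            || (it->version_major == compat_major && it->version_minor > compat_minor);
        if (newer && it->setting == setting)
            value = it->previous_value;
    }
    return value;
}
```

With both recorded values equal to `true`, every rollback restores `true`, so the result is the same for any compatibility release. An entry recorded as `{false, true}` would instead yield `false` for releases older than the block it is recorded in.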
Labels

  • pr-backports-created: Backport PRs are successfully created, it won't be processed by CI script anymore
  • pr-backward-incompatible: Pull request with backwards incompatible changes
  • pr-must-backport-synced: The `*-must-backport` labels are synced into the cloud Sync PR
  • pr-synced-to-cloud: The PR is synced to the cloud repo
  • v26.2-must-backport
  • v26.3-must-backport
  • v26.4-must-backport
  • v26.5-must-backport

8 participants