Fix table engine and database engine grants with source access checks by pufit · Pull Request #100746 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix table engine and database engine grants with source access checks#100746

Merged
nikitamikhaylov merged 15 commits into
ClickHouse:masterfrom
pufit:pufit/fix-table-engine-grants-v2
May 5, 2026
Merged

Fix table engine and database engine grants with source access checks#100746
nikitamikhaylov merged 15 commits into
ClickHouse:masterfrom
pufit:pufit/fix-table-engine-grants-v2

Conversation

@pufit

@pufit pufit commented Mar 25, 2026

Copy link
Copy Markdown
Member

Re-introduces the dynamic engine lookup from #98984 (reverted in #100684) with a fix for the regression that broke CREATE DATABASE.

The original PR replaced the hardcoded source_table_engines list with StorageFactory::getAllStorages(), which correctly covers all table engines. However, CREATE DATABASE also checks TABLE_ENGINE access in InterpreterCreateQuery, and database engines (Atomic, Shared, Replicated, etc.) are registered in DatabaseFactory, not StorageFactory. When table_engines_require_grant=false, the old wildcard grant(TABLE_ENGINE) covered everything implicitly — the new per-engine approach missed database engines entirely, causing ACCESS_DENIED on CREATE DATABASE.

What changed

StorageFactory loop (from #98984): replaces the hardcoded source_table_engines list with runtime StorageFactory::getAllStorages(). Each table engine declares its source_access_type at registration, so the implicit grants are always in sync. Uses grant-only (no revoke) to preserve WITH GRANT OPTION state (#71544).

DatabaseFactory: source_access_type (new): adds std::optional<AccessTypeObjects::Source> source_access_type to DatabaseFactory::EngineFeatures, mirroring StorageFactory::StorageFeatures. Source database engines now explicitly declare their source dependency:

Database Engine source_access_type
MySQL MYSQL
PostgreSQL POSTGRES
MaterializedPostgreSQL POSTGRES
SQLite SQLITE
S3 S3
HDFS HDFS
Filesystem FILE

Non-source engines (Atomic, Ordinary, Memory, Dictionary, Replicated, Backup) default to std::nullopt — granted implicitly when table_engines_require_grant=false, no change in behavior.

DataLakeCatalog is left without source_access_type (polymorphic — underlying source depends on runtime catalog type). Noted as TODO.

DatabaseFactory loop in addImplicitAccessRights: now mirrors the StorageFactory loop exactly — iterates getDatabaseEngines(), checks source_access_type, bridges READ/WRITETABLE_ENGINE for source engines regardless of the table_engines_require_grant setting. The previous implementation relied on name collisions between the two factories (correct by coincidence, not by design).

Also carried forward from #98984:

Changelog category (leave one):

  • Bug Fix

Changelog entry:

Replace the hardcoded source_table_engines list with runtime lookup via StorageFactory and DatabaseFactory. Add source_access_type to DatabaseFactory::EngineFeatures so that CREATE DATABASE with source engines (PostgreSQL, MySQL, S3, etc.) requires the same source grants as CREATE TABLE. Fixes GRANT TABLE ENGINE ON * failing with table_engines_require_grant=false. Closes #71544

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Generated by Nerve

Version info

  • Merged into: 26.5.1.302

Re-introduce dynamic engine lookup from StorageFactory (replacing the
hardcoded source_table_engines list) and additionally grant database
engines from DatabaseFactory when table_engines_require_grant=false.

The previous attempt (ClickHouse#98984) missed database engines (Atomic, Shared,
Replicated, etc.) which are registered in DatabaseFactory, not
StorageFactory.  CREATE DATABASE checks TABLE_ENGINE access in
InterpreterCreateQuery, so omitting them caused ACCESS_DENIED on
CREATE DATABASE.

Closes ClickHouse#71544
@clickhouse-gh

clickhouse-gh Bot commented Mar 25, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [90027fc]

Summary:


AI Review

Summary

This PR correctly restores dynamic engine-based implicit TABLE_ENGINE bridging and fixes the CREATE DATABASE regression by adding a DatabaseFactory pass and source-aware database-engine metadata. However, there is still a privilege-gap blocker for DataLakeCatalog and important coverage/maintainability gaps, so this should not merge yet.

Findings

❌ Blockers

  • [src/Databases/DataLake/DatabaseDataLake.cpp:1091] DataLakeCatalog is still registered without source_access_type, so with table_engines_require_grant = false it can be implicitly granted as a non-source engine even though it may read external sources (S3/Azure/HDFS) at runtime.
    • Impact: source access checks can be bypassed for database creation.
    • Suggested fix: enforce source checks for DataLakeCatalog (derive required source type from catalog settings and require matching READ/WRITE ON <SOURCE>), or block this engine until source type can be validated safely.

⚠️ Majors

  • [src/Databases/DatabaseFactory.h:3] Adding direct include of AccessType.h into a high-fanout header introduces avoidable transitive compile-time cost.
    • Suggested fix: forward-declare AccessTypeObjects::Source (or introduce a lightweight forward header) in DatabaseFactory.h, and keep full AccessType.h include in .cpp files.
  • [tests/integration/test_grant_and_revoke/test_without_table_engine_grant.py:122-149] The regression tests still do not exercise the critical REVOKE READ, WRITE ON <SOURCE> path after GRANT TABLE ENGINE ... WITH GRANT OPTION for both storage and database source engines.
    • Impact: the root_with_grant_option regression risk called out in code comments remains under-tested.
    • Suggested fix: add revoke/re-grant regression cases proving grant-option delegation survives source-right synchronization for both URL and Filesystem/another source database engine.
Tests
  • ⚠️ Add explicit regression coverage for grant-option persistence across source revoke/re-grant transitions (table source engine and database source engine paths).
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny ⚠️ DataLakeCatalog source authorization remains unsafely unspecified
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout ⚠️ Source-check bypass risk for DataLakeCatalog blocks safe rollout
Compilation time ⚠️ New heavy include in DatabaseFactory.h
No large/binary files
Final Verdict

Status: ⚠️ Request changes

Minimum required actions:

  1. Close the DataLakeCatalog source-access gap (source_access_type-equivalent enforcement or explicit safe block).
  2. Add regression tests for grant-option behavior across source revoke/re-grant synchronization paths.

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Mar 25, 2026
pufit added 2 commits March 30, 2026 13:36
…ine-grants-v2

# Conflicts:
#	tests/integration/test_grant_and_revoke/test_without_table_engine_grant.py
@pufit pufit marked this pull request as ready for review April 8, 2026 16:36
Comment thread tests/integration/test_grant_and_revoke/test_without_table_engine_grant.py Outdated
instance.query("DROP USER IF EXISTS A, B")
instance.query("CREATE USER A")
instance.query("CREATE USER B")
instance.query("GRANT TABLE ENGINE ON * TO A WITH GRANT OPTION")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This regression test currently verifies only that GRANT TABLE ENGINE ... WITH GRANT OPTION can be delegated once.

The bug fixed in ContextAccess::addImplicitAccessRights was specifically about revoke-based source synchronization wiping root_with_grant_option. Please extend this test to cover that path explicitly, e.g. grant READ, WRITE ON URL to A, then REVOKE READ, WRITE ON URL FROM A, and assert A can still execute GRANT TABLE ENGINE ON * TO B.

@alexey-milovidov

Copy link
Copy Markdown
Member

The flaky test failures in this PR (01605_drop_settings_profile_while_assigned, 01418_custom_settings, 02015_async_inserts_4) are caused by non-unique access entity names and will be resolved by #102131.

@alexey-milovidov

Copy link
Copy Markdown
Member

The 02346_text_index_bug101913 test failure is fixed by #102108 (already merged). Please rebase or merge master to pick up the fix.

@alexey-milovidov

Copy link
Copy Markdown
Member

The MSan stress test failure (MemorySanitizer: use-of-uninitialized-value, STID 4179-5154 or 4148-3044) is a known pre-existing issue unrelated to this PR. Fix: #102158

alexey-milovidov and others added 3 commits April 9, 2026 00:37
Database engines that share names with source storage engines
(PostgreSQL, MySQL, SQLite, S3, MaterializedPostgreSQL) were
unconditionally granted TABLE_ENGINE, overriding the source-aware
checks from the StorageFactory loop. Skip names already present
in StorageFactory — those were already handled correctly.
Database engines that connect to external sources (PostgreSQL, MySQL,
S3, etc.) now declare their source dependency via source_access_type,
mirroring how StorageFactory works. The implicit grants logic in
addImplicitAccessRights applies the same READ/WRITE → TABLE_ENGINE
bridge for both factories.

Previously, source checks for database engines relied on name collisions
with StorageFactory — correct by coincidence, not by design. Database-only
engines like Filesystem bypassed source access checks entirely.

Source database engine registrations:
- MySQL → MYSQL, PostgreSQL → POSTGRES, MaterializedPostgreSQL → POSTGRES
- SQLite → SQLITE, S3 → S3, HDFS → HDFS, Filesystem → FILE

DataLakeCatalog left without source_access_type (polymorphic; underlying
source depends on runtime catalog type).
@pufit pufit changed the title Fix table engine grants, including database engines Fix table engine and database engine grants with source access checks Apr 9, 2026
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp Outdated
Clang errors on designated initializers in a default argument when a
member has a default member initializer inside the enclosing class.
Use `= {}` instead, which value-initializes all fields from their
in-class defaults.
@@ -1,5 +1,6 @@
#pragma once

#include <Access/Common/AccessType.h>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DatabaseFactory.h is included by many translation units, and this new direct include of AccessType.h pulls in a heavy access-rights dependency chain (macros, alias maps, boost helpers) into every includer.

This is a compilation-time regression risk in a hot header. You only need the type name for std::optional<AccessTypeObjects::Source> here.

Please avoid the transitive include by introducing a lightweight forward declaration header for source objects (or equivalent forward declaration of AccessTypeObjects::Source) and include full AccessType.h only in .cpp files that require definitions.

Clang requires default member initializers to be visible inside the
enclosing class definition. Using `= {}` still triggers this because
all fields rely on their default member initializers. List all 4
fields explicitly so the compiler doesn't need in-class defaults.
@alexey-milovidov

Copy link
Copy Markdown
Member

The flaky check failure is fixed in #102148, let's update the branch.

@pufit

pufit commented Apr 10, 2026

Copy link
Copy Markdown
Member Author

@groeneai can you check tests here?

@groeneai

Copy link
Copy Markdown
Contributor

@pufit I checked the CI results on the latest commit (35163e4). Here's the summary:

✅ CI is clean

All regular checks pass — builds, stateless tests, integration tests, stress tests, AST fuzzers, BuzzHouse, unit tests, performance comparison, compatibility checks — all green.

Zero test failures on the latest commit outside of bugfix validation and flaky checks.

Bugfix validation — correct ✅

All 20 test_grant_and_revoke tests (both test_with_table_engine_grant.py and test_without_table_engine_grant.py) show FAIL in bugfix validation, which is the expected outcome — these tests correctly fail against the master baseline, proving the bug exists. The overall Bugfix validation check reports success.

Remaining flaky check failures — pre-existing, unrelated

Test Check Status
00746_sql_fuzzy flaky check (amd_binary) Known flaky — fails across 25 unrelated PRs in 30 days, 0 master hits
02835_drop_user_during_session flaky check (amd_msan) Known flaky — fails across 21 unrelated PRs in 30 days, 0 master hits

Neither is related to your changes.

Earlier failures — all resolved

  • test_table_engine_and_source_grant integration test failures (seen on commits before 07ae12): fixed by your latest pushes. Now passes across all 7 integration test configurations (amd_tsan, amd_msan, amd_binary, arm_binary, amd_llvm_coverage, amd_asan_ubsan targeted + flaky).
  • ARM Azure batch failure (19 stateless tests on Apr 9): infrastructure event affecting arm_asan_ubsan, azure — not PR-related. Passed on retry.
  • Flaky tests flagged by alexey-milovidov (01605_drop_settings_profile_while_assigned, 02015_async_inserts_4, 02346_text_index_bug101913): all pre-existing. 02346 was fixed by PR Revert "Fix text_index optimization on multiple passes." #102108 (merged).

LGTM from a CI perspective — no PR-caused failures remaining. 👍

fm4v added a commit that referenced this pull request Apr 14, 2026
Split flaky check and targeted check into separate roles:

Flaky check — runs only new/modified tests from the PR with thread
fuzzer enabled. Tests run in parallel with themselves (shared queue)
to ensure new tests are parallel-safe under randomized thread ordering.

Targeted check — runs previously failed and coverage-relevant tests
with per-worker partitioning (--no-self-parallel) so the same test
never runs concurrently on multiple workers, avoiding conflicts on
shared resources (hardcoded users, roles, settings profiles, etc.).
Both use --test-runs 50.

Changes:
- Add --no-self-parallel flag to clickhouse-test: partitions tests
  by name hash so all copies go to the same worker
- Flaky check: only changed tests, thread fuzzer, shared queue
- Targeted check: relevant tests, no thread fuzzer, per-worker queues,
  50 runs, single clickhouse-test invocation
- Restore original thread fuzzer parameters (reverts reductions from
  06c945a and 9790435)
- Add amd_binary to flaky check, use AMD_LARGE for targeted debug
- Targeted check uses --flaky-check --no-self-parallel

PRs affected by tests running in parallel with themselves:
#102015 #101093 #99653 #101503 #101108 #101884 #100746
#101447 #100941 #101099 #100203 #96130

#102038
fm4v added a commit that referenced this pull request Apr 14, 2026
Split flaky check and targeted check into separate roles:

Flaky check — runs only new/modified tests from the PR with thread
fuzzer enabled. Tests can run in parallel with themselves (shared
queue) to ensure new tests are parallel-safe.

Targeted check — runs previously failed and coverage-relevant tests
with per-worker partitioning (--no-self-parallel) so the same test
never runs concurrently on multiple workers, avoiding conflicts on
shared resources (hardcoded users, roles, settings profiles, etc.).
Both use --test-runs 50.

Changes:
- Add --no-self-parallel flag to clickhouse-test: partitions tests
  by name hash so all copies go to the same worker
- Flaky check: only changed tests, thread fuzzer, shared queue
- Targeted check: relevant tests, no thread fuzzer, per-worker
  partitioning, 50 runs, single clickhouse-test invocation
- Restore original thread fuzzer parameters
- Change targeted check to arm_asan_ubsan on ARM_LARGE

PRs affected by tests running in parallel with themselves:
#102015 #101093 #99653 #101503 #101108 #101884 #100746
#101447 #100941 #101099 #100203 #96130
Resolve conflicts in DatabaseFactory and per-engine registrations
caused by the new `is_external` field added on master and the
`source_access_type` field added in this branch. Both fields
coexist; `is_external` is used by restore, `source_access_type`
by `addImplicitAccessRights`.
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp Outdated
Co-authored-by: clickhouse-gh[bot] <173705893+clickhouse-gh[bot]@users.noreply.github.com>
@groeneai

Copy link
Copy Markdown
Contributor

@pufit follow-up on the CI status for the current commit (107bd098, run started today). The notification just resurfaced via the LLVM-coverage bot — nothing new actionable vs. my Apr 10 triage:

  • 21 Bugfix validation (integration tests) FAILs on test_with_table_engine_grant.py / test_without_table_engine_grant.py are the expected outcome — these are the regression tests this PR adds, and bugfix-validation runs them against the master baseline binary where the bug still exists. The job-level check itself reports pass (4m1s).
  • 1 transient flake on Unit tests (asan_ubsan, function_prop_fuzzer): runner reported "No test result file [...gtest.json]". The earlier run on this same commit passed (fail: 0, passed: 1), and the sister msan / tsan function_prop_fuzzer variants both passed cleanly today. Infrastructure issue on this single asan run — a re-trigger should clear it.
  • Fast test (arm_darwin) failures are the known force_success-stabilized infrastructure noise.
  • Finish Workflow / Post Hooks is a job-level wrapper, not an actionable per-PR test failure.
  • All completed builds green; the rest of the matrix is still in progress.

Will ping again only if anything substantive surfaces once the in-flight checks complete.

std::unordered_map<std::string, AccessType> result;
/// Use DB::toString() to build keys so they match what replaceDeprecated() stores
/// (DB::toString replaces underscores with spaces, e.g. ARROW_FLIGHT -> "ARROW FLIGHT").
#define ADD_BACKWARD_COMPAT_SOURCE(name, alias) result.emplace(DB::toString(AccessType::name), AccessType::name);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makeBackwardCompatible now builds string_to_accessType keys with DB::toString(AccessType::name). For ARROW_FLIGHT that produces "ARROW FLIGHT", but source parameters are normalized via AccessTypeObjects::unifySource to "ARROW_FLIGHT" (underscore) in AccessFlags::validateParameter.

So with enable_read_write_grants = false, READ/WRITE ON ARROW_FLIGHT will not map back to the deprecated ARROW_FLIGHT access type, and backward-compatible formatting/replication output stays in the new form.

Please include the canonical source name (#name / toStringSource(Source::name)) in this map (or both forms) so ARROW_FLIGHT is converted correctly.

`makeBackwardCompatible` built map keys only from
`DB::toString(AccessType::name)` (spaces) but parsing modern
`GRANT READ ON ARROW_FLIGHT` stores the parameter in canonical
`#name` form via `unifySource`. Lookup missed for ARROW_FLIGHT —
the only source whose name has an underscore — so the grant was not
converted back to the deprecated form when
`enable_read_write_grants=false`. Insert both forms.
@clickhouse-gh

clickhouse-gh Bot commented Apr 30, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.00% 84.00% +0.00%
Functions 91.10% 91.10% +0.00%
Branches 76.50% 76.60% +0.10%

Changed lines: 99.01% (100/101) · Uncovered code

Full report · Diff report

@SmitaRKulkarni SmitaRKulkarni left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nikitamikhaylov nikitamikhaylov enabled auto-merge May 5, 2026 15:27
@nikitamikhaylov nikitamikhaylov disabled auto-merge May 5, 2026 15:27
@nikitamikhaylov nikitamikhaylov added this pull request to the merge queue May 5, 2026
Merged via the queue into ClickHouse:master with commit 2d2e88c May 5, 2026
165 checks passed
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RBAC: GRANT TABLE ENGINE ON * fails with table_engines_require_grant=false

6 participants