Native `CREATE TABLE` / `DROP TABLE` for DataLakeCatalog; optionally prune on `DROP` by zvonand · Pull Request #98670 · ClickHouse/ClickHouse · GitHub
Skip to content

Native CREATE TABLE / DROP TABLE for DataLakeCatalog; optionally prune on DROP#98670

Open
zvonand wants to merge 15 commits into
ClickHouse:masterfrom
zvonand:zvonand-create-iceberg-table
Open

Native CREATE TABLE / DROP TABLE for DataLakeCatalog; optionally prune on DROP#98670
zvonand wants to merge 15 commits into
ClickHouse:masterfrom
zvonand:zvonand-create-iceberg-table

Conversation

@zvonand

@zvonand zvonand commented Mar 3, 2026

Copy link
Copy Markdown
Contributor

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Supports CREATE TABLE, CREATE TABLE … AS source, and DROP TABLE for DataLakeCatalog; DROP TABLE can request catalog-side data purge via the new data_lake_delete_data_on_drop setting.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@zvonand zvonand force-pushed the zvonand-create-iceberg-table branch from 5133bc3 to a81e99e Compare March 4, 2026 17:59
@zvonand zvonand changed the title Allow to create Iceberg table in a DataLakeCatalog and allow to prune such tables on DROP. Support Iceberg table creation in DataLakeCatalog; optionally prune them on DROP Mar 26, 2026
@alesapin alesapin added the can be tested Allows running workflows for external contributors label Mar 26, 2026
@alesapin alesapin self-assigned this Mar 26, 2026
@clickhouse-gh

clickhouse-gh Bot commented Mar 26, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-improvement Pull request with some product improvements label Mar 26, 2026
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp Outdated
Comment thread docs/en/engines/database-engines/datalake.md
@zvonand zvonand force-pushed the zvonand-create-iceberg-table branch from b3846c8 to b381484 Compare March 26, 2026 15:10
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp
@zvonand zvonand force-pushed the zvonand-create-iceberg-table branch from b381484 to 97a60e7 Compare March 31, 2026 12:22
@alexey-milovidov

Copy link
Copy Markdown
Member

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

Comment thread src/Databases/DataLake/DatabaseDataLake.cpp Outdated
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp Outdated
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp
Comment thread src/Databases/DataLake/RestCatalog.cpp Outdated
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp Outdated
Comment thread src/Interpreters/InterpreterCreateQuery.cpp Outdated
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp Outdated
Comment thread src/Interpreters/InterpreterCreateQuery.cpp Outdated
Comment thread tests/integration/test_database_iceberg/test.py Outdated
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp Outdated
Comment thread src/Databases/DataLake/RestCatalog.cpp Outdated
Comment thread tests/integration/test_database_iceberg/test.py Outdated
Comment thread src/Databases/DataLake/RestCatalog.cpp Outdated
Comment thread src/Interpreters/InterpreterCreateQuery.cpp Outdated
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp
@zvonand zvonand changed the title Support Iceberg table creation in DataLakeCatalog; optionally prune them on DROP Native CREATE TABLE / DROP TABLE for DataLakeCatalog; optionally prune on DROP May 19, 2026
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp
@zvonand zvonand force-pushed the zvonand-create-iceberg-table branch from a0bf1f2 to b3cb15b Compare June 11, 2026 23:15
@zvonand zvonand force-pushed the zvonand-create-iceberg-table branch from b3cb15b to ce5a19c Compare June 11, 2026 23:15
Comment thread src/Interpreters/InterpreterCreateQuery.cpp
zvonand added a commit to Altinity/ClickHouse that referenced this pull request Jun 11, 2026
- Port \`IcebergPathResolver::reverseResolve\` from upstream (predates the
  PR upstream but was missing on this branch).
- Handle the Altinity-only \`S3_TABLES\` catalog type in
  \`getLocationSchemeForTableCreation\` (it is always S3-backed).
- Drop the \`unique_key\` check in \`DatabaseDataLake::createTable\`:
  \`ASTStorage\` has no \`UNIQUE KEY\` clause on this branch.
- \`getInMemoryMetadataPtr\` does not take a context argument on this
  branch.

ClickHouse#98670

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@zvonand

zvonand commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

Findings
❌ Blockers
[src/Interpreters/InterpreterCreateQuery.cpp:2111] CREATE TABLE ... AS SELECT reaches the engine-less DataLakeCatalog branch, but the early return true skips the normal fillTableIfNeeded path. A query such as CREATE TABLE catalog.\ns.t AS SELECT 1 SETTINGS allow_database_iceberg=1can succeed while creating an empty Iceberg table instead of inserting the selected rows. Suggested fix: either rejectcreate.select && !create.is_create_emptyfor this path, or run the same insert-fill step afterdatabase->createTable succeeds.

This is not a valid concern; see #98670 (comment)

Comment thread src/Databases/DataLake/RestCatalog.cpp Outdated
Comment thread src/Storages/ObjectStorage/DataLakes/Iceberg/Utils.cpp
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp
Comment thread src/Databases/DataLake/DatabaseDataLake.cpp Outdated
alexey-milovidov and others added 9 commits June 29, 2026 11:17
`getPartitionField` serialized `icebergBucket(0, id)`, `icebergBucket(-1, id)`,
and `icebergTruncate(0, id)` into `bucket[0]`, `bucket[-1]`, and `truncate[0]`.
The Iceberg spec requires a positive number of buckets and a positive truncate
width; an invalid transform produces metadata that some catalogs (e.g. Glue)
register, leaving an unreadable Iceberg table. Validate the argument before
metadata generation so such a `CREATE TABLE ... PARTITION BY ...` is rejected.

Addresses an AI Review finding on ClickHouse#98670

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
In `writeMetadataFileAndVersionHint`, when another writer already committed an
equal-or-newer version, we remove the just-written `vN-<uuid>.metadata.json` so
the version-hint resolver does not later pick this uncommitted file. Previously a
failure to remove it was logged and swallowed, and the function returned `false`
- the same signal a clean loss returns - so the caller retried as if the file
were gone while the uncommitted metadata could still be resolved as the latest
version. Let the removal exception propagate so the operation fails closed.

Addresses an AI Review finding on ClickHouse#98670

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`RestCatalog::createTable` passed the table location (base/namespace/table) to
`createNamespaceIfNotExists`, which stored it as the namespace's
`properties.location`. That made the namespace default warehouse path point at
the first table's directory, so another client creating a table in the same
namespace without an explicit location could be placed under the first table's
path. Strip the trailing table-name segment so the namespace location is
base/namespace.

Addresses an AI Review finding on ClickHouse#98670

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A catalog-side `DROP TABLE` removes remote catalog metadata and, when
`data_lake_delete_data_on_drop` is set, can request deletion of the underlying
data, but the only success log was `TRACE`. Raise it to `INFO` so accidental
drops or catalog-side data purges leave an operationally visible audit trail.

Addresses an AI Review finding on ClickHouse#98670

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`test_create_table_invalid_partition_transforms` checks that
`icebergBucket(0, id)`, `icebergBucket(-1, id)`, and `icebergTruncate(0, id)`
are rejected and leave no table registered in the catalog, while a valid
`icebergBucket(8, id)` succeeds.

`test_create_table_namespace_location` checks that the REST namespace
`location` property points at the namespace base rather than the first table's
directory.

Tests for AI Review findings on ClickHouse#98670

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…erg-table

# Conflicts:
#	src/Databases/DataLake/DatabaseDataLake.h
#	src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp
#	src/Storages/ObjectStorage/DataLakes/Iceberg/Mutations.cpp
Both this branch and master independently added a
`virtual bool isDatalakeCatalog() const` method to `IDatabase`, at
different locations, so the merge left two identical declarations and
`class member cannot be redeclared`. Keep the one master placed next to
`isExternal`/`isRemoteDatabase` and drop the duplicate.
The setting was recorded in the 26.6 bucket, but 26.6 is already
released and does not contain it; the current development version is
26.7. Keeping it in 26.6 makes the settings-history consistency check
(`02995_new_settings_history`) disagree with the released 26.6 binary.
@alexey-milovidov

Copy link
Copy Markdown
Member

Pushed updates addressing the outstanding AI-review findings, plus a fresh master merge (the branch was ~2 weeks behind and master has since heavily refactored the Iceberg mutation write path).

AI-review blockers/majors (all four addressed):

  • Fail-closed on stale metadata cleanup (Utils.cpp, writeMetadataFileAndVersionHint): the removeObjectIfExists on the loser-cleanup path no longer swallows the exception — it propagates so the write is reported as failed instead of silently leaving an uncommitted vN-<uuid>.metadata.json that the version-hint resolver could later pick. (7b1903e)
  • Operational audit log for drops/purges (DatabaseDataLake::dropTable): successful catalog drops are now logged at INFO with namespace, table and the purge flag. (70970a4)
  • REST namespace default location (RestCatalog::createTable): the namespace properties.location is now the namespace base (<base>/<namespace>), not the first table's directory. (18b55df)
  • Invalid partition transforms (getPartitionField): bucket[N] / truncate[N] now reject non-positive N with BAD_ARGUMENTS before any metadata is written, so icebergBucket(0, …), icebergBucket(-1, …) and icebergTruncate(0, …) can no longer register an unreadable Iceberg table. (d9f3f2c)
  • Tests for the last two (invalid transforms + namespace location) added to test_database_iceberg. (a2f4232)

The changelog entry already uses the correct data_lake_delete_data_on_drop name.

Merge with master: resolved three conflicts —

  • Mutations.cpp: master collapsed the two-snapshot (POSITION_DELETE + DATA) write into a single writeMetadataFiles call. I re-applied this branch's fix on top of that structure: the single generateNextMetadata now receives the previous committed metadata path (via a new previous_metadata_file_path parameter threaded from mutate), instead of the newly-generated metadata_info.path, so the metadata-log entry references the actual predecessor.
  • IcebergMetadata.cpp: kept the data_lake_delete_data_on_drop declaration and master's new compaction-setting declarations; dropped the now-obsolete iceberg_delete_data_on_drop extern (it is only an alias after the rename).
  • DatabaseDataLake.h: kept both isRemoteDatabase and isDatalakeCatalog overrides.

Follow-ups from the merge: removed a duplicate IDatabase::isDatalakeCatalog declaration (both this branch and master added it, at different places → class member cannot be redeclared), and moved the new setting from the 26.6 to the 26.7 settings-history bucket (26.6 is released and does not contain it).

All 16 touched translation units verified with -fsyntax-only. No self-merge (unreviewed feature + non-trivial merge of the mutation write path).

`test_drop_table_purge` constructs a `Minio` client but the `minio`
package was never imported, so `ruff check` failed with
`F821 Undefined name Minio`. Add `from minio import Minio` to match the
import used by `tests/integration/helpers/cluster.py`.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=98670&sha=60c079ba48738ecaf5eb567865aea45cb048a4e6&name_0=PR&name_1=Style%20check

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
return fmt::format("abfss://{}@{}/{}/{}/{}", container, account_host, extra_path, namespace_name, table_name);
}

if (location_scheme == "s3")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constructTableLocation now derives the s3://... table location from uri.getPath() only. That breaks configurations using storage_uri_style = 'virtual_hosted' (which DatabaseDataLakeSettings already documents), for example https://warehouse-rest.minio.example.com/prefix: here the bucket lives in the host, so path is just prefix (or empty) and this code either creates s3://prefix/ns/table or throws does not contain a bucket.

Existing DataLakeCatalog reads work with that endpoint style, so CREATE TABLE would now be unavailable or generate the wrong table location for a supported configuration. Please derive the bucket from the host when virtual-hosted style is configured, or reject that endpoint style explicitly before advertising CREATE TABLE support.

throw Exception(ErrorCodes::BAD_ARGUMENTS, "Mutations with ON CLUSTER are not allowed for KeeperMap tables");

if (table_id)
checkDatabaseSupportsOnClusterDDL(DatabaseCatalog::instance().tryGetDatabase(table_id.database_name));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkDatabaseSupportsOnClusterDDL only runs here when the initiator resolved the table locally (if (table_id)). If ALTER TABLE dl.\ns.t` ... ON CLUSTERis issued from a node that does not have thedl DataLakeCatalogdatabase/table locally,table_id` stays empty, this guard is skipped, and the query is still fanned out to every worker.

Each worker then commits the same shared-catalog mutation, which is the same duplicate-mutation race the new ON CLUSTER ban is supposed to prevent for catalog-backed tables. Please reject DataLakeCatalog ON CLUSTER by database name before the local lookup, or re-check on the worker before executing the mutation.

order_by,
context_);

catalog->createTable(namespace_name, table_name, /* metadata_path */ "", metadata_content);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This path is now generic for every DataLakeCatalog, but only RestCatalog and GlueCatalog currently override ICatalog::createTable / dropTable. hive, unity, onelake, biglake, and paimon_rest still hit the base ICatalog implementation and throw NOT_IMPLEMENTED, even though the new docs and changelog entry describe CREATE TABLE / DROP TABLE as DataLakeCatalog functionality in general.

Please either gate these statements to the catalog types that actually implement catalog mutations, or narrow the user-facing contract to REST / Glue only.

columns_declare_list->set(columns_declare_list->columns, columns_expression_list);
create_query.set(create_query.columns_list, columns_declare_list);

for (const auto & column : properties.columns)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CREATE TABLE ... AS source can still silently drop source-column COMMENT, CODEC, column TTL, SETTINGS, and STATISTICS. properties.columns already carries those fields here, but this AST rebuild only copies name, type, and defaults, so DatabaseDataLake::createTable never sees the unsupported modifiers and the Iceberg table is created with weaker semantics than the source.

Could we reuse formatColumns(properties.columns) (or otherwise copy these fields into the synthesized ASTColumnDeclaration) so the existing rejection path fires for CTAS too?

@clickhouse-gh

clickhouse-gh Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.60% 85.50% -0.10%
Functions 92.70% 92.70% +0.00%
Branches 77.70% 77.70% +0.00%

Changed lines: Changed C/C++ lines covered: 337/598 (56.35%) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants