
amosbird · 2026-03-02T06:54:06Z

Summary

During MergeTree INSERT, columns whose values are entirely type-defaults (e.g., all zeros for UInt64, all empty strings for String, all NULLs for Nullable) are detected and excluded from the part's column list before constructing MergedBlockOutputStream. This avoids writing unnecessary .bin files (Wide parts) or data streams (Compact parts), saving disk space for sparse-update workloads where most columns in each INSERT are left at their type's default. The optimization is opt-in via the MergeTree setting skip_empty_columns_on_insert (off by default). It additionally requires serialization_info_version to be set to with_missing_columns (the format version that records frozen defaults for missing columns), so that a cluster pinned to a lower version for compatibility never writes parts that older servers cannot read.

The block itself is passed intact to the writer, so skip indices, projections, primary index, and min-max index are all computed from the full data. Reading a part that lacks a column fills it with type-defaults automatically — the same mechanism used by ALTER TABLE ADD COLUMN on existing parts.

To keep reads stable, a structured missing_columns array is recorded in the part's serialization.json (a new WITH_MISSING_COLUMNS serialization-info version). Each entry carries the column name and a type_default marker. On read, fillMissingColumns consults this marker and fills a missing column with its type-default even if the column later gains a new DEFAULT expression, so that a subsequent ALTER MODIFY COLUMN ... DEFAULT does not retroactively change the values that were actually inserted. The marker is propagated through merges, mutations, and on-the-fly ALTER RENAME COLUMN, so the inserted type-defaults survive part-lifecycle operations.
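The read-side rule can be modelled in a few lines. This is a minimal sketch rather than ClickHouse code: `PartModel` and `readValue` are hypothetical names, and every column is assumed to be an integer column whose type-default is 0.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>

// Simplified, hypothetical model of a part (not the real ClickHouse classes).
struct PartModel
{
    std::map<std::string, long> stored;          // columns physically written
    std::set<std::string> type_default_markers;  // names listed in missing_columns
};

// current_default is the value of the table's DEFAULT expression at read time,
// if the column has one.
long readValue(const PartModel & part, const std::string & column, std::optional<long> current_default)
{
    if (auto it = part.stored.find(column); it != part.stored.end())
        return it->second;                 // column exists on disk
    if (part.type_default_markers.count(column))
        return 0;                          // frozen type-default, ignores later DEFAULT changes
    return current_default.value_or(0);    // ordinary missing column (e.g. after ADD COLUMN)
}
```

With a marker for `b`, `readValue(part, "b", 999)` keeps returning 0 after the default is changed to 999, while a part without the marker would return 999.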

The MissingColumnInfo struct also reserves a DefaultKind::Expression variant for future use (issue #92475: ALTER MODIFY DEFAULT freezes old expression into parts). This PR only writes type_default markers; reading an expression marker throws CORRUPTED_DATA to fail closed until Phase 2 implements expression evaluation.
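The fail-closed behaviour can be sketched as follows. `parseDefaultKind` is a hypothetical helper, and `std::runtime_error` stands in for the CORRUPTED_DATA exception, whose real type is not shown in this PR description.

```cpp
#include <stdexcept>
#include <string>

enum class DefaultKind { TypeDefault, Expression };

// Hypothetical parser for the "default" field of a missing_columns entry.
// Only "type_default" is accepted; anything else, including a future
// expression marker, is rejected instead of being silently ignored.
DefaultKind parseDefaultKind(const std::string & marker)
{
    if (marker == "type_default")
        return DefaultKind::TypeDefault;
    throw std::runtime_error("CORRUPTED_DATA: unsupported missing-column default marker '" + marker + "'");
}
```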

Related: #4968, #92475

On-disk format (serialization.json):

{
  "missing_columns": [
    { "name": "b", "default": "type_default" }
  ],
  "version": 2
}
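As an illustration of why sorted output matters, here is a minimal sketch that renders this structure by hand. `renderSerializationJson` is a hypothetical helper, not the `SerializationInfoByName` code, and it assumes column names need no JSON escaping.

```cpp
#include <set>
#include <string>

// Hypothetical helper: renders the structure shown above in compact form.
// std::set iterates in sorted order, so two replicas that skipped the same
// columns produce byte-identical text and therefore identical checksums.
std::string renderSerializationJson(const std::set<std::string> & missing, int version)
{
    std::string out = "{\"missing_columns\":[";
    bool first = true;
    for (const auto & name : missing)
    {
        if (!first)
            out += ",";
        first = false;
        // Only the type_default marker is written in this phase.
        out += "{\"name\":\"" + name + "\",\"default\":\"type_default\"}";
    }
    out += "],\"version\":" + std::to_string(version) + "}";
    return out;
}
```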

Changes:

Add MergeTree setting skip_empty_columns_on_insert (default false).
Add IColumn::hasOnlyTypeDefaults with optimized overrides for ColumnVector/ColumnDecimal (memoryIsZero), ColumnString/ColumnArray (zero offsets), ColumnNullable (all-1 null map), ColumnSparse (no stored non-default values), and ColumnTuple (delegates to sub-columns).
Filter all-default columns in MergeTreeDataWriter::writeTempPartImpl (skipEmptyColumnsOnInsert). Columns with a DEFAULT/MATERIALIZED/ALIAS expression are never skipped. A column is skipped only when IDataType::getDefault() coincides with the column's zero representation (isDefaultAt(0) on a column filled via getDefault()), which correctly excludes types like Date32 (type-default 1900-01-01 ≠ memory-zero) and Enum (first declared value ≠ 0). Patch parts are excluded, and at least one physical column is always kept.
Gate the optimization on serialization_info_version >= with_missing_columns. The populating step is authoritative about the part format version, so SerializationInfoByName::getVersion never silently upgrades a part past the configured value (which would make older servers reject it during a rolling upgrade).
Record the missing columns in serialization.json via SerializationInfoByName (new WITH_MISSING_COLUMNS version). Only type_default markers are written; expression markers are rejected on read until Phase 2. The list is written in sorted order so that identical parts produce identical checksums on different replicas.
Propagate the missing-columns marker through the part lifecycle: through merges (MergeTask, for columns that end up absent from the merged part), through mutations (MutateTask::getColumnsForNewDataPart, including renames of missing columns), through compact-part renames (splitAndModifyMutationCommands), and through on-the-fly ALTER RENAME COLUMN on read (IMergeTreeReader::fillMissingColumns translates the requested name back through alter_conversions).
Add a stateless test with 46 cases covering: basic skip; all-default keeps one column; merges (horizontal and vertical); mutation; Nullable; key columns; DEFAULT expression not skipped; Array; Tuple; ColumnSparse source; compact parts; LowCardinality; stable values across ALTER ... DEFAULT; a non-zero-default Enum; marker across mutation/merge/rename after ALTER DEFAULT; version gate; DETACH/ATTACH TABLE; DETACH/ATTACH PARTITION; BACKUP/RESTORE; FREEZE; CHECK TABLE; MATERIALIZE COLUMN; CLEAR COLUMN; lightweight DELETE; chained mutations; INSERT SELECT; REPLACE PARTITION; ATTACH PARTITION FROM; ALTER ADD/DROP COLUMN; projections; FixedString; Map; mixed parts merge; Date/DateTime; pre-ADD-COLUMN + skip merge; type-changing mutation; compact-part rename+mutation; Date32 regression.
Add an integration test (test_skip_empty_columns) with 6 cases: replicated consistency; merge marker propagation across replicas; mixed-version version gate; backward-compat fallback for old parts; restart durability; REPLACE PARTITION across replicated tables.
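The skip rules listed above can be condensed into a small decision function. This is a sketch under stated assumptions: `ColumnModel` and `columnsToSkip` are hypothetical, the three flags stand in for checks the real code performs on column metadata and data, and the choice of which column to keep when everything is default (here: the first) is an assumption, since the description only says that at least one physical column is kept.

```cpp
#include <string>
#include <vector>

// Hypothetical per-column summary; not a ClickHouse type.
struct ColumnModel
{
    std::string name;
    bool has_default_expression;  // DEFAULT / MATERIALIZED / ALIAS present
    bool type_default_is_zero;    // false when the type-default differs from memory-zero
    bool all_values_default;      // result of a hasOnlyTypeDefaults-style scan
};

std::vector<std::string> columnsToSkip(const std::vector<ColumnModel> & columns)
{
    std::vector<std::string> skipped;
    for (const auto & c : columns)
        if (!c.has_default_expression && c.type_default_is_zero && c.all_values_default)
            skipped.push_back(c.name);

    // Always keep at least one physical column in the part.
    // Assumption: the first column is the one retained.
    if (!columns.empty() && skipped.size() == columns.size())
        skipped.erase(skipped.begin());
    return skipped;
}
```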

Changelog category (leave one):

New Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

MergeTree INSERT can now skip writing columns whose values are entirely type-defaults (zeros, empty strings, NULLs), saving disk space for sparse-update workloads. Enabled by the MergeTree setting skip_empty_columns_on_insert together with serialization_info_version = 'with_missing_columns'. Missing columns carry frozen defaults in serialization.json, so a later ALTER MODIFY COLUMN ... DEFAULT does not retroactively change the inserted values.

Documentation entry for user-facing changes

Documentation is not required (bug fix, no new user-facing feature)

clickhouse-gh · 2026-03-02T06:54:43Z

amosbird · 2026-03-20T00:06:56Z

Hi @Fgrtue, I noticed you’ve assigned yourself to #92475. Would you be interested in taking a look at this one as well?

Fgrtue · 2026-03-20T12:27:28Z

@amosbird thanks for letting me know! Indeed, I will take a look.

Copilot

Pull request overview

Adds an INSERT-time optimization for MergeTree to avoid writing data streams/files for columns that are entirely type-default within an inserted block, reducing disk usage for sparse-update patterns.

Changes:

Introduces skip_empty_columns_on_insert MergeTree setting and applies it in MergeTreeDataWriter::writeTempPartImpl by filtering all-default columns from the part’s written column list.
Adds IColumn::hasOnlyDefaults() (with implementations/overrides for several column types) to efficiently detect all-default columns.
Adds a new stateless test and reference output covering several correctness scenarios around missing columns.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `src/Storages/MergeTree/MergeTreeDataWriter.cpp` | Filters all-default columns before constructing `MergedBlockOutputStream`; toggles `reset_columns` when filtering occurred. |
| `src/Storages/MergeTree/MergeTreeSettings.cpp` | Documents new MergeTree setting `skip_empty_columns_on_insert`. |
| `src/Core/SettingsChangesHistory.cpp` | Records the new setting in settings change history. |
| `src/Columns/IColumn.h` / `src/Columns/IColumn.cpp` | Adds and implements (via `IColumnHelper`) the new `hasOnlyDefaults()` API. |
| `src/Columns/ColumnVector.h` | Adds a fast-path `hasOnlyDefaults()` using `memoryIsZero`. |
| `src/Columns/ColumnConst.h` / `ColumnFixedString.h` / `ColumnDecimal.h` | Adds `hasOnlyDefaults()` overrides for common fixed-size representations. |
| `src/Columns/ColumnLazy.h/.cpp`, `ColumnUnique.h`, `ColumnCompressed.h`, `ColumnBLOB.h`, `ColumnFunction.h`, `ColumnAggregateFunction.h`, `IColumnDummy.h` | Adds required `hasOnlyDefaults()` overrides/implementations (some throw / return conservative defaults). |
| `tests/queries/0_stateless/04006_skip_empty_columns_on_insert.sql` | New stateless test cases for the optimization. |
| `tests/queries/0_stateless/04006_skip_empty_columns_on_insert.reference` | Expected output for the new stateless test. |

Copilot · 2026-03-20T13:39:28Z

+        if (!empty_columns.empty())
+        {
+            auto filtered = columns.eraseNames(empty_columns);
+            if (!filtered.empty())
+            {
+                columns = std::move(filtered);
+                has_empty_columns = true;
+                for (const auto & name : empty_columns)
+                    infos.erase(name);
+            }


The filtering logic won’t skip any columns when all columns in the block are defaults: columns.eraseNames(empty_columns) will return an empty list, and the if (!filtered.empty()) guard prevents applying the filter at all. This contradicts the feature’s intent (skip all-default columns) and keeps writing unnecessary files for fully-default inserts. Consider keeping at least one “anchor” column (e.g., first physical column or a key column) and removing the rest, and update the test case to assert the reduced on-disk columns set.

Copilot · 2026-03-20T13:39:28Z

+        const auto & columns_description = metadata_snapshot->getColumns();
+        NameSet empty_columns;
+        for (const auto & col : block)
+        {
+            auto col_default = columns_description.getDefault(col.name);
+            if (col_default && col_default->expression)
+                continue;
+            if (col.column->hasOnlyDefaults())
+                empty_columns.insert(col.name);
+        }


empty_columns is collected by iterating over every column in block. At this point columns is getAllPhysical().filter(block.getNames()), while block can also contain temporary/subcolumns added for sorting key / skip-index expressions. Those non-physical columns can be marked “empty” even though they will never be written, which can cause has_empty_columns to become true without actually removing any physical columns and trigger reset_columns unnecessarily. Consider iterating over columns (and looking up each in block) instead of iterating over block.

Copilot · 2026-03-20T13:39:28Z

+    bool hasOnlyDefaults() const override
+    {
+        return memoryIsZero(data.data(), 0, data.size() * sizeof(T));
+    }


ColumnVector<T>::hasOnlyDefaults() uses a raw memoryIsZero check, but isDefaultAt() uses data[n] == T{}. For floating-point types, -0.0 == 0.0 is true, so a column containing only -0.0 values is “all-default” per isDefaultAt() but will return false here (bytes are not all zero). This makes hasOnlyDefaults() inconsistent with the column’s own default semantics. Consider falling back to an isDefaultAt() loop for floating-point T (or otherwise aligning the definition).
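The discrepancy is easy to reproduce outside ClickHouse. The sketch below uses two hypothetical helpers: `allBytesZero` mimics a byte-wise check in the spirit of `memoryIsZero`, and `allEqualDefault` mimics the `data[n] == T{}` comparison.

```cpp
#include <cstring>
#include <vector>

// Byte-wise check: true only if every byte of every element is zero.
template <typename T>
bool allBytesZero(const std::vector<T> & data)
{
    for (const T & v : data)
    {
        unsigned char bytes[sizeof(T)];
        std::memcpy(bytes, &v, sizeof(T));
        for (unsigned char b : bytes)
            if (b != 0)
                return false;
    }
    return true;
}

// Value-wise check: true if every element compares equal to T{}.
template <typename T>
bool allEqualDefault(const std::vector<T> & data)
{
    for (const T & v : data)
        if (!(v == T{}))
            return false;
    return true;
}
```

For a `std::vector<double>` holding only `-0.0`, `allEqualDefault` returns true while `allBytesZero` returns false, because IEEE 754 negative zero has its sign bit set.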

Fgrtue

I wanted to suggest adding custom hasOnlyDefaults() implementations for the following columns:

ColumnString -- if I understand correctly, we could just check that the offsets are all 0
ColumnNullable -- memoryIsByte could probably be used?
ColumnArray -- we could likely also check that the offsets are all 0, as with ColumnString
ColumnSparse -- it seems that checking only the offsets would be enough
ColumnTuple -- at the moment we call isDefaultAt() NxM times regardless of whether the columns inside the tuple have an optimized hasOnlyDefaults(). If we rewrite hasOnlyDefaults() to simply propagate the call to the columns stored in the tuple, we might get some performance improvement.

amosbird · 2026-03-25T08:50:19Z

@Fgrtue It seems the CH Inc sync requires manual resolution.

Fgrtue · 2026-03-25T09:16:34Z

@amosbird should be done. I wanted to take a second quick look at the PR today, I will update you on the results. Just to make sure, that's the final version so far, right?

amosbird · 2026-03-25T09:23:17Z

Just to make sure, that's the final version so far, right?

Yes. (I mistakenly configured Copilot to force push, which appears to have overridden the existing reviews. Sorry about that.)

Fgrtue · 2026-03-25T15:49:05Z

@amosbird, did you have a chance to see my previous review suggestion about adding optimized hasOnlyTypeDefaults() for ColumnString, ColumnNullable, ColumnArray, ColumnSparse, and ColumnTuple? Do you think it would make sense to add them?

amosbird · 2026-03-25T18:32:09Z

Thanks for the tips! I've added optimized hasOnlyTypeDefaults for all five types in the latest push:

ColumnString — memoryIsZero on the offsets array (all empty strings ⟹ all offsets are zero)
ColumnNullable — memoryIsByte(..., 1) on the null map (all NULL ⟹ all bytes are 1)
ColumnArray — memoryIsZero on the offsets array (all empty arrays ⟹ all offsets are zero)
ColumnSparse — checks whether offsets is empty (no non-default values stored)
ColumnTuple — delegates to each sub-column's hasOnlyTypeDefaults with early exit

Test cases 5, 8, 9, and 10 exercise ColumnNullable, ColumnArray, ColumnTuple, and ColumnSparse respectively. ColumnString is covered by the existing cases (1, 3, 4).
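The checks described above can be sketched with plain vectors. These are simplified stand-ins, not the ClickHouse column classes: `offsetsAllZero`, `nullMapAllNull`, `ElementModel`, and `tupleHasOnlyTypeDefaults` are hypothetical names, and offsets are assumed to be cumulative end positions as the comment implies.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// String/Array: all rows are empty exactly when every cumulative offset is 0.
bool offsetsAllZero(const std::vector<uint64_t> & offsets)
{
    return std::all_of(offsets.begin(), offsets.end(), [](uint64_t o) { return o == 0; });
}

// Nullable: all rows are NULL exactly when every null-map byte is 1.
bool nullMapAllNull(const std::vector<uint8_t> & null_map)
{
    return std::all_of(null_map.begin(), null_map.end(), [](uint8_t b) { return b == 1; });
}

// Stand-in for one element column of a Tuple.
struct ElementModel
{
    std::vector<uint64_t> values;
    bool hasOnlyTypeDefaults() const
    {
        return std::all_of(values.begin(), values.end(), [](uint64_t v) { return v == 0; });
    }
};

// Tuple: delegate to each element column; std::all_of stops at the first
// element that reports a non-default value.
bool tupleHasOnlyTypeDefaults(const std::vector<ElementModel> & elements)
{
    return std::all_of(elements.begin(), elements.end(),
                       [](const ElementModel & e) { return e.hasOnlyTypeDefaults(); });
}
```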

Fgrtue · 2026-03-25T18:59:52Z

        (data.supportsTransactions() && context->getCurrentTransaction()) ? context->getCurrentTransaction()->tid : Tx::PrehistoricTID,
        block.bytes(),
-        /*reset_columns=*/ false,
+        /*reset_columns=*/ has_empty_columns,


I am trying to understand whether we need to set reset_columns to true in case we have empty columns.

So far I have found that reset_columns is used in three places:

MergedBlockOutputStream.cpp:40

MergedBlockOutputStream.cpp:221

MergedBlockOutputStream.cpp:442

In two of them (lines 40 and 442) it seems that we won't get any new information in infos. The third one (221) I have not verified completely (I will), but at first glance it looks like reset_columns doesn't influence that part either.

Could you please check whether it is needed, and why?

You are right — reset_columns = true is not needed here. I traced through all three sites:

Constructor (IMergedBlockOutputStream.cpp:40-41): Initializes new_serialization_infos = SerializationInfoByName(columns_list, info_settings) — but note info_settings has choose_kind = false (line 32), while we already computed the real serialization info with choose_kind = true at MergeTreeDataWriter.cpp:875 and set it via new_data_part->setColumns(columns, infos, ...) at line 893.

writeImpl (MergedBlockOutputStream.cpp:442-443): new_serialization_infos.add(block) — accumulates stats from the written block, but this is the same block we already ran infos.add(block) on at line 882. So it just recomputes equivalent statistics.

finalizePartAsync (MergedBlockOutputStream.cpp:221-231): This does three things:

serialization_infos.replaceData(new_serialization_infos) — replaces only the data member (not kind_stack) with equivalent stats from step 2.

removeEmptyColumnsFromPart(new_part, part_columns, new_part->expired_columns, ...) — this is a no-op because expired_columns is empty. It is only populated in the merge path (MergeTask.cpp:610-626) for TTL-expired columns, never during INSERT.

new_part->setColumns(part_columns, serialization_infos, ...) — redundant, since we already called setColumns with the correct filtered columns and infos at line 893.

So indeed the entire reset_columns block is a no-op in our INSERT path. I will change it to false.

Fgrtue

The tests are good. I wanted to suggest adding the following test cases:

Compact parts (i.e. min_bytes_for_wide_part != 0, min_rows_for_wide_part = 0) for a) skipping a column and b) merging parts
Could we also add a check for a LowCardinality column, as this is a pretty common use case?
Regarding the merge tests, what do you think about testing both types of merges, vertical and horizontal?

Fgrtue · 2026-03-26T14:14:28Z


+bool ColumnSparse::hasOnlyTypeDefaults() const
+{
+    return _size == 0 || getOffsetsData().empty();


I am thinking of a case where a sparse column consists only of a single non-default value (for example 5). It seems to me that this check cannot tell apart sparse columns that consist of one repeated value, whatever that value is.

Moreover, the generic version IColumnHelper<Derived, Parent>::hasOnlyTypeDefaults() seems to give a wrong result as well.

Even though it does not currently lead to data corruption (i.e. returning type defaults instead of the DEFAULT elements at values[0]), it seems that this is a wrong implementation for this function.

If my reasoning is right, we could fix this by checking that the element at index 0 of values is itself the type default, i.e. values->isDefaultAt(0).

Good catch! Fixed: added && values->isDefaultAt(0) so we verify the actual stored default value, not just the absence of offsets. The generic IColumnHelper::hasOnlyTypeDefaults() fallback also gives wrong results for ColumnSparse (since isDefaultAt(n) just checks getValueIndex(n) == 0), but the specialized override now handles this correctly.
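The fixed condition can be illustrated with a simplified model. `SparseModel` is hypothetical and only mirrors the layout discussed in this thread: `values[0]` is the value shared by all rows not listed in `offsets`, and integers with type-default 0 are assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a sparse column.
struct SparseModel
{
    std::vector<int64_t> values;   // values[0] is the implicit value for unlisted rows
    std::vector<size_t> offsets;   // row numbers that store an explicit value
    size_t size = 0;               // total number of rows

    bool hasOnlyTypeDefaults() const
    {
        // Empty offsets alone only prove that all rows share values[0];
        // the extra comparison verifies that this shared value is the type-default.
        return size == 0 || (offsets.empty() && values[0] == 0);
    }
};
```

A column of three rows that all hold 5 has empty offsets but `values[0] == 5`, so the check now returns false for it.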

Fgrtue · 2026-03-26T14:16:28Z

    {
        addSettingsChanges(merge_tree_settings_changes_history, "26.4",
        {
+            {"skip_empty_columns_on_insert", false, false, "New setting to skip writing all-default columns on INSERT"},


It would probably be more accurate to say:

Suggested change

{"skip_empty_columns_on_insert", false, false, "New setting to skip writing all-default columns on INSERT"},

{"skip_empty_columns_on_insert", false, false, "New setting to skip writing all type default columns on INSERT"},

Fgrtue · 2026-03-26T15:17:49Z

+ORDER BY column;
+
+SELECT 'case7_data';
+SELECT key, a, b FROM t_skip_empty_default_expr ORDER BY key;


Should we SELECT key, a, b, c here? Or is this intentional?

Added c to the SELECT. The reference now shows 1 5 0 50, confirming the MATERIALIZED expression a * 10 is correctly evaluated.

amosbird · 2026-03-30T04:17:33Z

Added 4 new test cases addressing the review:

Case 11: compact parts (skip + merge, with min_bytes_for_wide_part = 1000000000)
Case 12: LowCardinality(String) (generic isDefaultAt through dictionary)
Case 13: vertical merge (enable_vertical_merge_algorithm = 1)
Case 14: horizontal merge (enable_vertical_merge_algorithm = 0)

alexey-milovidov · 2026-04-07T00:22:36Z

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

alexey-milovidov · 2026-04-09T21:01:06Z

The Can't adjust last granule error in CI is a known issue. The fix is in #101641

CurtizJ

This introduces an inconsistency: if a user inserts a column with all-zero values and then changes its default expression, the values returned on read will change as well. The same inconsistency already exists for columns added via ADD COLUMN whose default expression is later modified, so this is not a new problem, but it may be worth avoiding in this case.

Maybe we can store a marker in serialization_infos.json that records whether the column was physically written, so the reader can fill in defaults correctly?

amosbird · 2026-04-28T11:25:13Z

Maybe we can store a marker in serialization_infos.json that records whether the column was physically written, so the reader can fill in defaults correctly?

@CurtizJ This might conflict with the proposal in #92475

…ERT and merge

When all values of a named-Tuple subfield in a part are type-defaults, the writer omits that subfield's stream files and narrows the part's columns.txt Tuple type so it no longer mentions the subfield. Reads see the narrowed Tuple type and use `CAST(narrowed_tuple, full_tuple)` to materialize defaults, relying on the metadata-only ALTER work in ClickHouse#107305.

This optimization is most useful for `PARTITION BY` schemes where different partitions populate different subsets of a wide schema's subfields: the on-disk part keeps only the substreams whose subfield actually appears in that partition.

Approach

- Reuse the existing whole-column pruning path (`IMergedBlockOutputStream::removeEmptyColumnsFromPart` consuming `new_data_part->expired_columns`).
- Extend that path to accept dotted subfield names (`data.c2s.gold`) and narrow the column's Tuple type via the new `narrowDataTypeByExpiredSubstreams` helper in `DataTypes/Utils`.
- After the prune pass, keep `columns_substreams.txt` consistent with the on-disk files via the new `ColumnsSubstreams::removeSubstreams` helper.
- Preserve each kept subfield's `SerializationInfo` (its sparse / default kind and per-element `num_rows` / `num_defaults`) when narrowing the enclosing Tuple, via the new `narrowSerializationInfo` helper.
- INSERT: `MergeTreeDataWriter` traverses each named-Tuple column with the new `IColumn::hasOnlyTypeDefaults` to spot all-default subtrees and contributes their dotted paths to `expired_columns`.
- Merge (Sub-case A): `MergeTask::prepare` computes the union of leaf substreams across all source parts and marks any leaf absent from every source as expired in the merged part. This is monotonic: a merged part never re-materializes default values for a subfield that was consistently pruned in the inputs.

Why top-level all-default columns are intentionally NOT pruned

If we erased a top-level Tuple column whose value is entirely default, the part would semantically lose that column ("missing column", equivalent to a column that was added by a later `ALTER ADD COLUMN`). A subsequent `ALTER MODIFY COLUMN ... DEFAULT <new_expr>` would then re-materialize the column with the NEW default expression on read, retroactively changing historical data. That is exactly the quirk tracked by ClickHouse#92475 (`ALTER MODIFY ... DEFAULT` rewriting old parts).

This PR sidesteps the problem by leaving top-level columns alone: subfield pruning only narrows the Tuple type of a column that still exists. The materialized 0 / '' / `[]` bytes of the kept columns pin the part's semantics; future `ALTER MODIFY ... DEFAULT` changes apply only to parts written after the ALTER, matching today's whole-column behavior.

Named-Tuple subfields have no per-subfield DEFAULT expression syntax (`Tuple(a Int64 DEFAULT 5)` is not a valid type), so pruning a subfield can only ever fall back to the language's type-default (0 / '' / NULL). This is also why the optimization composes cleanly with the per-column DEFAULT RFC in ClickHouse#92475 (comment 4334850399): subfield pruning operates entirely below the column boundary the RFC will redefine.

What is NOT touched

- Compact parts: early return preserved; pruning only fires for Wide parts.
- Patch parts: skipped (mirrors the existing whole-column behavior).
- Mutate path: not pruned; mutations preserve the existing schema.
- Top-level all-default columns: see note above.
- `PR ClickHouse#98472`'s column-level `skip_empty_columns_on_insert` mechanism: only the `hasOnlyTypeDefaults` column primitives are lifted, none of its signalling layer (no `WITH_SKIPPED_COLUMNS` serialization version, no JSON `skipped_columns` field, no DEFAULT-expression interaction).

Gate

- `enable_tuple_subfield_pruning` (default true) gates the entire feature in `MergeTreeSettings`. The history entry is recorded under 26.6.

Compatibility

- No on-disk format change: parts written by this PR are readable by any server that has the metadata-only-ALTER work in ClickHouse#107305.

Tests

- `tests/queries/0_stateless/04320_tuple_subfield_pruning.sql` exercises 36 cases: flat / nested Tuple, Nullable wrap, Array(Tuple) (all-empty and non-empty), Map(K, Tuple), `LowCardinality(String)`, deep customer-like schema, `PARTITION BY` per-partition narrowing, setting OFF, Compact-part preservation, two-part merge variants (both pruned, one pruned, different subfields pruned), `INSERT SELECT` / async INSERT / materialized view, `ReplacingMergeTree` merge, vertical merge, `LWD`, `ALTER MODIFY ADD subfield + INSERT`, `ALTER UPDATE` mutation on narrowed part, multi-granule part, `DETACH / ATTACH PARTITION`, top-level column with a dot in its name, force-sparse + pruning interaction, subcolumn reads of pruned subfields, `CHECK TABLE` on a pruned part, and `bytes_on_disk` comparison.

### Documentation entry for user-facing changes

- [x] Documentation is not required.

### Changelog category (leave one):

- Improvement

### Changelog entry:

Automatically prune named-Tuple subfields whose values in a part are entirely type-defaults: the writer omits their stream files and records a narrowed Tuple type in `columns.txt`; reads materialize defaults via `CAST`. Gated by the new MergeTree-level setting `enable_tuple_subfield_pruning` (default on).
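The merge rule in that commit message (a leaf absent from every source part is expired in the merged part) amounts to a set difference. A minimal sketch, assuming leaf substreams are identified by dotted path strings; `expiredLeaves` is a hypothetical function, not the `MergeTask::prepare` code.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the schema leaves that no source part contains.
std::set<std::string> expiredLeaves(
    const std::set<std::string> & schema_leaves,
    const std::vector<std::set<std::string>> & source_parts)
{
    // Union of the leaves present in any source part.
    std::set<std::string> present;
    for (const auto & part : source_parts)
        present.insert(part.begin(), part.end());

    // A leaf is expired only if it is absent from every source.
    std::set<std::string> expired;
    for (const auto & leaf : schema_leaves)
        if (!present.count(leaf))
            expired.insert(leaf);
    return expired;
}
```

Because the result can only contain leaves that were pruned in all inputs, a leaf stored by any one source part always survives the merge.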

… parent DB is missing

`DataLakeConfiguration::getCatalog` (introduced by ClickHouse#100334) looked up the parent database in `DatabaseCatalog` and threw `LOGICAL_ERROR` ("Database X not found") when `tryGetDatabase` returned `nullptr`. That assertion is wrong: a missing database here is a transient runtime state, not a logical-invariant violation.

Concretely it can fire during async metadata loading after a server restart (`AsyncLoader::worker` -> `DatabaseOrdinary::loadTableFromMetadata` -> `createStorageObjectStorage` -> `getCatalog`) when an unrelated table-load job in the same database has just thrown (for instance because of `cannot_allocate_thread_fault_injection_probability`) and the database has been detached as a result.

Stress tests with thread-allocation fault injection have been hitting this LOGICAL_ERROR sporadically: `STID 2377-2a78`, 3 distinct unrelated PRs over 90 days (PR ClickHouse#98472 on 2026-04-09, PR ClickHouse#100958 on 2026-04-12, PR ClickHouse#102804 on 2026-04-30 - none of which touch this code or its callers).

Production stack from PR ClickHouse#102804 stress-test (amd_debug):

```
2026.04.30 05:17:39.895829 [ 6955 ] AsyncLoader::worker: Code: 439. DB::Exception: Cannot schedule a task: fault injected (...): Cannot attach table `test_1`.`test_max_size_drop` from metadata file ...
2026.04.30 05:17:40.099425 [ 6998 ] {} <Fatal> : Logical error: 'Database test_1 not found'. [stack: DataLakeConfiguration::getCatalog -> createStorageObjectStorage -> registerStorageIceberg -> StorageFactory::get -> createTableFromAST -> DatabaseOrdinary::loadTableFromMetadata -> AsyncLoader::worker]
```

Fix: combine the two null-checks. `dynamic_pointer_cast` already returns `nullptr` for a null input, so the function naturally returns `nullptr` both for "DB not registered" and "DB is not `DataLakeCatalog`" - the same response either way. This matches the behaviour of `getCatalog` before ClickHouse#100334, restores backward compatibility for `Iceberg` engine tables hosted in regular `Atomic`/`Ordinary` databases, and removes the spurious LOGICAL_ERROR signal from stress-test reports without changing behaviour for the supported `DataLakeCatalog` -> `Iceberg` path.

Local verification (debug build):

- Compiles, server starts.
- `CREATE TABLE iceberg_t ENGINE = IcebergLocal(...)` inside a regular `Atomic` database succeeds, DETACH/ATTACH database cycle succeeds, server restart with `async_load_databases=1` reloads the table without LOGICAL_ERROR.

Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=102804&sha=40e4eba7d14b8588106464e81b911e8de7a45dc6&name_0=PR&name_1=Stress%20test%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
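The property the fix relies on, that `std::dynamic_pointer_cast` yields an empty pointer for both a null input and a type mismatch, can be shown in isolation. The class names and `asCatalog` below are hypothetical stand-ins, not the ClickHouse types.

```cpp
#include <memory>

// Hypothetical stand-ins for the database class hierarchy.
struct IDatabaseModel { virtual ~IDatabaseModel() = default; };
struct AtomicDatabaseModel : IDatabaseModel {};
struct DataLakeCatalogModel : IDatabaseModel {};

// One cast covers both cases: a database that is not registered (null input)
// and a database of another kind (failed dynamic cast) both yield nullptr.
std::shared_ptr<DataLakeCatalogModel> asCatalog(const std::shared_ptr<IDatabaseModel> & db)
{
    return std::dynamic_pointer_cast<DataLakeCatalogModel>(db);
}
```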

clickhouse-gh · 2026-06-17T08:23:02Z

+                continue;
+            /// Column is materialized by this mutation (present in updated_header),
+            /// so it is written in full and is no longer skipped.
+            if (updated_header.has(new_name))


A skipped column still represents real inserted values, so preserving its marker through every mutation that leaves it out of updated_header is unsafe for type-changing ALTER MODIFY COLUMN.

Concrete trace: insert b UInt64 = 0 so b is skipped, then run ALTER TABLE ... MODIFY COLUMN b Nullable(UInt64). splitAndModifyMutationCommands skips the READ_COLUMN command because the source part has no physical b file, so updated_header does not contain b here and the marker is preserved. Later reads synthesize the current type default, NULL, but a normal type mutation of the stored value 0 should produce 0 as Nullable(UInt64).

Please either materialize skipped columns for type-changing READ_COLUMN mutations, or only preserve the marker when the old type-default converted to the new type is provably equal to the new type default. A regression with UInt64 -> Nullable(UInt64) would catch this.

The on-the-fly window is now guarded, but the fully-materialized type change can't be fixed by dropping the marker: both the marker and normal missing-column handling yield the new type's default (e.g. NULL), while the correct value is convert(0, new_type) = 0. Only materializing the skipped column produces it. Root cause: the type-changing READ_COLUMN is filtered out at MutateTask.cpp:559 for physically-absent (skipped) columns, so it never reaches getColumnsForNewDataPart. Left as a design decision (force materialization vs. record skip-time type/value) — see #98472 (comment)
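To see why dropping the marker cannot fix the type change, compare the two values directly. A minimal sketch, assuming `std::optional` as a stand-in for Nullable; `convertStored` and `newTypeDefault` are hypothetical names.

```cpp
#include <optional>

using UInt64Model = unsigned long long;
using NullableUInt64Model = std::optional<UInt64Model>;

// What a normal type-changing mutation does to a value that was stored:
// the old value is converted, so 0 becomes a non-NULL 0.
NullableUInt64Model convertStored(UInt64Model v)
{
    return v;
}

// What missing-column handling synthesizes after the type change:
// the new type's default, which for a Nullable type is NULL.
NullableUInt64Model newTypeDefault()
{
    return std::nullopt;
}
```

`convertStored(0)` holds the value 0 while `newTypeDefault()` is empty, so a skipped column that is never materialized reads back as NULL instead of 0.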

clickhouse-gh · 2026-06-17T08:26:13Z

+    /// unrelated mutation silently dropped the marker.
+    {
+        NameSet new_skipped_columns;
+        for (const auto & name : serialization_infos.getSkippedColumns())


CLEAR COLUMN needs to forget a skipped column too. Today a skipped column is absent from part_columns, so the DROP_COLUMN command with clear = true is ignored before it reaches the interpreter, and this loop then preserves the old skipped marker because updated_header does not contain the column.

Concrete trace: insert b = 0 so b is skipped, then ALTER TABLE ... MODIFY COLUMN b UInt64 DEFAULT 999, then ALTER TABLE ... CLEAR COLUMN b. CLEAR COLUMN should remove the stored value and make the row read as the current default 999, but preserving the marker keeps returning the inserted type-default 0.

Please treat DROP_COLUMN with clear = true as affecting skipped columns even when they have no physical files: either drop the marker so the current default is evaluated, or materialize the cleared value explicitly.

On-the-fly CLEAR COLUMN is now handled on read via isColumnDropped (66cd944). The fully-materialized case still needs the clear DROP_COLUMN to reach getColumnsForNewDataPart (it is filtered at MutateTask.cpp:566 for skipped columns), which is the same materialization decision as the type-change blocker — see #98472 (comment)

alexey-milovidov · 2026-06-19T03:58:32Z

Pushed 61c63f35469..cc3a0df89c6 (merged current master, resolving the MutateTask::getColumnsForNewDataPart conflict by keeping both the skipped-columns marker block and master's materialize_updated_column_serialization_infos block) and addressed part of the AI review:

Blocker 2 (IMergeTreeReader.cpp) + on-the-fly part of Blocker 4 (CLEAR COLUMN) — fillMissingColumns now skips the marker for any column reported by AlterConversions::isColumnDropped (this includes CLEAR COLUMN, which is a DROP_COLUMN with clear). So while a DROP/CLEAR is still pending (on-the-fly), a re-added b ... DEFAULT 999 reads 999 and a cleared b reads its current default instead of the stored type-default.
Blocker 3 (MergeTask.cpp) — the merge no longer carries the marker for source columns reported dropped by the part's AlterConversions, so a DROP + merge + re-ADD ... DEFAULT no longer resurrects a stale marker.
Major (MergeTreeSettings.cpp) — documented with_skipped_columns under serialization_info_version and fixed the string_serialization_version wording (with_types or newer).
PR metadata — body now says 21 test cases (incl. rename-then-merge) and uses Related: https://github.com/ClickHouse/ClickHouse/issues/4968.

These guards are scoped to the isColumnDropped case only, so they cannot change the existing 21 test outcomes.

Still open — needs a design decision (Blocker 1, and the fully-materialized case of Blocker 4):

The root cause is in MutateTask command splitting: for a column physically absent from the part (a skipped column), the type-changing READ_COLUMN is dropped at MutateTask.cpp:559 (part_columns.has(...)) and CLEAR's DROP_COLUMN at MutateTask.cpp:566, so neither reaches getColumnsForNewDataPart. Once the mutation materializes, the marker survives and the read is wrong.

For Blocker 1 this cannot be fixed by dropping the marker. After inserting b UInt64 = 0 as a skipped column and then running MODIFY COLUMN b Nullable(UInt64), both the marker path and normal missing-column handling produce the new type's default (NULL), whereas the correct value is convert(0, Nullable(UInt64)) = 0. Only materializing the skipped column at type-change time yields the right value (the eager rewrite reads b as old-type 0 from the source marker and converts it). So the choice is essentially: force materialization of a skipped column on a type-changing/clearing mutation, or record the skip-time type/value in the marker. I left this for you rather than guess, since it touches the mutation interpreter path.
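To make the Blocker 1 mismatch concrete, here is a minimal toy model. It does not use ClickHouse types: `std::uint64_t` stands in for UInt64, `std::optional<std::uint64_t>` for Nullable(UInt64), and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy stand-ins (assumptions, not ClickHouse types):
// UInt64 -> std::uint64_t, Nullable(UInt64) -> std::optional<std::uint64_t>.
using OldType = std::uint64_t;
using NewType = std::optional<std::uint64_t>;

// Marker path and normal missing-column handling after the type change:
// both fill the default of the new type, which is NULL.
NewType fillWithNewTypeDefault()
{
    return std::nullopt;
}

// Eager rewrite at type-change time: read the skipped column as the old
// type's default (0), then convert that value to the new type.
NewType materializeThenConvert()
{
    const OldType inserted_value = OldType{}; // the value that was actually inserted: 0
    return NewType{inserted_value};           // convert(0, Nullable(UInt64)) = 0
}

// Usage: the two strategies disagree, and only the second keeps the inserted 0.
void usageExample()
{
    assert(!fillWithNewTypeDefault().has_value());
    assert(materializeThenConvert() == NewType{0});
}
```

Under these assumptions, no read-time fill based on the new type alone can recover the inserted 0, which is why the options come down to materializing at type-change time or storing the skip-time type/value in the marker.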

I could not run a full local build (this checkout's master is ahead of my main worktree by 11 contrib submodules plus new libpng/xsimd), so the four changed files were verified with clang -fsyntax-only; CI runs the full build and the stateless suite. I also did not add the on-the-fly-window regressions, because forcing a deterministic pending-mutation state needs controlled materialization — happy to add them once the materialization direction above is decided.

clickhouse-gh · 2026-06-27T16:36:57Z

+                continue;
+            /// Column is materialized by this mutation (present in updated_header),
+            /// so it is written in full and is no longer skipped.
+            if (updated_header.has(new_name))


A skipped column still represents real inserted values, so MATERIALIZE COLUMN must not treat the missing physical file as "value was never stored". Today splitAndModifyMutationCommands admits MATERIALIZE_COLUMN when the part lacks column files; for a skipped b this makes MutationsInterpreter compute the current DEFAULT expression, updated_header.has(b) is true here, and this block drops the skipped marker. A concrete trace: insert b UInt64 = 0 with skipping enabled, ALTER MODIFY COLUMN b UInt64 DEFAULT 999, then ALTER TABLE ... MATERIALIZE COLUMN b. Before materialization, reads return the inserted 0, but the mutation writes 999, violating the existing MATERIALIZE COLUMN contract that past values for DEFAULT columns are not overwritten. Please either keep skipped columns out of this materialization path or materialize the value read through the skipped marker, and add a regression for this sequence.
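The trace above can be replayed on a toy model. This is only a sketch: `ToyPart`, `readB`, and both `materialize...` functions are hypothetical names for illustration, not the MutateTask or MutationsInterpreter code.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of one part and its column b.
struct ToyPart
{
    std::optional<std::uint64_t> stored_b;   // physical value, if the column was written
    bool b_has_type_default_marker = false;  // "skipped on insert" marker
};

// Read path: stored value first, then the frozen type-default, then the current DEFAULT.
std::uint64_t readB(const ToyPart & part, std::uint64_t current_default)
{
    if (part.stored_b)
        return *part.stored_b;
    if (part.b_has_type_default_marker)
        return 0;
    return current_default;
}

// Behaviour described in the review: the missing file is treated as
// "never stored", so the current DEFAULT is written and the marker is dropped.
ToyPart materializeIgnoringMarker(const ToyPart & part, std::uint64_t current_default)
{
    ToyPart result = part;
    if (!result.stored_b)
    {
        result.stored_b = current_default;
        result.b_has_type_default_marker = false;
    }
    return result;
}

// One of the requested fixes: materialize the value read through the marker.
ToyPart materializeThroughMarker(const ToyPart & part, std::uint64_t current_default)
{
    ToyPart result;
    result.stored_b = readB(part, current_default);
    return result;
}

void usageExample()
{
    ToyPart skipped;
    skipped.b_has_type_default_marker = true;          // INSERT b = 0 with skipping enabled
    const std::uint64_t current_default = 999;         // ALTER MODIFY COLUMN b UInt64 DEFAULT 999
    assert(readB(skipped, current_default) == 0);      // before MATERIALIZE COLUMN
    assert(readB(materializeIgnoringMarker(skipped, current_default), current_default) == 999);
    assert(readB(materializeThroughMarker(skipped, current_default), current_default) == 0);
}
```

In this model the marker-aware variant keeps the inserted 0 after materialization, which matches the contract the comment refers to.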

clickhouse-gh · 2026-06-30T00:56:00Z

+                    /// ... DEFAULT 999, the newly added `b` must read 999, not the
+                    /// frozen default. Fall through to normal missing-column handling
+                    /// (which evaluates the DEFAULT expression) in that case.
+                    if (alter_conversions->isColumnDropped(name_in_part))


This drop guard checks the old physical name after the rename mapping, but AlterConversions records DROP COLUMN under the current name. For a part with missing_columns = ['b'], pending RENAME COLUMN b TO c, then DROP COLUMN c, name_in_part becomes b, isColumnDropped("b") is false, and the stale marker is trusted. After ADD COLUMN c UInt64 DEFAULT 999, the new c can read as the old inserted type-default 0. MergeTask has the same ordering before it translates the marker to the current name. Please check the dropped state both before and after rename, or normalize missing-marker names through the full rename/drop chain before preserving or trusting them; add a rename -> drop -> add regression.
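A toy illustration of the rename -> drop -> add ordering problem follows. `ToyAlterConversions` is a hypothetical stand-in that only mimics the behaviour described in this comment (rename mapping from current name to part name, drops recorded under the current name); it is not the real AlterConversions interface.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for pending on-the-fly alters of one part.
struct ToyAlterConversions
{
    std::unordered_map<std::string, std::string> current_to_part_name; // RENAME b TO c => {"c" -> "b"}
    std::unordered_set<std::string> dropped;                           // DROP COLUMN c => {"c"}

    std::string nameInPart(const std::string & current_name) const
    {
        auto it = current_to_part_name.find(current_name);
        return it == current_to_part_name.end() ? current_name : it->second;
    }

    bool isColumnDropped(const std::string & name) const
    {
        return dropped.count(name) != 0;
    }
};

// Guard as described in the review: only the old physical name is checked.
bool trustMarkerCheckingPartNameOnly(const ToyAlterConversions & conv, const std::string & requested)
{
    return !conv.isColumnDropped(conv.nameInPart(requested));
}

// Suggested direction: check the dropped state both before and after the rename mapping.
bool trustMarkerCheckingBothNames(const ToyAlterConversions & conv, const std::string & requested)
{
    return !conv.isColumnDropped(requested) && !conv.isColumnDropped(conv.nameInPart(requested));
}

void usageExample()
{
    ToyAlterConversions conv;
    conv.current_to_part_name["c"] = "b"; // pending RENAME COLUMN b TO c
    conv.dropped.insert("c");             // then DROP COLUMN c (recorded under the current name)

    assert(trustMarkerCheckingPartNameOnly(conv, "c"));  // stale marker for b is trusted
    assert(!trustMarkerCheckingBothNames(conv, "c"));    // marker is ignored, new c reads its DEFAULT
}
```

The same two-sided check would apply wherever the marker is carried forward, for example before a merge translates it to the current name.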

clickhouse-gh · 2026-06-30T14:56:37Z

LLVM Coverage Report

Changed C/C++ lines covered: 275/321 (85.67%)

When skip_empty_columns_on_insert is enabled and serialization_info_version is set to 'with_missing_columns', columns whose values are entirely type-defaults (zeros, empty strings, NULLs) are omitted from MergeTree parts at INSERT time. A structured 'missing_columns' marker in serialization.json records the frozen default for each omitted column, so a later ALTER MODIFY COLUMN ... DEFAULT does not retroactively change the inserted values.

Key components:
- IColumn::hasOnlyTypeDefaults() with optimized overrides (memoryIsZero, offsets check, null map check, etc.)
- skipEmptyColumnsOnInsert() in MergeTreeDataWriter filters columns using IDataType::getDefault() to match the read-path reconstruction
- SerializationInfoByName::MissingColumnInfo struct with TypeDefault and Expression (reserved for Phase 2) variants
- Read path (fillMissingColumns) consults the marker and fills type-default instead of evaluating the current DEFAULT expression
- Marker propagation through merges, mutations, and ALTER RENAME COLUMN
- Version gate: WITH_MISSING_COLUMNS = 2 in serialization_info_version

Fixes:
- Date32 type-default mismatch (getDefault() vs insertDefaultInto())
- Compact-part rename tracking for missing columns
- Expression markers rejected on read (fail closed until Phase 2)

Tests:
- 83 stateless test labels across 3 files (types, mutations, lifecycle)
- 6 integration tests (replication, mixed-version, restart, partitions)
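For readers unfamiliar with the all-defaults check, the idea behind hasOnlyTypeDefaults() can be sketched on plain standard-library containers. This is an assumption-laden illustration: it uses `std::vector` instead of IColumn, free functions instead of virtual overrides, and none of the optimized paths mentioned in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// UInt64-like column: the type-default is 0.
bool hasOnlyTypeDefaults(const std::vector<std::uint64_t> & column)
{
    return std::all_of(column.begin(), column.end(), [](std::uint64_t v) { return v == 0; });
}

// String-like column: the type-default is the empty string.
bool hasOnlyTypeDefaults(const std::vector<std::string> & column)
{
    return std::all_of(column.begin(), column.end(), [](const std::string & s) { return s.empty(); });
}

// Nullable-like column: the type-default is NULL.
bool hasOnlyTypeDefaults(const std::vector<std::optional<std::uint64_t>> & column)
{
    return std::all_of(column.begin(), column.end(),
                       [](const std::optional<std::uint64_t> & v) { return !v.has_value(); });
}

void usageExample()
{
    assert(hasOnlyTypeDefaults(std::vector<std::uint64_t>{0, 0, 0}));   // candidate for skipping
    assert(!hasOnlyTypeDefaults(std::vector<std::uint64_t>{0, 7, 0}));  // must be written
}
```

A column for which the check returns true is the kind that the INSERT path would leave out of the part and record in missing_columns.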

clickhouse-gh Bot added the pr-feature (Pull request with new product feature) label Mar 2, 2026

clickhouse-gh Bot reviewed Mar 16, 2026

Comment thread src/Storages/MergeTree/MergeTreeDataWriter.cpp Outdated

amosbird force-pushed the skip-empty-columns branch from facf467 to 92d5f86 Compare March 19, 2026 09:31

amosbird marked this pull request as ready for review March 19, 2026 23:54

Fgrtue self-assigned this Mar 20, 2026

Fgrtue requested a review from Copilot March 20, 2026 13:32

Copilot started reviewing on behalf of Fgrtue March 20, 2026 13:33

Copilot AI reviewed Mar 20, 2026

Fgrtue reviewed Mar 20, 2026

Comment thread src/Columns/IColumn.h Outdated

Comment thread src/Storages/MergeTree/MergeTreeDataWriter.cpp Outdated

amosbird force-pushed the skip-empty-columns branch from 08e6a6d to ec0cdad Compare March 23, 2026 17:43

clickhouse-gh Bot reviewed Mar 23, 2026

Comment thread src/Storages/MergeTree/MergeTreeSettings.cpp

Fgrtue reviewed Mar 25, 2026

Fgrtue reviewed Mar 26, 2026

clickhouse-gh Bot reviewed Mar 30, 2026

Comment thread src/Columns/ColumnString.h Outdated

CurtizJ reviewed Apr 20, 2026

clickhouse-gh Bot reviewed Apr 28, 2026

Comment thread src/DataTypes/Serializations/SerializationInfo.cpp Outdated

amosbird mentioned this pull request Apr 28, 2026

Changing the default expression for a column should materialize the old default in the old parts #92475

Open

groeneai mentioned this pull request Apr 30, 2026

Don't throw LOGICAL_ERROR from DataLakeConfiguration::getCatalog when parent DB is missing #103775

Merged

1 task

clickhouse-gh Bot reviewed Jun 15, 2026

Comment thread src/Storages/MergeTree/MergeTask.cpp Outdated

clickhouse-gh Bot reviewed Jun 17, 2026

Comment thread src/Storages/MergeTree/MergeTreeSettings.cpp Outdated

clickhouse-gh Bot reviewed Jun 17, 2026

Comment thread src/Storages/MergeTree/IMergeTreeReader.cpp Outdated

clickhouse-gh Bot reviewed Jun 17, 2026

Comment thread src/Storages/MergeTree/MergeTask.cpp Outdated

clickhouse-gh Bot reviewed Jun 17, 2026

clickhouse-gh Bot reviewed Jun 26, 2026

Comment thread src/Storages/MergeTree/MutateTask.cpp Outdated

clickhouse-gh Bot reviewed Jun 27, 2026

clickhouse-gh Bot reviewed Jun 29, 2026

Comment thread src/Storages/MergeTree/MergeTask.cpp

Comment thread src/DataTypes/Serializations/SerializationInfo.cpp Outdated

clickhouse-gh Bot reviewed Jun 30, 2026

This was referenced Jun 30, 2026

Fix CREATE OR REPLACE MATERIALIZED VIEW ... POPULATE losing the source subscription #108728

Open

Fix exception when querying a Hive table without a WHERE clause #108094

Open

amosbird force-pushed the skip-empty-columns branch from 3e5529b to 0aa7d4a Compare July 3, 2026 04:39

amosbird force-pushed the skip-empty-columns branch from 0aa7d4a to fc4513a Compare July 3, 2026 11:33

	{"skip_empty_columns_on_insert", false, false, "New setting to skip writing all-default columns on INSERT"},
	{"skip_empty_columns_on_insert", false, false, "New setting to skip writing all type default columns on INSERT"},

amosbird commented Mar 2, 2026 • edited

clickhouse-gh Bot commented Mar 2, 2026 • edited

amosbird commented Mar 20, 2026

Fgrtue commented Mar 20, 2026

Copilot AI left a comment

Copilot AI Mar 20, 2026

Fgrtue left a comment

amosbird commented Mar 25, 2026

Fgrtue commented Mar 25, 2026

amosbird commented Mar 25, 2026 • edited

Fgrtue commented Mar 25, 2026

amosbird commented Mar 25, 2026

Fgrtue left a comment

amosbird commented Mar 30, 2026

alexey-milovidov commented Apr 7, 2026

alexey-milovidov commented Apr 9, 2026

CurtizJ left a comment
