Fix logical error 'current_writer != nullptr' by vitlibar · Pull Request #102215 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix logical error 'current_writer != nullptr'#102215

Draft
vitlibar wants to merge 12 commits into
ClickHouse:masterfrom
vitlibar:fix-logical-error-current-writer-null
Draft

Fix logical error 'current_writer != nullptr'#102215
vitlibar wants to merge 12 commits into
ClickHouse:masterfrom
vitlibar:fix-logical-error-current-writer-null

Conversation

@vitlibar

@vitlibar vitlibar commented Apr 9, 2026

Copy link
Copy Markdown
Member

Replace the over-strict chassert(current_writer != nullptr) in MergeTreeDeduplicationLog with lazy initialization of the log writer (initCurrentWriter), so the writer is opened on demand from load, addPart, and dropPart. The write mode (append or rewrite) is chosen from whether the disk's metadata storage supports appending, which keeps the deduplication log working on storage policies that mix disks with different metadata types (e.g. a plain and a local object-storage metadata disk, as in the linked issue).

This is defensive hardening rather than a stable-release regression fix: the current_writer != nullptr logical error from the linked issue is no longer reproducible on current master (the existing MetadataStorageType::Plain handling in load already opens the writer), so there is no user-visible misbehavior in a stable release to announce — hence the Not for changelog category. The change removes the remaining code path through which the assertion could be reached and adds a regression test (test_merge_tree_deduplication_mixed_metadata) covering the storage-policy shape from the issue.

Closes: #86189

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Not for changelog (defensive hardening: the logical error is already unreachable on current master).

@vitlibar vitlibar added the 🍃 green ci 🌿 Fixing flaky tests in CI label Apr 9, 2026
@clickhouse-gh

clickhouse-gh Bot commented Apr 9, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-not-for-changelog This PR should not be mentioned in the changelog label Apr 9, 2026
@vitlibar vitlibar requested a review from tuanpach April 9, 2026 11:54
Comment thread src/Storages/MergeTree/MergeTreeDeduplicationLog.cpp Outdated
Comment thread src/Storages/MergeTree/MergeTreeDeduplicationLog.cpp Outdated
@vitlibar vitlibar marked this pull request as draft April 9, 2026 12:26
@vitlibar vitlibar force-pushed the fix-logical-error-current-writer-null branch from 6def6f6 to 40d854c Compare April 9, 2026 12:36
@vitlibar vitlibar requested a review from Copilot April 9, 2026 12:37
Comment thread src/Storages/MergeTree/MergeTreeDeduplicationLog.h Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes an assertion-triggering logical error in MergeTreeDeduplicationLog by ensuring the current log writer is initialized before writing, addressing the crash reported in #86189 for certain object-storage disk configurations.

Changes:

  • Introduce initCurrentWriter() to lazily open the deduplication log writer when needed.
  • Replace chassert(current_writer != nullptr) in addPart()/dropPart() with writer initialization.
  • Refactor writer finalization/cancellation handling in rotate() and reuse the new initializer from load()/rotate().

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/Storages/MergeTree/MergeTreeDeduplicationLog.h Declares initCurrentWriter() to centralize writer initialization.
src/Storages/MergeTree/MergeTreeDeduplicationLog.cpp Implements initCurrentWriter() and uses it to prevent null-writer assertions during INSERT/dedup log updates.

Comment thread src/Storages/MergeTree/MergeTreeDeduplicationLog.h Outdated
Comment thread src/Storages/MergeTree/MergeTreeDeduplicationLog.cpp Outdated
Comment thread src/Storages/MergeTree/MergeTreeDeduplicationLog.cpp
@alexey-milovidov

Copy link
Copy Markdown
Member

The test_database_iceberg_nessie_catalog and test_database_iceberg_lakekeeper_catalog failures in this PR's CI are false positives caused by a bug in PR #100583 (merged to master on 2026-04-09 at 11:19 UTC). That PR broke the metadata file path resolution in getMetadataLocation for Iceberg catalog tests — it produced a doubled S3 key like table_path/warehouse-rest/table_path/metadata/... instead of table_path/metadata/..., causing MinIO to return 403.

The bug was reverted by PR #102247 (merged at ~14:48 UTC). All nessie/lakekeeper test failures between these two merges are unrelated to this PR.

Comment thread src/Storages/MergeTree/MergeTreeDeduplicationLog.cpp
alexey-milovidov and others added 3 commits June 3, 2026 20:41
…CAL_ERROR message

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…licy

Reproduces the storage policy shape from issue ClickHouse#86189: the deduplication log
lives on the first disk of the policy (`metadata_type=plain`, which does not
support writing with append), while data parts are routed to a second
`local`-metadata disk. Verifies that `INSERT` with deduplication works.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Picked up this PR and pushed updates (b82c98c6432..11b4ee7484a):

While validating the test I confirmed (by building master's version of MergeTreeDeduplicationLog) that the historical current_writer != nullptr assertion is no longer reachable through this configuration on current master — the existing MetadataStorageType::Plain handling in load already opens the writer. So this PR is effectively defensive hardening (lazy initialization replacing an over-strict chassert), and the new test passes on both pre- and post-fix binaries. Details are in the resolved review threads.

Built on ARM and ran the new integration test — it passes.

@alexey-milovidov

Copy link
Copy Markdown
Member

Merged master again (11b4ee7484a..1fe71fcc2bd) — the branch was 731 commits / 3 days behind and the Report messages status was red. The merge was conflict-free and did not touch the PR's own files.

Verified MergeTreeDeduplicationLog.cpp still compiles cleanly against current master (no API drift over the merged commits). The diff is unchanged: lazy initCurrentWriter (replacing the over-strict chassert(current_writer != nullptr)) with append/rewrite mode chosen from disk_supports_writing_with_append, plus the test_merge_tree_deduplication_mixed_metadata integration test.

No unresolved review threads. This refreshes CI to clear the stale advisory status.

@alexey-milovidov

Copy link
Copy Markdown
Member

Merged master to refresh CI (branch was ~1 week behind the previous merge-base ab83938b7eb from 06-06, 2377 commits). The merge is clean and the PR diff is unchanged — only MergeTreeDeduplicationLog.cpp/.h and the integration test. New head: 96f52fa1b9b.

alexey-milovidov and others added 2 commits June 17, 2026 06:56
…urrent-writer-null

Bring the branch up to date with master to clear unrelated red CI:
the 'Fast test (arm_darwin)' failures in 01951_distributed_push_down_limit
and 01952_optimize_distributed_group_by_sharding_key were a distributed
query-plan regression on the old merge-base, fixed on master by the remote-plan
rework in ClusterProxy/executeQuery.cpp. Master did not touch the PR's files.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Reworked the PR description to resolve the AI Review's Request changes verdict, which flagged an internal inconsistency: the body presented this as a user-visible bug fix while the category was Not for changelog, and the new test passes on both pre- and post-fix binaries on current master.

The description now tells one consistent story matching the evidence: this is defensive hardening, not a stable-release regression fix. The current_writer != nullptr logical error from #86189 is no longer reachable on current master (the existing MetadataStorageType::Plain handling in load already opens the writer); this PR removes the remaining path to the over-strict chassert by lazily opening the writer via initCurrentWriter (append/rewrite mode chosen from disk_supports_writing_with_append) and adds the test_merge_tree_deduplication_mixed_metadata regression test for the storage-policy shape from the issue. Category stays Not for changelog for that reason.

Also merged master (f9d85066352..50357f557ca) — the branch was 708 commits behind and red. The merge is conflict-free and does not touch the PR's own files; the net diff against master is still exactly the two MergeTreeDeduplicationLog files and the integration test. This refreshes CI: the two failures on the previous run were unrelated — Stress test (arm_msan) hit Logical error: SortingAggregatedTransform already got bucket with number in MergingAggregatedMemoryEfficientTransform (distributed memory-efficient aggregation, not touched here), and BuzzHouse (arm_asan_ubsan) reported a generic ERROR.

@alexey-milovidov

Copy link
Copy Markdown
Member

Merged master again (50357f557ca..8fa43a81271) — the branch was 553 commits behind (merge-base 426e71331de from 2026-06-19) and red. The merge was conflict-free and does not touch the PR's own files; master had not modified MergeTreeDeduplicationLog.cpp/.h since the merge-base, so the net diff against master is unchanged — still exactly the two MergeTreeDeduplicationLog files (lazy initCurrentWriter replacing the over-strict chassert(current_writer != nullptr), with append/rewrite mode chosen from disk_supports_writing_with_append) plus the test_merge_tree_deduplication_mixed_metadata integration test. This refreshes CI.

All four failures on the previous run were unrelated, widespread flakes (each reproduces on master itself, i.e. with pull_request_number = 0, and across many other PRs):

No unresolved review threads; the AI Review Request changes verdict was already resolved by the earlier description rework. The change remains defensive hardening (Not for changelog).

@alexey-milovidov

Copy link
Copy Markdown
Member

Merged master again (8fa43a81271..dcb93e7f40f). The CI workflow had not triggered on the previous head 8fa43a81271 (pushed 2026-06-21) — no checks ran on it for over a day — so this merge both refreshes the branch (it was 624 commits behind) and re-triggers CI.

The merge is conflict-free and does not touch the PR's own files; master has not modified MergeTreeDeduplicationLog.cpp/.h since the merge-base, so the net diff against master is unchanged — still exactly the two MergeTreeDeduplicationLog files (lazy initCurrentWriter) plus the test_merge_tree_deduplication_mixed_metadata integration test.

@clickhouse-gh

clickhouse-gh Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.40% 85.40% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.60% 77.60% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 25/33 (75.76%) | Lost baseline coverage: none · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🍃 green ci 🌿 Fixing flaky tests in CI pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Logical error: 'current_writer != nullptr'

3 participants