
groeneai · 2026-04-16T00:03:04Z

The paranoid check on startup verifies that all ZK parts covered by another ZK part either exist on disk or are in the parts_to_fetch list. However, it missed a third legitimate case: a local active part already covers the ZK part.

This happens after a merge or mutation: the covering part is active locally, the covered part was cleaned from disk by clearOldPartsAndRemoveFromZK, but the covered part's ZK entry has not been cleaned up yet (ZK cleanup can lag behind disk cleanup, or the server was restarted between the two). The data is preserved in the local covering part, and the stale ZK entry will be removed by the cleanup thread after startup.
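To make "covering" concrete, here is a minimal sketch of the containment test on part names such as `all_0_2_1` and `all_0_0_0`. It assumes a simplified name format `<partition>_<min_block>_<max_block>_<level>`, with no underscore in the partition ID and no mutation suffix. `PartInfo`, `parsePartName`, and `covers` are hypothetical helpers written for illustration, not the ClickHouse implementation.

```cpp
#include <cstdint>
#include <exception>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified model of a part name: "<partition>_<min>_<max>_<level>".
// Assumption: the partition ID has no underscore and there is no mutation suffix.
struct PartInfo
{
    std::string partition_id;
    int64_t min_block = 0;
    int64_t max_block = 0;
    uint32_t level = 0;
};

// Splits the name on '_' and converts the numeric fields.
// Returns std::nullopt when the name does not have exactly four fields
// or a numeric field cannot be parsed.
std::optional<PartInfo> parsePartName(const std::string & name)
{
    std::vector<std::string> fields;
    std::stringstream stream(name);
    std::string field;
    while (std::getline(stream, field, '_'))
        fields.push_back(field);

    if (fields.size() != 4)
        return std::nullopt;

    try
    {
        PartInfo info;
        info.partition_id = fields[0];
        info.min_block = static_cast<int64_t>(std::stoll(fields[1]));
        info.max_block = static_cast<int64_t>(std::stoll(fields[2]));
        info.level = static_cast<uint32_t>(std::stoul(fields[3]));
        return info;
    }
    catch (const std::exception &)
    {
        return std::nullopt;
    }
}

// `outer` covers `inner` when both are in the same partition, the block range
// of `outer` includes the block range of `inner`, and `outer` has at least
// the same level.
bool covers(const PartInfo & outer, const PartInfo & inner)
{
    return outer.partition_id == inner.partition_id
        && outer.min_block <= inner.min_block
        && outer.max_block >= inner.max_block
        && outer.level >= inner.level;
}
```

Under these assumptions, `all_0_2_1` covers both `all_0_0_0` and `all_1_1_0`, while the reverse does not hold, which is the relationship the paranoid check reasons about.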

Without this fix, the chassert(false) fires as a false positive in debug/sanitizer builds during stress tests, specifically in the createLogEntriesToFetchBrokenParts() → paranoidCheckForCoveredPartsInZooKeeperOnStart() call path.

CI evidence (STID: 2508-5dc3, 2508-6644):

- 14+ hits in 30 days across TSAN/MSAN/UBSAN stress tests on master and unrelated PRs
- Always the same stack: createLogEntriesToFetchBrokenParts → paranoidCheckForCoveredPartsInZooKeeperOnStart → chassert(false) at line 1839
- Prior fixes addressed other false-positive scenarios in this function:
  - PR #96164 (Fix race between INSERT and DROP_RANGE causing LOGICAL_ERROR in paranoid check of covered parts): fixed the race between INSERT and DROP_RANGE
  - PR #96705 (Fix race between async outdated parts loading and paranoid check on startup): skipped the check when outdated parts are still loading asynchronously
  - PR #100652 (Fix false-positive assertion in paranoid check for covered parts on startup): fixed the empty parts_to_fetch passed from createLogEntriesToFetchBrokenParts
- None covered the case where a local active part covers the ZK part

Fix: Before asserting, check if getActiveContainingPart(part_name) returns a non-null result. If a local active part covers the ZK part, the data is safe and the assertion should not fire. The assertion is preserved for genuine data loss scenarios (no local covering part, no disk copy, not being fetched).

Changelog category (leave one):

Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix a false-positive Logical error: 'false' assertion in paranoidCheckForCoveredPartsInZooKeeperOnStart that could crash the server during ReplicatedMergeTree startup in debug/sanitizer builds. The check now correctly recognizes that a covered ZK part whose data is preserved in a local active covering part is a legitimate state, not a data loss scenario.

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

…rOnStart

The paranoid check on startup verifies that all ZK parts covered by another ZK part either exist on disk or are in the parts_to_fetch list. However, it does not check whether a local active part already covers the ZK part.

This legitimately happens after a merge or mutation: the covering part is active locally, the covered part was cleaned from disk by clearOldPartsAndRemoveFromZK, but the covered part's ZK entry has not been cleaned up yet (ZK cleanup can lag behind disk cleanup, or the server was restarted between the two). The data is preserved in the local covering part, and the stale ZK entry will be removed by the cleanup thread after startup.

Without this fix, the chassert(false) fires as a false positive in debug/sanitizer builds during stress tests, particularly in the createLogEntriesToFetchBrokenParts() call path (STID: 2508-5dc3, 2508-6644).

Prior fixes PR ClickHouse#96164, ClickHouse#96705, and ClickHouse#100652 addressed other false-positive scenarios in this function but missed the case where a local active part covers the ZK part.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

groeneai · 2026-04-16T00:03:25Z

Pre-PR Validation Gate (session: cron:clickhouse-ci-task-worker:20260415-231500)

a) Deterministic repro?
This is a startup race condition in ReplicatedMergeTree that depends on specific ZK state (covered parts with stale entries). It fires sporadically during stress tests (14+ hits in 30 days) and cannot be triggered deterministically from a test script — the same limitation that applied to prior fixes PR #96164, #96705, and #100652. The "reproduction" is the 14+ CI hits across unrelated PRs and master confirming this is a real, persistent trunk bug.

b) Root cause explained?
During startup, createLogEntriesToFetchBrokenParts() calls paranoidCheckForCoveredPartsInZooKeeperOnStart(). The function checks for ZK parts that are covered by another ZK part but don't exist on disk and aren't in parts_to_fetch. However, there's a legitimate case: after a merge creates covering part C, the covered part P is cleaned from disk by clearOldPartsAndRemoveFromZK, but P's ZK entry persists (ZK cleanup lags behind disk cleanup or server was restarted between the two). P is covered locally by C, so getActiveContainingPart(P) returns C, meaning P is NOT in parts_to_fetch. But P also doesn't exist on disk. The function doesn't check for local active covering parts, so chassert(false) fires — a false positive, since the data IS safe in C.

c) Fix matches root cause?
Yes. Adding getActiveContainingPart(part_name) != nullptr check directly addresses the missing case. The assertion is preserved for genuine data loss (no local covering part, no disk copy, not being fetched).

d) Test intent preserved?
Yes. The paranoid check still catches genuine scenarios where a covered ZK part has no local copy anywhere. The fix only suppresses the assertion when the data IS provably safe (covered by a local active part).

e) Both directions demonstrated?
For startup race conditions, the "without fix" direction is demonstrated by 14+ CI crashes in 30 days. The "with fix" direction will be validated by CI stress tests on this PR. The code analysis was verified by an agent that confirmed: const-safety, no deadlock risk, correct scope (non-startup version doesn't need fixing).

f) Fix is general, not a narrow patch?
Yes. The fix is inside paranoidCheckForCoveredPartsInZooKeeperOnStart itself, so it covers ALL call sites (both checkPartsImpl and createLogEntriesToFetchBrokenParts). The non-startup version (paranoidCheckForCoveredPartsInZooKeeper, static function at line ~2757) was verified to NOT need this fix — it only queries ZK and never accesses local data_parts.

groeneai · 2026-04-16T00:03:48Z

cc @tavplubix @alesapin — could you review this? It fixes a false-positive chassert(false) in paranoidCheckForCoveredPartsInZooKeeperOnStart that crashes debug/sanitizer builds during stress tests (14+ hits in 30 days, STID 2508-5dc3 and 2508-6644). The fix adds a getActiveContainingPart() check so the assertion doesn't fire when a local active part already covers the stale ZK entry — a legitimate post-merge state that prior fixes (#96164, #96705, #100652) didn't handle.

clickhouse-gh · 2026-04-16T09:21:38Z

Workflow [PR], commit [221f150]

Summary: ❌

job_name	test_name	status	info	comment
Finish Workflow		FAIL
	python3 ./ci/jobs/scripts/workflow_hooks/new_tests_check.py	FAIL

AI Review

Summary

This PR fixes a debug/sanitizer-only false-positive assertion in paranoidCheckForCoveredPartsInZooKeeperOnStart by treating a locally present covering ZooKeeper part as a valid state when the covered part entry is stale in ZooKeeper. The change is targeted, keeps the assertion for genuine missing-data scenarios, and looks correct.

ClickHouse Rules

Item	Status	Notes
Deletion logging	➖
Serialization versioning	➖
Core-area scrutiny	✅
No test removal	✅
Experimental gate	➖
No magic constants	✅
Backward compatibility	✅
`SettingsChangesHistory.cpp`	➖
PR metadata quality	✅
Safe rollout	✅
Compilation time	✅
No large/binary files	✅

Final Verdict

Status: ✅ Approve

tavplubix · 2026-04-17T14:05:09Z

> The data is preserved in the local covering part

@groeneai, this is not true. Since the local part is not in zk, this part is going to be removed as unexpected

alexey-milovidov · 2026-04-25T15:45:34Z

@groeneai, continue on this PR according to the reviewer feedback.

@tavplubix

Per @tavplubix's feedback, replace the `getActiveContainingPart()` check with a direct disk-existence check on `covering_part`. The covering part is a ZooKeeper part by construction (it comes from `active_set` built from `parts_in_zk`), so this addresses his concern that a local-only part is treated as `unexpected` at load time and detached as `ignored` later in `checkPartsImpl`.

Why the previous check was nonetheless safe: `getActiveContainingPart` returns from `data_parts_by_state_and_info`, which only contains parts loaded as `expected` (in ZooKeeper at load time, per `StorageReplicatedMergeTree.cpp:507-549,561` populating `expected_parts_on_this_replica` and `MergeTreeData.cpp:2330-2333` splitting parts on disk into expected vs unexpected at load). Parts on disk not in ZK at load go to a separate `unexpected_data_parts` vector and are detached as `ignored` (`StorageReplicatedMergeTree.cpp:2047`); they never enter `data_parts_by_state_and_info` and thus are not returned by `getActiveContainingPart`. So the original fix could not be fooled by an unexpected part.

The new check is nonetheless cleaner and more direct: if the ZK covering part is on disk, the covered part's data is provably preserved (covering parts contain all data of the parts they cover by definition).

ClickHouse#102864

groeneai · 2026-04-25T16:32:54Z

@tavplubix thanks for the pushback — you're right that a local part not in ZooKeeper would be detached as unexpected. Re-reading the path I had to convince myself that wasn't what getActiveContainingPart could return; it isn't, but the dependency was implicit. I've pushed 221f150 which avoids the question entirely by checking the ZK covering part on disk directly, instead of any local-only part.

Why the original `getActiveContainingPart` check was nonetheless safe

getActiveContainingPart searches data_parts_by_state_and_info (MergeTreeData.cpp:5914-5938) — that index is populated only from parts loaded as expected. Parts on disk that are not in expected_parts_on_this_replica (built from replica_path/parts in StorageReplicatedMergeTree.cpp:507-549) are routed to a separate unexpected_data_parts vector at MergeTreeData.cpp:2330-2333, loaded asynchronously by loadUnexpectedDataParts, and detached as ignored later in checkPartsImpl (StorageReplicatedMergeTree.cpp:2047). They never enter data_parts_by_state_and_info, so getActiveContainingPart cannot return one. The "removed as unexpected" failure mode you described therefore couldn't apply to a part returned by that call.
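The invariant described above can be illustrated with a small model. This is only a sketch under stated assumptions: `Part`, `LoadedParts`, `classifyOnLoad`, and `findContainingPart` are hypothetical names, the two vectors merely stand in for `data_parts_by_state_and_info` and `unexpected_data_parts`, and the real loading code is considerably more involved.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a data part: a name plus its block range.
struct Part
{
    std::string name;
    int min_block;
    int max_block;
};

// Two separate collections, mirroring the split described in the comment above.
struct LoadedParts
{
    std::vector<Part> expected;   // stands in for data_parts_by_state_and_info
    std::vector<Part> unexpected; // stands in for unexpected_data_parts
};

// At load time, a part found on disk is "expected" only if its name is
// listed in ZooKeeper; everything else goes to the unexpected collection.
LoadedParts classifyOnLoad(const std::vector<Part> & on_disk, const std::set<std::string> & names_in_zk)
{
    LoadedParts result;
    for (const Part & part : on_disk)
    {
        if (names_in_zk.count(part.name) > 0)
            result.expected.push_back(part);
        else
            result.unexpected.push_back(part);
    }
    return result;
}

// Looks for a containing part in the expected collection only, so an
// unexpected (local-only) part can never be returned.
std::optional<std::string> findContainingPart(const LoadedParts & loaded, int min_block, int max_block)
{
    for (const Part & part : loaded.expected)
    {
        if (part.min_block <= min_block && part.max_block >= max_block)
            return part.name;
    }
    return std::nullopt;
}
```

In this model, a merged part that is on disk but absent from ZooKeeper lands in `unexpected`, and the lookup for the block range of a covered part returns nothing. That is the property the argument relies on.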

Why I switched anyway

That argument is valid but indirect — it depends on a non-obvious invariant (load-time classification, separate collections, async unexpected loading, detach in checkPartsImpl). The new check is direct: covering_part is a ZooKeeper part by construction (it comes from active_set built from parts_in_zk), so checking whether covering_part exists on disk is exactly the question "is the data preserved in the covering part?" — no reasoning about local-only parts needed.

Updated check (`StorageReplicatedMergeTree.cpp:1829-1853`)

/// Check if the ZooKeeper covering part itself exists on disk locally.
/// If it does, the covered part's data is preserved in the covering part:
/// a merge or mutation produced `covering_part`, then `part_name` was cleaned
/// from disk by `clearOldPartsAndRemoveFromZK`, but its ZK entry has not
/// been cleaned up yet (ZK cleanup can lag behind disk cleanup, or the
/// server was restarted between the two). The stale ZK entry will be
/// removed by the cleanup thread after startup.
if (!found)
{
    for (const DiskPtr & disk : disks)
    {
        if (disk->existsDirectory(fs::path(path) / covering_part))
        {
            found = true;
            break;
        }
    }
}

Scenario walk-through

After a merge, once clearOldPartsAndRemoveFromZK has removed the covered parts from disk but before their ZK entries are cleaned up:

Part	In ZK?	On disk?
`covering_part` (e.g. `all_0_2_1`)	yes	yes (active local)
`part_name` (e.g. `all_0_0_0`)	yes — stale entry, cleanup thread hasn't run	no — cleaned
`part_name` (e.g. `all_1_1_0`)	yes — stale entry	no — cleaned

Old logic: part_name not on disk, not in parts_to_fetch (because getActiveContainingPart(part_name) returns the local copy of covering_part, so it's excluded from the fetch list at the call site) → chassert(false) fires.

New logic: same as above, then we additionally check whether covering_part exists on disk. It does, because the merge result is still there — so we take the legitimate-state branch, log a WARNING, and skip the chassert.

True data loss path is unchanged: if the covering part is genuinely missing on disk, found stays false and the chassert fires.
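The three outcomes in this walk-through can be written as a small decision function. This is a sketch of the logic as described in this comment, not the actual code: `LocalState`, `CheckResult`, and `checkCoveredPart` are hypothetical names, and plain string sets stand in for the disk contents and the `parts_to_fetch` list.

```cpp
#include <set>
#include <string>

// Hypothetical outcome of the startup check for one covered ZK part.
enum class CheckResult
{
    Ok,             // part is on disk or scheduled to be fetched
    StaleZkEntry,   // part is gone, but its ZK covering part is on disk
    AssertionFires  // neither the part nor its covering part is available
};

// Hypothetical snapshot of the replica's local state at startup.
struct LocalState
{
    std::set<std::string> on_disk;
    std::set<std::string> parts_to_fetch;
};

// Mirrors the order of checks described above: first the original two
// conditions, then the added disk-existence check on the covering part.
CheckResult checkCoveredPart(const LocalState & state, const std::string & part_name, const std::string & covering_part)
{
    if (state.on_disk.count(part_name) > 0 || state.parts_to_fetch.count(part_name) > 0)
        return CheckResult::Ok;

    if (state.on_disk.count(covering_part) > 0)
        return CheckResult::StaleZkEntry;

    return CheckResult::AssertionFires;
}
```

With `all_0_2_1` on disk and an empty fetch list, both `all_0_0_0` and `all_1_1_0` fall into the stale-entry branch. With nothing on disk, the same inputs reach the assertion branch, matching the unchanged data-loss path.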

Pre-PR validation gate (session: cron:clickhouse-ci-task-worker:20260425-161500)

a) Deterministic repro? Same startup race as before (14+ CI hits in 30 days, STIDs 2508-5dc3 / 2508-6644). No new local repro for this iteration; the change is a strict tightening of the guard.
b) Root cause explained? Yes — the chassert did not consider the case where the ZK covering part exists on disk. parts_to_fetch cannot reach it because the call site excludes covered parts whose covering part is locally present (see ReplicatedMergeTreeQueue.cpp:243-246 and StorageReplicatedMergeTree.cpp:1888-1891).
c) Fix matches root cause? Yes — directly checks the missing case using the ZK covering part name.
d) Test intent preserved? Yes — the assertion still fires for genuine data loss (no part_name on disk, no covering_part on disk, not in parts_to_fetch).
e) Both directions demonstrated? Without the fix: 14+ CI master hits over 30 days. With the fix: CI on this PR.
f) Fix is general, not a narrow patch? Yes — fix is inside paranoidCheckForCoveredPartsInZooKeeperOnStart itself, applying to both call sites (checkPartsImpl and createLogEntriesToFetchBrokenParts).

clickhouse-gh · 2026-04-25T19:29:30Z

LLVM Coverage Report

Metric	Baseline	Current	Δ
Lines	83.70%	83.90%	+0.20%
Functions	91.10%	91.10%	+0.00%
Branches	76.20%	76.30%	+0.10%

Changed lines: 16.67% (3/18) | lost baseline coverage: 26 line(s) · Uncovered code

Full report · Diff report

groeneai · 2026-05-12T03:30:38Z

nikitamikhaylov added the `can be tested` label (Allows running workflows for external contributors) on Apr 16, 2026

clickhouse-gh Bot added the `pr-bugfix` label (Pull request with bugfix, not backported by default) on Apr 16, 2026

groeneai mentioned this pull request on Apr 16, 2026: "CI: add experimental serverfuzz stress test and BuzzHouse jobs" #101399 (Merged)

groeneai mentioned this pull request on May 10, 2026: "Fix false-positive covered part startup check" #104516 (Merged, 1 task)

groeneai closed this on May 12, 2026

Fix false-positive chassert in paranoidCheckForCoveredPartsInZooKeeperOnStart #102864

groeneai wants to merge 2 commits into ClickHouse:master from groeneai:fix-paranoid-check-covered-parts
