Fix false-positive chassert in paranoidCheckForCoveredPartsInZooKeeperOnStart by groeneai · Pull Request #102864 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix false-positive chassert in paranoidCheckForCoveredPartsInZooKeeperOnStart#102864

Closed
groeneai wants to merge 2 commits into
ClickHouse:masterfrom
groeneai:fix-paranoid-check-covered-parts
Closed

Fix false-positive chassert in paranoidCheckForCoveredPartsInZooKeeperOnStart#102864
groeneai wants to merge 2 commits into
ClickHouse:masterfrom
groeneai:fix-paranoid-check-covered-parts

Conversation

@groeneai

Copy link
Copy Markdown
Contributor

The paranoid check on startup verifies that all ZK parts covered by another ZK part either exist on disk or are in the parts_to_fetch list. However, it missed a third legitimate case: a local active part already covers the ZK part.

This happens after a merge or mutation: the covering part is active locally, the covered part was cleaned from disk by clearOldPartsAndRemoveFromZK, but the covered part's ZK entry has not been cleaned up yet (ZK cleanup can lag behind disk cleanup, or the server was restarted between the two). The data is preserved in the local covering part, and the stale ZK entry will be removed by the cleanup thread after startup.

Without this fix, the chassert(false) fires as a false positive in debug/sanitizer builds during stress tests, specifically in the createLogEntriesToFetchBrokenParts()paranoidCheckForCoveredPartsInZooKeeperOnStart() call path.

CI evidence (STID: 2508-5dc3, 2508-6644):

Fix: Before asserting, check if getActiveContainingPart(part_name) returns a non-null result. If a local active part covers the ZK part, the data is safe and the assertion should not fire. The assertion is preserved for genuine data loss scenarios (no local covering part, no disk copy, not being fetched).

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix a false-positive Logical error: 'false' assertion in paranoidCheckForCoveredPartsInZooKeeperOnStart that could crash the server during ReplicatedMergeTree startup in debug/sanitizer builds. The check now correctly recognizes that a covered ZK part whose data is preserved in a local active covering part is a legitimate state, not a data loss scenario.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

…rOnStart

The paranoid check on startup verifies that all ZK parts covered by another
ZK part either exist on disk or are in the parts_to_fetch list. However, it
does not check whether a local active part already covers the ZK part.

This legitimately happens after a merge or mutation: the covering part is
active locally, the covered part was cleaned from disk by
clearOldPartsAndRemoveFromZK, but the covered part's ZK entry has not been
cleaned up yet (ZK cleanup can lag behind disk cleanup, or the server was
restarted between the two). The data is preserved in the local covering
part, and the stale ZK entry will be removed by the cleanup thread after
startup.

Without this fix, the chassert(false) fires as a false positive in
debug/sanitizer builds during stress tests, particularly in the
createLogEntriesToFetchBrokenParts() call path (STID: 2508-5dc3,
2508-6644). Prior fixes PR ClickHouse#96164, ClickHouse#96705, and ClickHouse#100652 addressed other
false-positive scenarios in this function but missed the case where a local
active part covers the ZK part.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@groeneai

Copy link
Copy Markdown
Contributor Author

@groeneai

Copy link
Copy Markdown
Contributor Author

cc @tavplubix @alesapin — could you review this? It fixes a false-positive chassert(false) in paranoidCheckForCoveredPartsInZooKeeperOnStart that crashes debug/sanitizer builds during stress tests (14+ hits in 30 days, STID 2508-5dc3 and 2508-6644). The fix adds a getActiveContainingPart() check so the assertion doesn't fire when a local active part already covers the stale ZK entry — a legitimate post-merge state that prior fixes (#96164, #96705, #100652) didn't handle.

@nikitamikhaylov nikitamikhaylov added the can be tested Allows running workflows for external contributors label Apr 16, 2026
@clickhouse-gh

clickhouse-gh Bot commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [221f150]

Summary:

job_name test_name status info comment
Finish Workflow FAIL
python3 ./ci/jobs/scripts/workflow_hooks/new_tests_check.py FAIL

AI Review

Summary

This PR fixes a debug/sanitizer-only false-positive assertion in paranoidCheckForCoveredPartsInZooKeeperOnStart by treating a locally present covering ZooKeeper part as a valid state when the covered part entry is stale in ZooKeeper. The change is targeted, keeps the assertion for genuine missing-data scenarios, and looks correct.

ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
No large/binary files
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Apr 16, 2026
@tavplubix

Copy link
Copy Markdown
Member

The data is preserved in the local covering part

@groeneai, this is not true. Since the local part is not in zk, this part is going to be removed as unexpected

@alexey-milovidov

Copy link
Copy Markdown
Member

@groeneai, continue on this PR according to the reviewer feedback.

Per @tavplubix's feedback, replace the `getActiveContainingPart()` check
with a direct disk-existence check on `covering_part`. The covering part
is a ZooKeeper part by construction (it comes from `active_set` built
from `parts_in_zk`), so this addresses his concern that a local-only
part is treated as `unexpected` at load time and detached as `ignored`
later in `checkPartsImpl`.

Why the previous check was nonetheless safe: `getActiveContainingPart`
returns from `data_parts_by_state_and_info`, which only contains parts
loaded as `expected` (in ZooKeeper at load time, per
`StorageReplicatedMergeTree.cpp:507-549,561` populating
`expected_parts_on_this_replica` and `MergeTreeData.cpp:2330-2333`
splitting parts on disk into expected vs unexpected at load). Parts on
disk not in ZK at load go to a separate `unexpected_data_parts` vector
and are detached as `ignored` (`StorageReplicatedMergeTree.cpp:2047`);
they never enter `data_parts_by_state_and_info` and thus are not
returned by `getActiveContainingPart`. So the original fix could not be
fooled by an unexpected part.

The new check is nonetheless cleaner and more direct: if the ZK covering
part is on disk, the covered part's data is provably preserved (covering
parts contain all data of the parts they cover by definition).

ClickHouse#102864
@groeneai

Copy link
Copy Markdown
Contributor Author

@tavplubix thanks for the pushback — you're right that a local part not in ZooKeeper would be detached as unexpected. Re-reading the path I had to convince myself that wasn't what getActiveContainingPart could return; it isn't, but the dependency was implicit. I've pushed 221f150 which avoids the question entirely by checking the ZK covering part on disk directly, instead of any local-only part.

Why the original getActiveContainingPart check was nonetheless safe

getActiveContainingPart searches data_parts_by_state_and_info (MergeTreeData.cpp:5914-5938) — that index is populated only from parts loaded as expected. Parts on disk that are not in expected_parts_on_this_replica (built from replica_path/parts in StorageReplicatedMergeTree.cpp:507-549) are routed to a separate unexpected_data_parts vector at MergeTreeData.cpp:2330-2333, loaded asynchronously by loadUnexpectedDataParts, and detached as ignored later in checkPartsImpl (StorageReplicatedMergeTree.cpp:2047). They never enter data_parts_by_state_and_info, so getActiveContainingPart cannot return one. The "removed as unexpected" failure mode you described therefore couldn't apply to a part returned by that call.

Why I switched anyway

That argument is valid but indirect — it depends on a non-obvious invariant (load-time classification, separate collections, async unexpected loading, detach in checkPartsImpl). The new check is direct: covering_part is a ZooKeeper part by construction (it comes from active_set built from parts_in_zk), so checking whether covering_part exists on disk is exactly the question "is the data preserved in the covering part?" — no reasoning about local-only parts needed.

Updated check (StorageReplicatedMergeTree.cpp:1829-1853)

/// Check if the ZooKeeper covering part itself exists on disk locally.
/// If it does, the covered part's data is preserved in the covering part:
/// a merge or mutation produced `covering_part`, then `part_name` was cleaned
/// from disk by `clearOldPartsAndRemoveFromZK`, but its ZK entry has not
/// been cleaned up yet (ZK cleanup can lag behind disk cleanup, or the
/// server was restarted between the two). The stale ZK entry will be
/// removed by the cleanup thread after startup.
if (!found)
{
    for (const DiskPtr & disk : disks)
    {
        if (disk->existsDirectory(fs::path(path) / covering_part))
        {
            found = true;
            break;
        }
    }
}

Scenario walk-through

After a merge in clearOldPartsAndRemoveFromZK:

Part In ZK? On disk?
covering_part (e.g. all_0_2_1) yes yes (active local)
part_name (e.g. all_0_0_0) yes — stale entry, cleanup thread hasn't run no — cleaned
part_name (e.g. all_1_1_0) yes — stale entry no — cleaned

Old logic: part_name not on disk, not in parts_to_fetch (because getActiveContainingPart(part_name) returns the local copy of covering_part, so it's excluded from the fetch list at the call site) → chassert(false) fires.

New logic: same as above, then we additionally check whether covering_part exists on disk. It does, because the merge result is still there — so we take the legitimate-state branch, log a WARNING, and skip the chassert.

True data loss path is unchanged: if the covering part is genuinely missing on disk, found stays false and the chassert fires.

Pre-PR validation gate (session: cron:clickhouse-ci-task-worker:20260425-161500)

a) Deterministic repro? Same startup race as before (14+ CI hits in 30 days, STIDs 2508-5dc3 / 2508-6644). No new local repro for this iteration; the change is a strict tightening of the guard.
b) Root cause explained? Yes — the chassert did not consider the case where the ZK covering part exists on disk. parts_to_fetch cannot reach it because the call site excludes covered parts whose covering part is locally present (see ReplicatedMergeTreeQueue.cpp:243-246 and StorageReplicatedMergeTree.cpp:1888-1891).
c) Fix matches root cause? Yes — directly checks the missing case using the ZK covering part name.
d) Test intent preserved? Yes — the assertion still fires for genuine data loss (no part_name on disk, no covering_part on disk, not in parts_to_fetch).
e) Both directions demonstrated? Without the fix: 14+ CI master hits over 30 days. With the fix: CI on this PR.
f) Fix is general, not a narrow patch? Yes — fix is inside paranoidCheckForCoveredPartsInZooKeeperOnStart itself, applying to both call sites (checkPartsImpl and createLogEntriesToFetchBrokenParts).

@clickhouse-gh

clickhouse-gh Bot commented Apr 25, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 83.70% 83.90% +0.20%
Functions 91.10% 91.10% +0.00%
Branches 76.20% 76.30% +0.10%

Changed lines: 16.67% (3/18) | lost baseline coverage: 26 line(s) · Uncovered code

Full report · Diff report

@groeneai

Copy link
Copy Markdown
Contributor Author

@groeneai groeneai closed this May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants