Remove redundant checksum calculation during vertical merge of parts with projections#92866
Conversation
tiandiwonder
left a comment
There was a problem hiding this comment.
fix test failure: 01710_projection_fetch_long
…-in-projection-1 # Conflicts: # src/Storages/MergeTree/MergedColumnOnlyOutputStream.cpp
The previous commit removed the projection-checksums loop from `MergedColumnOnlyOutputStream::fillChecksums`. That removal is correct for the vertical merge path (where `MergedBlockOutputStream::finalizePartAsync` re-adds projection checksums later), but for `MutateSomePartColumnsTask` there is no subsequent step that adds them, so the resulting `checksums.txt` was missing `*.proj` entries for projections rebuilt during a mutation (e.g. `MATERIALIZE PROJECTION`). This caused stateless test `01710_projection_fetch_long` to fail because the replicated part's checksums no longer contained the projection entry after `materialize projection`, and replicas observed inconsistent state. Restore the loop in the mutation finalize path so that projection checksums end up in `new_data_part->checksums` before `finalizeMutatedPart` writes them to disk. CI: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=92866&sha=0e81c92c1f24f056d3672f0e1f34603dffb55596&name_0=PR PR: ClickHouse#92866
|
@groeneai, provide a fix for |
|
@alexey-milovidov @amosbird — Cross-PR data (CIDB, last 30 days)The same test fails on 21 distinct PRs plus 3 master commits in just 2 days:
This PR (#92866) modifies only Root causeThe test was added in #104313 to cover #104307. It drives the bug condition through Fix is already in flightPR #104418 ("Fix flaky FileCacheTest.SLRUFreeSpaceKeepingProtectedOnly", open since 2026-05-08) replaces the racy integration test with a focused unit-level test that calls @kssenii was already cc'd as the filesystem-cache owner. PR is TL;DR for #92866Re-running CI report for the failure on this PR: (session: cron:clickhouse-ci-task-worker:20260510-004500) |
|
CI status note for the two non-success checks on commit
|
|
Merged master into the branch to re-run CI. The two failures look unrelated to this PR:
The PR itself only touches projection-checksum bookkeeping in vertical merge / mutation, with no parser/formatter changes — it cannot affect AST roundtrip behavior. |
Workflow files in this branch were stale or had been deleted as a workaround for GitHub's workflow-scope check. Restoring them to match `master` so the PR diff no longer contains those unrelated changes.
|
Merged the latest |
LLVM Coverage Report
Changed lines: Changed C/C++ lines covered by tests: 14/16 (87.50%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 1 line(s) · Uncovered code |

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
.
Documentation entry for user-facing changes
Note
Medium Risk
Touches MergeTree part checksum generation around projections, which is integrity-critical; a missed or duplicated checksum could surface as corrupted parts during mutation/merge workflows.
Overview
Removes redundant projection checksum bookkeeping during column-only writes.
MergedColumnOnlyOutputStream::fillChecksumsno longer adds projection-part entries to the part checksums, avoiding duplicate work during vertical merges where projections are accounted for later.Preserves mutation correctness by adding projection checksums at mutation finalize time.
MutateTasknow explicitly appends checksums for rebuilt projection parts afterfillChecksums, ensuring mutations that rebuild projections still persist the required*.projchecksum records.Reviewed by Cursor Bugbot for commit debe999. Bugbot is set up for automated code reviews on this repo. Configure here.
Version info
26.6.1.491