Fix `PartLoadingTree` to correctly identify the rolled-back part when intersecting parts have transaction metadata by tuanpach · Pull Request #100992 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix PartLoadingTree to correctly identify the rolled-back part when intersecting parts have transaction metadata#100992

Open
tuanpach wants to merge 14 commits into
ClickHouse:masterfrom
tuanpach:fix-part-loading-tree-rollback-detection
Open

Fix PartLoadingTree to correctly identify the rolled-back part when intersecting parts have transaction metadata#100992
tuanpach wants to merge 14 commits into
ClickHouse:masterfrom
tuanpach:fix-part-loading-tree-rollback-detection

Conversation

@tuanpach

@tuanpach tuanpach commented Mar 28, 2026

Copy link
Copy Markdown
Member

During loadDataParts, PartLoadingTree::add builds a containment tree of on-disk parts sorted by (level, mutation) descending. When a rolled-back part already in the tree intersects an incoming committed part, the previous implementation had two bugs:

Bug 1 — wrong skip direction. The old code (has_transaction_metadata) checked whether either intersecting part has a txn_version.txt file and unconditionally skipped the incoming part. When the tree-resident part was the rolled-back one, the committed incoming part was silently discarded instead of replacing it.

Bug 2 — orphans dropped after eviction. When the rolled-back resident is evicted, its children become orphaned. They must be re-inserted via add() — together with the incoming committed part — sorted by (level, mutation) descending so that the incoming part is placed into the tree before any orphan that belongs inside it. Without this ordering an orphan sitting at sibling level causes a spurious LOGICAL_ERROR when the incoming part is re-inserted.

The LOGICAL_ERROR that caused the fatal in the stress test CI was triggered because the old helper (has_transaction_metadata) returned bool — a plain file-existence check. It could not distinguish between a definitively rolled-back part, an in-flight transaction (Tx::UnknownCSN), and a read error (e.g. an empty file read between shell truncate and write). When neither part was definitively rolled-back the code fell through to LOGICAL_ERROR, which aborts the server in sanitizer builds. The new enum class RollbackStatus with five states (RolledBack, Committed, NoMetadata, UnknownCSN, Unreadable) lets each case be handled precisely.

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixed a server abort (in sanitizer builds) and silent data loss during table startup when a rolled-back transactional MergeTree part intersects a committed part. PartLoadingTree::add now correctly identifies which of the two intersecting parts was rolled back, evicts it, and re-parents its orphaned children under the committed part in the correct order.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@tuanpach tuanpach added the can be tested Allows running workflows for external contributors label Mar 28, 2026
@clickhouse-gh

clickhouse-gh Bot commented Mar 28, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Mar 28, 2026
Comment thread src/Storages/MergeTree/tests/gtest_part_loading_tree.cpp Outdated
@tuanpach tuanpach force-pushed the fix-part-loading-tree-rollback-detection branch from c1fb07d to f88571b Compare March 28, 2026 11:32
Comment thread src/Storages/MergeTree/tests/gtest_part_loading_tree.cpp Outdated
@tuanpach tuanpach force-pushed the fix-part-loading-tree-rollback-detection branch from 83ccc10 to d3af6b4 Compare March 30, 2026 04:01
Comment thread src/Storages/MergeTree/MergeTreeData.cpp Outdated
Comment thread src/Storages/MergeTree/MergeTreeData.cpp Outdated
@tuanpach tuanpach force-pushed the fix-part-loading-tree-rollback-detection branch from d3af6b4 to e1b5cbb Compare March 30, 2026 05:53
Comment thread src/Storages/MergeTree/MergeTreeData.cpp Outdated
Comment thread tests/queries/0_stateless/04063_part_loading_tree_rollback_reparent.sh Outdated
@tuanpach tuanpach force-pushed the fix-part-loading-tree-rollback-detection branch 2 times, most recently from 7e69815 to 664d85c Compare April 1, 2026 04:07
Comment thread src/Storages/MergeTree/MergeTreeData.h Outdated
@tuanpach tuanpach force-pushed the fix-part-loading-tree-rollback-detection branch from 1e9b03c to 2bdfe94 Compare April 2, 2026 00:43
@tuanpach tuanpach force-pushed the fix-part-loading-tree-rollback-detection branch from 2bdfe94 to 20bdf6b Compare April 2, 2026 00:55
Comment thread tests/queries/0_stateless/04063_part_loading_tree_rollback_reparent.sh Outdated
tuanpach added a commit to tuanpach/ClickHouse that referenced this pull request Apr 2, 2026
`PartLoadingTree::add` builds a containment tree of MergeTree parts during
`loadDataParts`. Parts are sorted by `(level, mutation)` descending so that
covering parts are inserted before the parts they contain. When a rolled-back
part in the tree intersects an incoming committed part two bugs were present:

**Bug 1 — wrong skip direction.**
The old code (`has_transaction_metadata`) checked whether either part has a
`txn_version.txt` file and unconditionally skipped the *incoming* part. When
the already-resident part was the rolled-back one, the committed incoming part
was silently discarded instead of replacing it.

**Bug 2 — orphans dropped after eviction.**
When the rolled-back resident is evicted, its children are orphaned. They
must be re-inserted via `add()` — together with the incoming committed part —
sorted by `(level, mutation)` descending so that the incoming part (typically
higher level) is placed into the tree *before* any orphan that belongs inside
it. Without this ordering an orphan at sibling level causes a spurious
`LOGICAL_ERROR` when the incoming part is re-inserted.

The crash observed in the stress test CI was caused by a third issue: the
old helper (`has_transaction_metadata`) returned `bool` — a plain
file-existence check. It could not distinguish between a definitively
rolled-back part, an in-flight transaction (`Tx::UnknownCSN`), and a read
error (e.g. an empty file read between shell `truncate` and `write`). When
neither part was definitively rolled-back, the code fell through to
`LOGICAL_ERROR`, which aborts the server in sanitizer builds.

CI: ClickHouse#100992

Replace the `bool`-returning helper with a 5-state `enum class RollbackStatus`
and rename the lambda to `read_txn_status`:

  `RolledBack`  — `creation_csn == Tx::RolledBackCSN` on disk
  `Committed`   — `creation_csn` is a known committed CSN on disk
  `NoMetadata`  — `txn_version.txt` absent (prehistoric/non-transactional part)
  `UnknownCSN`  — file present, `creation_csn == Tx::UnknownCSN` (in-flight)
  `Unreadable`  — file present but could not be parsed (corruption/partial write)

Intersection decision table (applied to both prev/next branches):

  `Unreadable` on either side       → `CORRUPTED_DATA` (not a code bug)
  incoming `RolledBack`             → skip incoming
  existing `RolledBack`             → evict existing; collect orphans + incoming
                                       sorted by `(level, mutation)` desc;
                                       re-add all via `add()`; `return`
  either `UnknownCSN`               → skip incoming (`loadDataPart` resolves later)
  both `NoMetadata` or `Committed`  → `LOGICAL_ERROR` (genuine bug / manual edit)

The `traverse` template body is moved inline into the header and `PartLoadingTree`
is made `public` so external tests can exercise it directly.

Add `tests/queries/0_stateless/04063_part_loading_tree_rollback_reparent.sh`.
The test manually constructs four part directories:

  `all_1_2_2_1` (1-2, L2, M1) — rolled-back parent; `txn_version.txt` holds
                                  `Tx::RolledBackCSN`; `creation_tid` uses
                                  `start_csn=2` (not `Tx::PrehistoricCSN`)
                                  so `wasInvolvedInTransaction` returns true
  `all_1_1_1_1` (1-1, L1, M1) — committed child; contained in 1-2 → inserted
                                  as a child of `all_1_2_2_1` during build
  `all_2_3_1_0` (2-3, L1, M0) — committed part; intersects `all_1_2_2_1`,
                                  triggering the eviction and re-parenting path

After `ATTACH TABLE`: `all_1_1_1_1` must be active (orphan successfully
re-parented) and `all_1_2_2_1` must not be active (rolled-back part discarded).

`txn_version.txt` is written atomically (`printf` to `.tmp` then `mv`) to
eliminate a stress-test race where the server reads the file in the window
between shell truncation and the actual write, producing an empty file and
a parse error (`Unreadable` state → `CORRUPTED_DATA` instead of server abort).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@tuanpach tuanpach force-pushed the fix-part-loading-tree-rollback-detection branch from 20bdf6b to f2a0273 Compare April 2, 2026 01:02
@alexey-milovidov

Copy link
Copy Markdown
Member

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

Comment thread src/Storages/MergeTree/MergeTreeData.cpp
tuanpach and others added 2 commits May 17, 2026 04:59
`PartLoadingTree::add` builds a containment tree of MergeTree parts during
`loadDataParts`. Parts are sorted by `(level, mutation)` descending so that
covering parts are inserted before the parts they contain. When a rolled-back
part in the tree intersects an incoming committed part two bugs were present:

**Bug 1 — wrong skip direction.**
The old code (`has_transaction_metadata`) checked whether either part has a
`txn_version.txt` file and unconditionally skipped the *incoming* part. When
the already-resident part was the rolled-back one, the committed incoming part
was silently discarded instead of replacing it.

**Bug 2 — orphans dropped after eviction.**
When the rolled-back resident is evicted, its children are orphaned. They
must be re-inserted via `add()` — together with the incoming committed part —
sorted by `(level, mutation)` descending so that the incoming part (typically
higher level) is placed into the tree *before* any orphan that belongs inside
it. Without this ordering an orphan at sibling level causes a spurious
`LOGICAL_ERROR` when the incoming part is re-inserted.

The crash observed in the stress test CI was caused by a third issue: the
old helper (`has_transaction_metadata`) returned `bool` — a plain
file-existence check. It could not distinguish between a definitively
rolled-back part, an in-flight transaction (`Tx::UnknownCSN`), and a read
error (e.g. an empty file read between shell `truncate` and `write`). When
neither part was definitively rolled-back, the code fell through to
`LOGICAL_ERROR`, which aborts the server in sanitizer builds.

CI: ClickHouse#100992

Replace the `bool`-returning helper with a 5-state `enum class RollbackStatus`
and rename the lambda to `read_txn_status`:

  `RolledBack`  — `creation_csn == Tx::RolledBackCSN` on disk
  `Committed`   — `creation_csn` is a known committed CSN on disk
  `NoMetadata`  — `txn_version.txt` absent (prehistoric/non-transactional part)
  `UnknownCSN`  — file present, `creation_csn == Tx::UnknownCSN` (in-flight)
  `Unreadable`  — file present but could not be parsed (corruption/partial write)

Intersection decision table (applied to both prev/next branches):

  `Unreadable` on either side       → `CORRUPTED_DATA` (not a code bug)
  incoming `RolledBack`             → skip incoming
  existing `RolledBack`             → evict existing; collect orphans + incoming
                                       sorted by `(level, mutation)` desc;
                                       re-add all via `add()`; `return`
  either `UnknownCSN`               → skip incoming (`loadDataPart` resolves later)
  both `NoMetadata` or `Committed`  → `LOGICAL_ERROR` (genuine bug / manual edit)

The `traverse` template body is moved inline into the header and `PartLoadingTree`
is made `public` so external tests can exercise it directly.

Add `tests/queries/0_stateless/04063_part_loading_tree_rollback_reparent.sh`.
The test manually constructs four part directories:

  `all_1_2_2_1` (1-2, L2, M1) — rolled-back parent; `txn_version.txt` holds
                                  `Tx::RolledBackCSN`; `creation_tid` uses
                                  `start_csn=2` (not `Tx::PrehistoricCSN`)
                                  so `wasInvolvedInTransaction` returns true
  `all_1_1_1_1` (1-1, L1, M1) — committed child; contained in 1-2 → inserted
                                  as a child of `all_1_2_2_1` during build
  `all_2_3_1_0` (2-3, L1, M0) — committed part; intersects `all_1_2_2_1`,
                                  triggering the eviction and re-parenting path

After `ATTACH TABLE`: `all_1_1_1_1` must be active (orphan successfully
re-parented) and `all_1_2_2_1` must not be active (rolled-back part discarded).

`txn_version.txt` is written atomically (`printf` to `.tmp` then `mv`) to
eliminate a stress-test race where the server reads the file in the window
between shell truncation and the actual write, producing an empty file and
a parse error (`Unreadable` state → `CORRUPTED_DATA` instead of server abort).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nfo` format

The test writes `txn_version.txt` directly to mark `all_1_2_2_1` as rolled back.
Master switched to `VersionInfo::readFromMultiLineBuffer`, which has a silent
fallback for the old format that overrides `creation_csn` with
`Tx::NonTransactionalCSN`. As a result the test's `creation_csn: 18446744073709551615`
(`Tx::RolledBackCSN`) was ignored, `read_txn_status` returned `Committed` for
both intersecting parts, and `ATTACH TABLE` threw `LOGICAL_ERROR` again.

Write the full new format including `storing_version`, `removal_tid` and
`removal_csn` so the rolled-back state is detected.

Also make the test stricter:
- `ATTACH TABLE` exit code is now checked (previously stderr was suppressed
  and a thrown exception was silently ignored).
- The presence-of-active checks for `all_1_1_1_1`, `all_1_2_2_1` and
  `all_2_3_1_0` are now explicit `count` comparisons that exit with a clear
  `FAIL:` message when they don't match.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/PRs/100992/c7f836579347baebec2d4a153b9c4a540d236ac0/fast_test/
@tuanpach tuanpach force-pushed the fix-part-loading-tree-rollback-detection branch from c7f8365 to 4ff63d0 Compare May 17, 2026 10:03
Comment thread src/Storages/MergeTree/MergeTreeData.cpp
…loss

Addresses ClickHouse#100992 (comment)

Previously, when an intersecting tree-resident part had `creation_csn = Tx::UnknownCSN`
on disk, the incoming committed peer was dropped from the loading tree. The plan
was that `loadDataPart` would resolve the CSN afterwards. But `loadDataPart` only
runs on the top-level nodes of the loading tree, and the dropped peer was never
re-introduced — so if the in-flight transaction turned out to be rolled back, the
committed peer was permanently absent for that startup (data only became visible
after a second restart, once `Tx::RolledBackCSN` was persisted to disk).

`read_txn_status` now consults `TransactionLog` when the on-disk CSN is unresolved,
mirroring the logic of `VersionMetadata::tryGetCSN`: if the tid has no CSN and no
running transaction, it's treated as rolled back. The `UnknownCSN` `RollbackStatus`
arm now only fires for transactions that are genuinely still running (an unusual
case during `loadDataParts`), and the rolled-back arm correctly evicts the dead
part and keeps the committed peer.
tuanpach added 2 commits May 20, 2026 12:10
…rtLoadingTree::add`

Addresses ClickHouse#100992 (comment)

A committed part can be silently buried under a rolled-back ancestor when the
ancestor contains it (rather than just intersects). The previous code only
checked rollback status in the `isDisjoint == false` (intersection) branch, so
when an incoming committed part descended through the `contains` branch into a
rolled-back ancestor it became a "covered" child and was loaded as `Outdated`
instead of active — invisible to queries until the next restart resolved the
ancestor's `creation_csn` to `Tx::RolledBackCSN` on disk.

Add the same eviction check in both `contains` branches (`prev` and `next`).
On match, the helper `evict_rolled_back_and_reinsert` collects the rolled-back
node's orphaned descendants, sorts them with the incoming part by
`(level, mutation)` descending, and re-inserts the lot — so any new
top-level ancestor is added before its descendants and re-parenting happens
cleanly.

Also extract the inline eviction code from the two intersection branches into
the same helper, removing the duplicated logic.

New regression test `04241_part_loading_tree_rollback_contains` exercises the
`R-contains-{C, N}` topology that triggered a `LOGICAL_ERROR` (or silent data
loss, depending on the levels) before this fix.
…NFO`

`02497_source_part_is_intact_when_mutation` failed in Fast test
(https://s3.amazonaws.com/clickhouse-test-reports/PRs/100992/7587b5c5101032dfd1b873e780b42e1230af45d1/fast_test/)
because the test's `stderr` check tripped on `LOG_WARNING` messages emitted
when `DETACH` + `ATTACH` reloads the rolled-back mutation parts written by the
test. With the eviction now firing in the `contains` branches (added in the
previous commit), this happens for every rolled-back mutation chain — which is
routine cleanup, not a warning condition.

Also rewrite the eviction message to drop the misleading "committed part Y
will be used instead" clause, since the incoming part can itself be rolled
back (and immediately re-evicted by the recursive `add` call).
Comment thread src/Storages/MergeTree/MergeTreeData.cpp
tuanpach added 2 commits May 26, 2026 07:28
Addresses https://github.com/ClickHouse/ClickHouse/actions/runs/26031517236
(`test_transactions/test.py::test_rollback_unfinished_on_restart{1,2}`
fail with `len(res) == 5` because the rolled-back mutation parts are
not in `system.parts`) and
ClickHouse#100992 (comment)
(bot flags an `UnknownCSN` race in the same branch).

The prior commit (`7587b5c5101`) added a rollback check in the `contains`
branches and evicted the ancestor when found rolled-back. But the rest of
the load path already handles this — `MergeTreeData::loadDataPartsFromDisk`
detects that a top-level part was demoted to `Outdated` (e.g. by
`loadDataPart`'s `creation_csn == Tx::RolledBackCSN` check) and pushes its
children onto `parts_to_load` to be loaded as `Active`. So the eviction in
`PartLoadingTree::add` was redundant, and worse it removed the rolled-back
parts from `system.parts` entirely, which the integration test relies on.

This commit keeps:
- the intersection-branch eviction (the original PR's fix — there the
  promotion mechanism does not apply because the committed peer is a
  sibling, not a child, of the rolled-back tree-resident);
- the eager `UnknownCSN` resolution in `read_txn_status`;
- the helper `evict_rolled_back_and_reinsert` (still used by both
  intersection branches);
- `04241_part_loading_tree_rollback_contains` — the test now passes via
  the existing promotion mechanism rather than tree-level eviction, but
  the assertion remains the same (committed children are active).
…nfo` format

The rolled-back ancestor's `creation_tid` was `(2, 1, ...)`, but `local_tid=1`
is `Tx::NonTransactionalLocalTID`, so `isNonTransactional()` returned true
and `wasInvolvedInTransaction()` returned false. As a result `loadDataPart`
skipped the `creation_csn == Tx::RolledBackCSN` check at
`src/Storages/MergeTree/MergeTreeData.cpp` line 2227, the ancestor stayed
`Active`, and the line 2374-2381 promotion never fired — so the committed
descendants remained `Outdated` and the test's `check_active_count` assertions
failed in CI Fast test:
https://s3.amazonaws.com/clickhouse-test-reports/PRs/100992/a59409363103147b8348d73ae8d7a919ef6e67e8/fast_test/

Use `local_tid=33` (above `Tx::MaxReservedLocalTID=32`) so the TID looks
transactional. Also add a comment explaining the constraint so the next
person who copies this idiom doesn't repeat the mistake.

`04063` happens to pass with the same bad TID because there the
intersection-branch eviction uses `read_txn_status` (reads `creation_csn`
directly from disk) and never goes through `wasInvolvedInTransaction`.
Comment thread tests/queries/0_stateless/04063_part_loading_tree_rollback_reparent.sh Outdated
@clickhouse-gh

clickhouse-gh Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

📊 Cloud Performance Report

⚠️ AI verdict: not_sure3 query(s) regressed out of 39 analysed

The change here only rewrites how the server decides which on-disk parts to load at startup/ATTACH when rolled-back transaction metadata is present — code that runs once during part loading and never during query execution. The 12 flagged ClickBench regressions are all read queries that don't touch that path, and they shift in the same direction by ~11-49% across the board, which (with only 5 source runs vs 30 base) reads as a slower/noisier measurement environment rather than a real effect of this change. Nine off-path queries are downgraded to not-sure. Q16 (+27%), Q31 (+25%), and Q32 (+49%) have deltas too large to dismiss automatically and are kept for a human to confirm against the run environment, but no part of this diff can plausibly explain them.

clickbench

🔴 3 regressed · ⚠️ 9 inconclusive

Flagged queries (12 of 43)
Query Verdict Baseline med (ms) PR med (ms) Change q-value Hint
🔴 16 regression 820 1039 +26.7% <0.0001 +27% large delta forces keep, but PR only touches startup part-loading; verify environment before acting.
🔴 31 regression 320 401 +25.3% <0.0001 +25% delta forces keep, but PR only touches startup part-loading; likely environment — verify before acting.
🔴 32 regression 1403 2094 +49.3% <0.0001 +49% large delta forces keep, yet PR only touches startup part-loading; investigate run environment.
⚠️ 12 not_sure 331 382 +15.3% <0.0001 PR only changes startup part-loading rollback logic; SELECT execution path untouched. +15% looks like environment varian
⚠️ 13 not_sure 510 620 +21.6% <0.0001 Part-loading-tree change can't touch this read path; +22% consistent with a slower/noisier run, not the PR.
⚠️ 14 not_sure 394 462 +17.3% <0.0001 Startup ATTACH-time code only; not on the query hot path. Off-PR-path delta.
⚠️ 15 not_sure 247 298 +20.4% <0.0001 PR touches rolled-back part loading at startup, not SELECT execution. Unrelated to this query.
⚠️ 17 not_sure 575 704 +22.4% <0.0001 Off-PR-path (startup part-loading) and query is intrinsically noisy; delta not attributable to this PR.
⚠️ 18 not_sure 1439 1756 +22.1% <0.0001 Noisy query plus off-path PR (startup-only change); +22% is run-to-run variance.
⚠️ 30 not_sure 245 272 +11.0% <0.0001 PR changes part-loading at startup, not query execution; +11% is environment noise.
⚠️ 33 not_sure 1531 1817 +18.7% <0.0001 Startup part-loading change can't affect this SELECT; +19% is off-path variance.
⚠️ 34 not_sure 1527 1834 +20.1% <0.0001 Off-PR-path (startup-only); +20% consistent with a slower measurement run, not the PR.

q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on.

tpch_adapted_1_official

🟢 No significant changes

Debug info
  • StressHouse run: e254fe0e-fc6c-4f41-a561-fddf6be739f1
  • MIRAI run: 310c7d9e-d978-40b9-9a92-ac0f246aeae1
  • PR check IDs:
    • clickbench_298601_1780484653
    • clickbench_298607_1780484653
    • clickbench_298626_1780484653
    • tpch_adapted_1_official_298640_1780484653
    • tpch_adapted_1_official_298643_1780484653

alexey-milovidov and others added 6 commits June 29, 2026 07:54
…gTree::add

When an incoming committed (or non-transactional) part intersects a
tree-resident part whose creating transaction is still in flight
(`UnknownCSN`), `PartLoadingTree::add` previously skipped the incoming
part. If that in-flight transaction later rolled back, the resident part
became `Outdated` while the committed peer had already been discarded for
this load pass, losing committed data.

Two intersecting parts can never both commit, so an in-flight part that
intersects an already-committed peer is bound to roll back. Evict the
in-flight resident and keep the committed incoming part instead of
discarding it. The symmetric case (incoming in flight, resident committed)
and the both-in-flight case still skip the incoming part, which is safe.

The eviction helper is renamed `evict_and_reinsert` and takes a reason
string so the log message stays accurate for both eviction causes.

This case is not reachable during normal part loading (no transaction can
be creating parts in a table that is still loading its parts, so the eager
`TransactionLog` resolution turns every on-disk `UnknownCSN` into
`RolledBack`/`Committed`), but the branch now fails safe toward keeping
committed data.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The rolled-back fixture wrote `creation_tid: (2, 1, ...)`, but `local_tid = 1`
is `Tx::NonTransactionalLocalTID`. With `start_csn = 2` this violates the
`TransactionID` invariant `(local_tid == NonTransactionalLocalTID) ==
(start_csn == NonTransactionalCSN)`, which trips a `chassert` in
`TransactionID::isNonTransactional` in debug/sanitizer builds, and also makes
`wasInvolvedInTransaction` treat the part as non-transactional. The test
only passed because the part is evicted before that path runs.

Use a transactional `local_tid` outside the reserved range (33), matching
test 04241, so the fixture is well-formed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When `PartLoadingTree::add` evicts a rolled-back part that intersects an
incoming committed part, it re-inserts the orphaned descendants and the
incoming part sorted by (level, mutation) descending. The new test covers
the case the sort exists for: one re-inserted committed part contains
another. Because `MergeTreePartInfo::contains` requires a strictly higher
level for a wider block range, the container is always re-added before the
part it contains, so the contained part descends into it instead of hitting
the generic intersection branch (which does not handle
`incoming.contains(existing)`) and throwing a `LOGICAL_ERROR`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
return RollbackStatus::NoMetadata;

auto version_path = fs::path(relative_data_path) / part_name / VersionMetadata::TXN_VERSION_METADATA_FILE_NAME;
if (!part_disk->existsFile(version_path))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VersionMetadataOnDisk::loadMetadata already treats a lone txn_version.txt.tmp as a rolled-back part (the interrupted-rename / lost-fsync cases), but this fast path only checks for the final txn_version.txt and returns NoMetadata otherwise.

That means a tmp-only transactional leftover is misclassified as a non-transactional part during tree construction. If it intersects a committed peer on startup or ATTACH, we still fall through to the generic LOGICAL_ERROR exception path instead of evicting the rolled-back leftover, so the interrupted-write state handled later by VersionMetadataOnDisk can still fail loading here.

Could we mirror the .tmp handling from VersionMetadataOnDisk::loadMetadata (or share a helper) so tmp-only metadata is treated as rolled back?

@clickhouse-gh

clickhouse-gh Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.50% 85.50% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.70% 77.70% +0.00%

Changed lines: Changed C/C++ lines covered: 56/123 (45.53%) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants