Enable transactions on encrypted disks in CI by tuanpach · Pull Request #104780 · ClickHouse/ClickHouse · GitHub
Skip to content

Enable transactions on encrypted disks in CI#104780

Merged
tuanpach merged 5 commits into
ClickHouse:masterfrom
tuanpach:enable-transactions-on-encrypted-disks
May 25, 2026
Merged

Enable transactions on encrypted disks in CI#104780
tuanpach merged 5 commits into
ClickHouse:masterfrom
tuanpach:enable-transactions-on-encrypted-disks

Conversation

@tuanpach

@tuanpach tuanpach commented May 13, 2026

Copy link
Copy Markdown
Member

After #92141 refactored version metadata to use an atomic tmp+rename write pattern instead of WriteMode::Append, the race that previously corrupted txn_version.txt on encrypted disks is structurally impossible: VersionMetadataOnDisk::storeInfoToDataPartStorage always writes the full file via createFile + writeFile + replaceFile, serialized through persisted_info_mutex, and never appends.

The original failure that motivated disabling transactions on encrypted disks (see #59898 discussion) was the interleaving of appendCSNToVersionMetadata and appendRemovalTIDToVersionMetadata on cached_s3_encrypted, which produced a txn_version.txt whose creation_csn: line had been overwritten with garbage and triggered an assertHasValidVersionMetadata fatal. Both append call sites were deleted by #92141.

This PR:

  • Removes the USE_ENCRYPTED_STORAGE guards around installing transactions.xml in tests/config/install.sh (single-server and USE_DATABASE_REPLICATED two-server cases).
  • Removes the now-obsolete no-encrypted-storage tag from 30 transaction-related stateless tests, and the matching explanatory comment blocks in 03812_truncate_optimize_in_transaction.sql and 03920_drop_db_hang_with_trx.sh.

04098_asterisk_include_virtual_columns_mergetree.sql keeps its no-encrypted-storage tag — that tag is unrelated to transactions.

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Enable transactions on encrypted disks in CI now that #92141 eliminated the append-based version-metadata writes that previously corrupted txn_version.txt on encrypted storage.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Version info

  • Merged into: 26.6.1.127

After ClickHouse#92141 refactored version metadata to use atomic tmp+rename writes
instead of `WriteMode::Append`, the race that previously corrupted
`txn_version.txt` on encrypted disks (see ClickHouse#59898 discussion) is no longer
possible: concurrent `creation_csn` and `removal_tid` updates can no longer
interleave inside the encrypted append path.

Install `transactions.xml` in encrypted-storage CI jobs and drop the
`no-encrypted-storage` tag from transaction-related stateless tests.
@clickhouse-gh

clickhouse-gh Bot commented May 13, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-ci label May 13, 2026
tuanpach added 3 commits May 13, 2026 10:41
`VersionMetadata::isVisible` is called per part per read; in heavy CI
configs it pushed `02435_rollback_cancelled_queries` and
`01168_mutations_isolation_2` over the 180s flaky-check timeout on the
encrypted-S3 + ASAN+UBSAN variant. Demote the two CSN lookup logs to
LOG_TEST so they no longer fire under standard debug logging.
The test inserts 4M+ rows in total across initial setup, parallel
insert/select/cancel phase, and final verification. On the
encrypted-S3 + ASan+UBSan + meta-in-keeper flaky-check variant it
consistently runs 250-300s, exceeding the 180s default budget. The
`long` tag raises the budget to 600s, which fits even the slowest
observed runs with substantial headroom.
On the encrypted-S3 + ASan+UBSan + meta-in-keeper flaky-check variant
the per-worker 400s window only fits ~10 iterations of
`select_insert_action` instead of the required 15, causing the test's
internal "not enough iterations" assertion to fire. The correctness
checks inside each iteration are unchanged; only the throughput floor
is lowered from 15 to 5 so the test still validates isolation on slow
storage variants without flagging environmental slowdown as a bug.
@antonio2368 antonio2368 self-assigned this May 14, 2026
@clickhouse-gh

clickhouse-gh Bot commented May 15, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.10% +0.00%
Functions 92.00% 92.00% +0.00%
Branches 76.60% 76.60% +0.00%

Changed lines: 100.00% (9/9) · Uncovered code

Full report · Diff report

@tuanpach tuanpach added this pull request to the merge queue May 25, 2026
Merged via the queue into ClickHouse:master with commit df88b12 May 25, 2026
167 checks passed
@tuanpach tuanpach deleted the enable-transactions-on-encrypted-disks branch May 25, 2026 13:25
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 25, 2026
DavidHe-2008 pushed a commit to DavidHe-2008/ClickHouse that referenced this pull request Jun 1, 2026
…s-on-encrypted-disks

Enable transactions on encrypted disks in CI
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants