Fix flaky test 03800_use_const_adaptive_granularity_vertical_merge by nerve-bot · Pull Request #100641 · ClickHouse/ClickHouse · GitHub

Fix flaky test 03800_use_const_adaptive_granularity_vertical_merge #100641

Merged
pufit merged 2 commits into ClickHouse:master from nerve-bot:fix/flaky-test-03800-optimize-final
Mar 26, 2026

@nerve-bot nerve-bot commented Mar 24, 2026

Contributor

Use OPTIMIZE TABLE ... FINAL instead of bare OPTIMIZE TABLE.

Without FINAL, the optimize() code path calls select_without_hint(), which checks max_source_parts_bytes_for_merge from CompactionStatistics::getMaxSourcePartsBytesForMerge(). When the background merge pool is saturated by parallel tests (the CI runs dozens of stateless tests concurrently), this value drops to zero and OPTIMIZE TABLE silently returns without merging anything — the parts stay as-is.
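
When debugging locally, one way to surface this silent no-op is the optimize_throw_if_noop setting, which makes OPTIMIZE raise an exception if no merge was performed. A minimal sketch, assuming a hypothetical table name (t_example is not taken from the test):

```sql
-- t_example is a hypothetical table name used only for illustration.
SET optimize_throw_if_noop = 1;

-- With the setting enabled, a bare OPTIMIZE that selects nothing to merge
-- is expected to fail with an exception instead of returning silently.
OPTIMIZE TABLE t_example;
```

This only helps with diagnosis; it does not make the merge itself happen under load.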

With FINAL, the code path goes through select_in_partition() → selectAllPartsToMergeWithinPartition(), which bypasses the merge-selector resource check entirely and also retries with a wait loop if parts are currently being merged by background threads.

The test's purpose is verifying use_const_adaptive_granularity with vertical merge — the OPTIMIZE is just a means to trigger the merge, not the subject under test. The nested variant of this test (03800_..._nested.sql.j2) already uses OPTIMIZE TABLE ... FINAL.

Reproduction: adding max_bytes_to_merge_at_max_space_in_pool = 0 to the table's SETTINGS (simulating a saturated merge pool) makes the original test fail deterministically with the exact diff seen in CI; the fixed version passes under the same conditions.
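
The reproduction can be sketched roughly as follows. The table name, schema, and row counts are hypothetical (the real test's DDL is not shown here); only the max_bytes_to_merge_at_max_space_in_pool = 0 setting and the two OPTIMIZE variants come from the description above.

```sql
-- Hypothetical table; the actual test uses its own schema and settings.
DROP TABLE IF EXISTS t_merge_repro;

CREATE TABLE t_merge_repro (id UInt64, value String)
ENGINE = MergeTree
ORDER BY id
-- Simulates a saturated merge pool, as described in the reproduction note.
SETTINGS max_bytes_to_merge_at_max_space_in_pool = 0;

-- Two separate INSERTs produce two data parts.
INSERT INTO t_merge_repro SELECT number, toString(number) FROM numbers(1000);
INSERT INTO t_merge_repro SELECT number, toString(number) FROM numbers(1000, 1000);

-- Bare OPTIMIZE: per the description, this can return without merging.
OPTIMIZE TABLE t_merge_repro;
SELECT count() AS active_parts
FROM system.parts
WHERE database = currentDatabase() AND table = 't_merge_repro' AND active;

-- OPTIMIZE ... FINAL: per the description, this merges the parts
-- even under the same setting.
OPTIMIZE TABLE t_merge_repro FINAL;
SELECT count() AS active_parts
FROM system.parts
WHERE database = currentDatabase() AND table = 't_merge_repro' AND active;

DROP TABLE t_merge_repro;
```

Comparing the two active_parts values shows whether each OPTIMIZE variant actually merged anything.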

13 failures across 4 unrelated PRs + master over the last 30 days.

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Version info

  • Merged into: 26.4.1.275

Use OPTIMIZE TABLE ... FINAL instead of bare OPTIMIZE TABLE.

Without FINAL, OPTIMIZE TABLE uses the regular merge selector which
can silently be a no-op when the merge pool is saturated (returns
max_source_parts_bytes_for_merge == 0) in parallel test environments.
With FINAL, the code bypasses the merge selector via
selectAllPartsToMergeWithinPartition and retries if parts are
currently being merged, making it reliable under load.

The test's purpose is verifying use_const_adaptive_granularity with
vertical merge — the OPTIMIZE is just a means to trigger the merge,
not the subject under test. The nested variant of this test already
uses OPTIMIZE TABLE ... FINAL.

13 failures across 4 unrelated PRs + master in the last 30 days.

CLAassistant commented Mar 24, 2026


@pufit pufit added the can be tested Allows running workflows for external contributors label Mar 24, 2026

clickhouse-gh Bot commented Mar 24, 2026

Contributor

Workflow [PR], commit [904299f]

Summary:


AI Review

Summary

This PR updates 03800_use_const_adaptive_granularity_vertical_merge to use OPTIMIZE TABLE ... FINAL in both .sql and .reference, making the test deterministic under saturated background merge conditions. The change is narrowly scoped to test reliability, consistent with the PR motivation, and I did not find correctness, safety, performance, or compliance problems in the modified lines.

ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ✅ Approve

@pufit pufit added this pull request to the merge queue Mar 26, 2026
Merged via the queue into ClickHouse:master with commit 5a39631 Mar 26, 2026
153 checks passed
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 26, 2026

Labels

  • can be tested: Allows running workflows for external contributors
  • pr-ci
  • pr-synced-to-cloud: The PR is synced to the cloud repo

5 participants