Stop refreshable MV gracefully on Keeper without MULTI_READ on ATTACH/restore by groeneai · Pull Request #108441 · ClickHouse/ClickHouse · GitHub
Skip to content

Stop refreshable MV gracefully on Keeper without MULTI_READ on ATTACH/restore#108441

Merged
alexey-milovidov merged 3 commits into
ClickHouse:masterfrom
groeneai:groeneai/fix-refreshtask-multiread-attach-106737
Jun 29, 2026
Merged

Stop refreshable MV gracefully on Keeper without MULTI_READ on ATTACH/restore#108441
alexey-milovidov merged 3 commits into
ClickHouse:masterfrom
groeneai:groeneai/fix-refreshtask-multiread-attach-106737

Conversation

@groeneai

@groeneai groeneai commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixed a server crash (LOGICAL_ERROR: 'Unexpected exception in refresh scheduling') that could happen on startup when a coordinated refreshable materialized view in a Replicated database was attached on a Keeper that does not support the MULTI_READ feature flag (for example after a Keeper downgrade). The view is now stopped gracefully instead of aborting the server.

Description

Closes: #106737 (STID 2508-3e7b)

A coordinated refreshable materialized view used to abort the server when re-attached on a Keeper without MULTI_READ. The constructor's up-front MULTI_READ / CREATE_IF_NOT_EXISTS check was gated behind the fresh-CREATE path (!attach && !is_restore_from_backup) inside the !replica_path_existed branch, so on ATTACH / restore / existing-replica the view stayed coordinated = true. The scheduling thread then threw NOT_IMPLEMENTED in readZnodesIfNeeded (RefreshTask.cpp:1378), and the doScheduling catch-all re-raised it as a LOGICAL_ERROR (RefreshTask.cpp:936), which aborts the server in debug/sanitizer builds and crash-loops on restart.

Inner exception and stack from the reported run (amd_msan):

<Error> RefreshTask(...): Unexpected exception in refresh scheduling. The view will be stopped.: Code: 48. DB::Exception: Keeper server doesn't support multi-reads. Refreshable materialized views won't work. (NOT_IMPLEMENTED)
  5. src/Storages/MaterializedView/RefreshTask.cpp:1378  DB::RefreshTask::readZnodesIfNeeded(...)
  6. src/Storages/MaterializedView/RefreshTask.cpp:698   DB::RefreshTask::doScheduling(bool)
<Fatal> : Logical error: 'Unexpected exception in refresh scheduling'.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=103786&sha=04c29e862d95e40e250aff7eb504de5d1af13d33&name_0=PR&name_1=Stress%20test%20%28amd_msan%29

Fix

The feature-flag check is moved before the znode-create block, so it runs on every coordinated path:

  • Fresh CREATE still rejects with NOT_IMPLEMENTED (unchanged behavior).
  • ATTACH / restore / existing-replica now disable coordination, request stop, and record the reason. readZnodesIfNeeded returns early (!coordinated), so the scheduling thread never reaches the throw, and the view ends up Disabled instead of crashing the server.

This is the approach prescribed by the maintainer in the issue (detect the missing MULTI_READ up front; fresh CREATE keeps rejecting, ATTACH/restore must not throw).

Test

Adds an integration test test_refreshable_mv_no_multi_read: it creates a coordinated refreshable MV in a Replicated database with MULTI_READ enabled, then restarts the server on a Keeper with MULTI_READ disabled. Without the fix the server crash-loops with the LOGICAL_ERROR above; with the fix the server stays up and the view reports Disabled with the reason in system.view_refreshes.exception.

Version info

  • Merged into: 26.7.1.256

@groeneai

Copy link
Copy Markdown
Contributor Author

@groeneai

Copy link
Copy Markdown
Contributor Author

cc @al13n321 @tuanpach could you review? Moves the up-front Keeper feature-flag check in the RefreshTask constructor to cover the ATTACH/restore/existing-replica paths, so a coordinated refreshable MV on a Keeper without MULTI_READ stops gracefully (Disabled) instead of throwing on the scheduling thread and aborting the server (#106737).

@clickhouse-gh

clickhouse-gh Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [5aa9c52]

Summary:


AI Review

Summary

This PR moves the Keeper MULTI_READ / CREATE_IF_NOT_EXISTS validation for coordinated refreshable materialized views so ATTACH, restore, and existing-replica startup paths record a non-resumable unavailable state and stay Disabled instead of surfacing the scheduler LOGICAL_ERROR. The current head keeps coordination.coordinated true, blocks local resume paths, and adds focused integration coverage for the downgrade/restart path and SYSTEM START VIEW; the two prior bot threads are addressed in the current code. I did not find unresolved code-level findings.

PR Metadata

💡 The Bug Fix category is appropriate and the changelog entry is specific. Optional wording adjustment: ClickHouse text normally describes LOGICAL_ERROR cases as exceptions rather than server-level failures. Suggested changelog entry:

Fixed a LOGICAL_ERROR: 'Unexpected exception in refresh scheduling' exception that could happen on startup when a coordinated refreshable materialized view in a Replicated database was attached on a Keeper that does not support the MULTI_READ feature flag, for example after a Keeper downgrade. The view is now stopped gracefully instead of surfacing the scheduler exception.

Final Verdict

Status: ✅ Approve

No required code changes.

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Jun 25, 2026
@murphy-4o murphy-4o added the can be tested Allows running workflows for external contributors label Jun 25, 2026
@murphy-4o murphy-4o assigned murphy-4o and unassigned murphy-4o Jun 25, 2026
Comment thread src/Storages/MaterializedView/RefreshTask.cpp Outdated
@murphy-4o murphy-4o self-assigned this Jun 25, 2026
…/restore

A coordinated refreshable materialized view (in a Replicated database) crashed
the server when re-attached on a Keeper that lacks the MULTI_READ feature flag,
e.g. after a Keeper downgrade. The constructor's up-front MULTI_READ /
CREATE_IF_NOT_EXISTS check was gated behind the fresh-CREATE path
(!attach && !is_restore_from_backup) inside the !replica_path_existed branch, so
ATTACH / restore / existing-replica kept coordination enabled. The scheduling
thread then threw NOT_IMPLEMENTED in readZnodesIfNeeded, which the doScheduling
catch-all re-raised as a LOGICAL_ERROR, aborting the server (and crash-looping on
restart).

Move the feature-flag check before the znode-create block so it runs on every
coordinated path. Fresh CREATE still rejects with NOT_IMPLEMENTED. On
ATTACH / restore / existing-replica the view now enters a permanent,
non-resumable "coordination unavailable" state: a new coordination.unavailable
flag is set, the task is stopped, and the reason is recorded. doScheduling
short-circuits to Disabled before touching Keeper, so readZnodesIfNeeded is never
reached and the server stays up.

coordination.coordinated is left true on purpose. Clearing it would turn an
originally coordinated view into an uncoordinated one, and a later
SYSTEM START VIEW or finalizeRestoreFromBackup would then resume it as a local
refresh, which breaks the replicated target table of a non-APPEND view in a
Replicated database. start() and finalizeRestoreFromBackup() now also check the
unavailable flag and refuse to resume, so the view stays Disabled until the table
is re-created on a Keeper that supports the required feature flags.

Adds an integration test that creates a coordinated refreshable MV with MULTI_READ
enabled, restarts on a Keeper with MULTI_READ disabled and asserts the server
stays up with the view Disabled, and then asserts SYSTEM START VIEW does not
resume an uncoordinated local refresh.

Closes: ClickHouse#106737

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai groeneai force-pushed the groeneai/fix-refreshtask-multiread-attach-106737 branch from 98c1022 to 67a1d74 Compare June 25, 2026 07:42
@groeneai

Copy link
Copy Markdown
Contributor Author

Updated after the review feedback (the fix now uses a non-resumable coordination.unavailable state instead of clearing coordination.coordinated).

Pre-PR validation gate (click to expand)
# Question Answer
a Deterministic repro? Yes. Integration test tests/integration/test_refreshable_mv_no_multi_read/: coordinated RMV in a Replicated DB, restart on a Keeper with multi_read=0. Unfixed binary crash-loops with the exact #106737 signature (RefreshTask.cpp readZnodesIfNeeded NOT_IMPLEMENTED -> doScheduling -> catch-all LOGICAL_ERROR -> abort).
b Root cause explained? The ctor MULTI_READ/CREATE_IF_NOT_EXISTS check was gated to fresh CREATE only, so on ATTACH/restore/existing-replica a coordinated view kept coordinated=true on a Keeper without MULTI_READ. The scheduling thread then threw NOT_IMPLEMENTED in readZnodesIfNeeded, re-raised by the doScheduling catch-all as a LOGICAL_ERROR -> server abort. The review also found that the first fix (clearing coordinated) demoted the view to uncoordinated, so SYSTEM START VIEW/finalizeRestoreFromBackup could resume it as a local refresh and corrupt the replicated target.
c Fix matches root cause? Yes. The feature-flag check runs on every coordinated path. Fresh CREATE still throws. ATTACH/restore enters a permanent non-resumable coordination.unavailable state; doScheduling short-circuits to Disabled before any Keeper access, so the throw is never reached. coordinated stays true, so the view is never demoted to a local refresh; start()/finalizeRestoreFromBackup() refuse to resume.
d Test intent preserved / new tests added? New integration test. Asserts (1) server stays up + view Disabled + exception mentions multi-read + no Unexpected exception in refresh scheduling, and (2) SYSTEM START VIEW after the graceful stop keeps it Disabled (does not resume an uncoordinated local refresh).
e Both directions demonstrated? Yes. Fixed binary: test PASSES. Original master (Build ID 26ca63af): crash-loops with the #106737 signature (test FAILS). Previous coordinated=false approach: test FAILS on the new step (SYSTEM START VIEW resumes to Scheduled).
f Fix is general across code paths? Yes. The single unavailable flag is honored by all resume/scheduling entry points: doScheduling, start() (SYSTEM START VIEW), finalizeRestoreFromBackup() (restore auto-resume). drop() keeps working since coordinated stays true and it uses only single-key/multi-write Keeper ops (no MULTI_READ).
g Fix generalizes across inputs? The trigger is a Keeper capability (missing MULTI_READ or CREATE_IF_NOT_EXISTS), not a datatype. Both missing flags take the same path; both attach and restore-from-backup are covered; fresh CREATE keeps rejecting. No type-wrapper dimension applies.
h Backward compatible? Yes. No setting default, on-disk, wire, or metadata format change. Fresh CREATE behavior is unchanged (still rejects). The only behavior change is on the previously-crashing ATTACH/restore path, which now stays up Disabled.
i Invariants and contracts preserved? Yes. Key invariant: a coordinated (non-APPEND, Replicated-DB) RMV must never run an uncoordinated local refresh, since StorageMaterializedView requires coordination for the replicated target. Leaving coordinated=true + making the state non-resumable upholds it on all paths (scheduling, SYSTEM START VIEW, restore finalize). doScheduling holds mutex across the early Disabled return; no Keeper access on the unavailable path.

Session id: cron:clickhouse-worker-slot-2:20260625-071500

Comment thread tests/integration/test_refreshable_mv_no_multi_read/test.py
…ti-read test

SYSTEM REFRESH VIEW is asynchronous: RefreshTask::run sets the state to
Scheduling synchronously and the background scheduler only later reaches the
coordination.unavailable branch and switches it back to Disabled. Reading
system.view_refreshes right after the refresh request could therefore observe
the transient Scheduling state and fail the assertion spuriously.

Add SYSTEM WAIT VIEW after the refresh request (and after the downgrade restart,
which schedules the same graceful-stop pass asynchronously) so the status read
always waits for the scheduled pass. WAIT VIEW returns immediately once the view
is Disabled, so it does not block on the coordination-unavailable view.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

Copy link
Copy Markdown
Contributor Author

CI finish ledger — cdabef4

Every failure below has an owner. The single failure is unrelated to this PR's RefreshTask/Keeper diff.

Check / test Reason Owner
Stress test (arm_release) / S3_ERROR "No such key thrown" (CacheMetadata background download) chronic trunk stress (filesystem-cache background-download vs concurrent object removal; ~8 unrelated PRs + master over 30d) a fix task is created (investigating at full effort — fixing PR link to follow here)

This PR changes only RefreshTask.cpp/.h and a refreshable-MV integration test, which cannot touch the S3 object-storage / filesystem-cache path.

Session id: cron:our-pr-ci-monitor:20260625-130000

@groeneai

Copy link
Copy Markdown
Contributor Author

Fixing PR for the Stress test S3_ERROR No such key thrown (CacheMetadata background download) failure created: #108489

@alexey-milovidov alexey-milovidov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable.

However, the case when it happens is very narrow - likely it happens only in tests.

Still good to merge.

@alexey-milovidov alexey-milovidov self-assigned this Jun 28, 2026
@groeneai groeneai mentioned this pull request Jun 29, 2026
1 task
@alexey-milovidov

Copy link
Copy Markdown
Member

@groeneai, provide a fix for S3_ERROR No such key thrown (see clickhouse-server.log or no_such_key_errors.txt), and if it is already fixed - update the branch here.

@groeneai

Copy link
Copy Markdown
Contributor Author

The S3_ERROR No such key thrown in the Stress test is the chronic filesystem-cache background-download race: CacheMetadata prefetches an object that is concurrently removed during stress, logging a benign "Error during background download ... No such key". It is not caused by this PR, which only touches RefreshTask and Keeper (no S3 or filesystem-cache code).

It is already fixed by #108489 (merged 2026-06-28), which adds that benign line to the stress harness allowlist in tests/docker_scripts/stress_tests.lib. I have updated this branch onto current master (head now 5aa9c52) so the Stress test picks up the allowlist on re-run.

@clickhouse-gh

clickhouse-gh Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.40% 85.40% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.70% 77.60% -0.10%

Changed lines: Changed C/C++ lines covered by tests: 30/32 (93.75%) | Lost baseline coverage: none · Uncovered code

Full report · Diff report

@groeneai

Copy link
Copy Markdown
Contributor Author

CI finish ledger — 5aa9c52

After merging master to pick up the merged stress-harness fix #108489, CI fully finished green on this head. The only red check is the external private mirror sync.

Check / test Reason Owner / fixing PR
Sync private mirror CH Inc sync (private, not actionable)

The previously-tracked Stress test (arm_release) / S3_ERROR No such key line is resolved: fixing PR #108489 (ours, merged 2026-06-28) is now on this branch via the master merge.

Session id: cron:our-pr-ci-monitor:20260629-080000

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Logical error: 'Unexpected exception in refresh scheduling' (STID: 2508-3e7b)

4 participants