PR: support simple views with union by devcrafter · Pull Request #100958 · ClickHouse/ClickHouse · GitHub
Skip to content

PR: support simple views with union#100958

Merged
devcrafter merged 120 commits into
masterfrom
pr-view-with-union-pushdown
Apr 19, 2026
Merged

PR: support simple views with union#100958
devcrafter merged 120 commits into
masterfrom
pr-view-with-union-pushdown

Conversation

@devcrafter

@devcrafter devcrafter commented Mar 27, 2026

Copy link
Copy Markdown
Member

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Implements support of parallel replicas over simple views (including eligible UNION ALL views over MergeTree tables) when parallel_replicas_allow_view_over_mergetree=1. This allows to parallelize view's outer query instead of inner one which increases query parallelization across nodes.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Version info

  • Merged into: 26.4.1.1060

@clickhouse-gh

clickhouse-gh Bot commented Mar 27, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-performance Pull request with some performance improvements label Mar 27, 2026
Comment thread src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp Outdated
Comment thread src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp Outdated
Comment thread src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp Outdated
@devcrafter devcrafter force-pushed the pr-view-with-union-pushdown branch from cad6d23 to 3b133f2 Compare April 16, 2026 22:40
devcrafter and others added 2 commits April 17, 2026 15:20
QueryTreeNodePtr inner_query_tree;
try
{
inner_query_tree = buildQueryTree(inner_query_ast->clone(), context);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inner_query_tree could be cached to avoid building query tree and resolving it repeatedly during query execution

Comment thread src/Core/Settings.cpp
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@devcrafter devcrafter enabled auto-merge April 18, 2026 23:08
@devcrafter devcrafter disabled auto-merge April 18, 2026 23:09
devcrafter and others added 3 commits April 19, 2026 01:19
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@clickhouse-gh

clickhouse-gh Bot commented Apr 19, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.00% 84.00% +0.00%
Functions 91.10% 91.10% +0.00%
Branches 76.50% 76.50% +0.00%

Changed lines: 89.97% (538/598) | lost baseline coverage: 18 line(s) · Uncovered code

Full report · Diff report

@devcrafter devcrafter added this pull request to the merge queue Apr 19, 2026
Merged via the queue into master with commit 3a0767f Apr 19, 2026
159 of 161 checks passed
@devcrafter devcrafter deleted the pr-view-with-union-pushdown branch April 19, 2026 20:33
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Apr 19, 2026
pull Bot pushed a commit to AnotherGenZ/ClickHouse that referenced this pull request Apr 30, 2026
`test_parallel_replicas_with_view` (added in ClickHouse#100958, merged 2026-04-19)
asserts strict equality on the `ParallelReplicasUnavailableCount` and
`ParallelReplicasUsedCount` ProfileEvents collected from `system.query_log`.
The strict equality on `ParallelReplicasUnavailableCount` is racy.

Old replicas are filtered at connection time inside `RemoteQueryExecutor`'s
`create_connections` lambda: the version check disconnects entries whose
`parallel_replicas_version` is below
`DBMS_PARALLEL_REPLICAS_MIN_VERSION_WITH_STREAM_ID`. The
`ParallelReplicasUnavailableCount` ProfileEvent is only incremented later in
`RemoteQueryExecutor::sendQueryUnlocked`, when it observes the resulting empty
`MultiplexedConnections`. With `parallel_replicas_local_plan = 1` (the default)
the local plan can serve all required data quickly and cancel the pipeline,
in which case the `RemoteSource` for some old replicas may be cancelled
BEFORE `tryGenerate` ever calls `sendQuery` (see `RemoteSource::prepare`
short-circuit on `isCancelled`). For those replicas the version check never
runs and `ParallelReplicasUnavailableCount` is not incremented.

CIDB confirms the racy pattern. Last 14 days on master + PRs:
* 2026-04-29 PR ClickHouse#100412 — line 129 `parallel_replicas_allow_view_over_mergetree=0`,
  expected 4,2 got 0,2
* 2026-04-29 PR ClickHouse#100399 — line 129, expected 4,2 got 0,2
* 2026-04-27 master       — line 129, expected 4,2 got 1,2
* 2026-04-27 PR ClickHouse#102197   — line 125 `parallel_replicas_local_plan=1`, expected 2,1 got 0,1

In every case the sum result was correct and the `ParallelReplicasUsedCount`
matched the expected value — only the `ParallelReplicasUnavailableCount`
was below the strict-equality target.

This commit relaxes the assertion to:
* `ParallelReplicasUsedCount == expected_used` (kept strict — this is
  deterministic, and a higher-than-expected value would mean an old replica
  was actually used, which is the real bug we want to catch).
* `0 <= ParallelReplicasUnavailableCount <= expected_unavailable` — bounded
  above by the maximum possible (catches spurious unavailability) but
  tolerant of the cancel-before-`sendQuery` race.

The primary correctness check — `SELECT sum(value) FROM v` matches the
expected sum — is unchanged. If any old replica were ever actually used as
a replica, the sum would be wrong (or `ParallelReplicasUsedCount` would
exceed `expected_used`).

Refs:
- Test added in ClickHouse#100958
- Reports on which the analysis is based:
  * https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100412&sha=614dbdeb3cb9742d3f48e0f9bdb3b37a7146890e&name_0=PR&name_1=Integration%20tests%20%28arm_binary%2C%20distributed%20plan%2C%203%2F4%29
  * https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=529f4ede&name_0=MasterCI

Verified locally: the modified test passes (1 passed in 43.90s) under
`Integration tests (amd_tsan, 1/6)`.
pull Bot pushed a commit to Mu-L/ClickHouse that referenced this pull request Jun 15, 2026
… parent DB is missing

`DataLakeConfiguration::getCatalog` (introduced by ClickHouse#100334) looked up the parent
database in `DatabaseCatalog` and threw `LOGICAL_ERROR` ("Database X not found")
when `tryGetDatabase` returned `nullptr`. That assertion is wrong: a missing
database here is a transient runtime state, not a logical-invariant violation.

Concretely it can fire during async metadata loading after a server restart
(`AsyncLoader::worker` -> `DatabaseOrdinary::loadTableFromMetadata` ->
`createStorageObjectStorage` -> `getCatalog`) when an unrelated table-load
job in the same database has just thrown (for instance because of
`cannot_allocate_thread_fault_injection_probability`) and the database has
been detached as a result. Stress tests with thread-allocation fault
injection have been hitting this LOGICAL_ERROR sporadically: `STID 2377-2a78`,
3 distinct unrelated PRs over 90 days (PR ClickHouse#98472 on 2026-04-09, PR ClickHouse#100958
on 2026-04-12, PR ClickHouse#102804 on 2026-04-30 - none of which touch this code or
its callers).

Production stack from PR ClickHouse#102804 stress-test (amd_debug):

```
2026.04.30 05:17:39.895829 [ 6955 ] AsyncLoader::worker: Code: 439.
DB::Exception: Cannot schedule a task: fault injected (...): Cannot attach
table `test_1`.`test_max_size_drop` from metadata file ...
2026.04.30 05:17:40.099425 [ 6998 ] {} <Fatal> : Logical error:
'Database test_1 not found'.
[stack: DataLakeConfiguration::getCatalog -> createStorageObjectStorage
        -> registerStorageIceberg -> StorageFactory::get
        -> createTableFromAST -> DatabaseOrdinary::loadTableFromMetadata
        -> AsyncLoader::worker]
```

Fix: combine the two null-checks. `dynamic_pointer_cast` already returns
`nullptr` for a null input, so the function naturally returns `nullptr`
both for "DB not registered" and "DB is not `DataLakeCatalog`" - the same
response either way. This matches the behaviour of `getCatalog` before
ClickHouse#100334, restores backward compatibility for `Iceberg` engine tables
hosted in regular `Atomic`/`Ordinary` databases, and removes the
spurious LOGICAL_ERROR signal from stress-test reports without changing
behaviour for the supported `DataLakeCatalog` -> `Iceberg` path.

Local verification (debug build):
- Compiles, server starts.
- `CREATE TABLE iceberg_t ENGINE = IcebergLocal(...)` inside a regular
  `Atomic` database succeeds, DETACH/ATTACH database cycle succeeds,
  server restart with `async_load_databases=1` reloads the table without
  LOGICAL_ERROR.

Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=102804&sha=40e4eba7d14b8588106464e81b911e8de7a45dc6&name_0=PR&name_1=Stress%20test%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-performance Pull request with some performance improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants