Add test for #70356 #104551
Closes #70356.
Add queries against a Distributed table backed by `test_cluster_two_shards_localhost` to broaden coverage of the `NOT_FOUND_COLUMN_IN_BLOCK` regression test.
…0356`
If a previous run aborts before the trailing `DROP TABLE`, a rerun fails with `TABLE_ALREADY_EXISTS`. Clean up the table next to the other drops at the start of the test.
…two-shard queries
The two-shard distributed queries against `test_cluster_two_shards_localhost` were non-deterministic under randomized session settings on the flaky check. Both shards of that cluster resolve to the same backing `shard_table_70356`, so each shard returns the same row. When the randomizer enables both `optimize_skip_unused_shards` and `optimize_distributed_group_by_sharding_key`, the coordinator-side GROUP BY merge is skipped (the GROUP BY column matches the sharding key `sipHash64(adid)`), and the result becomes two duplicate rows per query instead of one.
Pin `optimize_distributed_group_by_sharding_key = 0` in the two-shard queries to keep the coordinator merge stable regardless of randomized session settings.
The regression scenario (`NOT_FOUND_COLUMN_IN_BLOCK` under the analyzer) is unaffected because it is a query-analysis bug, not an execution-time deduplication concern.
Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104551&sha=f26cca514c8fff2c932dad6a35f9f031122c8419&name_0=PR
PR: #104551
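The duplicate-row effect described in the two-shard commit can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions: `queryShard`, `mergeOnCoordinator`, and `passThrough` are hypothetical names invented for illustration, the row contents are made up, and none of this is ClickHouse code. It only models two shards that return the same partial result, with and without a coordinator-side merge.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One partially aggregated row as a shard might return it: (group key, count).
// The values are illustrative, not taken from the real test.
using Row = std::pair<std::string, std::uint64_t>;

// Hypothetical stand-in for a shard. Both "shards" read the same backing
// table, so every call returns the same row.
std::vector<Row> queryShard()
{
    return {{"adid-1", 1}};
}

// Coordinator merges partial results by key, so equal keys collapse into one row.
std::vector<Row> mergeOnCoordinator(const std::vector<std::vector<Row>> & shard_results)
{
    std::map<std::string, std::uint64_t> merged;
    for (const auto & rows : shard_results)
        for (const auto & row : rows)
            merged[row.first] += row.second;
    return std::vector<Row>(merged.begin(), merged.end());
}

// Coordinator skips the merge and concatenates shard output, which is only
// safe when each key lives on exactly one shard.
std::vector<Row> passThrough(const std::vector<std::vector<Row>> & shard_results)
{
    std::vector<Row> out;
    for (const auto & rows : shard_results)
        out.insert(out.end(), rows.begin(), rows.end());
    return out;
}

// Small self-check: merged output has one row, pass-through has two equal rows.
void checkDuplicateRows()
{
    const std::vector<std::vector<Row>> shards = {queryShard(), queryShard()};
    assert(mergeOnCoordinator(shards).size() == 1);
    const std::vector<Row> unmerged = passThrough(shards);
    assert(unmerged.size() == 2);
    assert(unmerged[0] == unmerged[1]);
}
```

In this model, pinning the setting to 0 corresponds to always taking the `mergeOnCoordinator` path, which yields one row no matter how many shards return the same key.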
The original repro from #70356 used `GROUP BY 1` under the analyzer without `ORDER BY`, with `FORMAT Null`. The existing variants all add `ORDER BY id`, which changes the tail of the plan and can miss bugs specific to the original no-sort path. Add `FORMAT Null` variants matching the original failure shape for both the single-shard and two-shard distributed tables.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@groeneai, please investigate the following CI failures on this PR (the PR only adds the stateless test
Please provide fixes (or link existing fix PRs if work is already in progress) in separate PRs.
@alexey-milovidov, triage results on the four CI failures (commit 1. 2. 3. 4.
Root cause for item 4
Crash trace (from Path:
The fuzzer harness (
Cross-PR scope (CIDB, 60 days)
The 2026-05 cluster (4 hits across 4 unrelated PRs in 3 days) matches the new fuzzer-corpus mutation that reaches the
Fix plan (separate PR)
Make
I will open a small follow-up PR with this change plus a regression test (
Tracking on our side: task
Context::clearCaches threw a LOGICAL_ERROR ("X cache was not created yet.")
on the first null cache pointer it encountered. Production servers always
initialize all caches at startup, so the throws never fired in production -
but the execute_query_fuzzer libFuzzer harness (and any unit-test harness
built around a minimal Context) creates a Context without calling any
set*Cache initializer. When a fuzzed CREATE TABLE failed validateStorage,
the cleanup path (MergeTreeData::dropAllData -> Context::clearCaches)
tripped the assertion and libFuzzer reported a crash with the trace
documented on PR ClickHouse#104551.
The fix replaces each "null cache -> throw" guard with a defensive
"if (cache) cache->clear()" check, matching the pattern already used by
every single-cache clear<X>Cache method on Context (clearUncompressedCache,
clearMarkCache, clearPrimaryIndexCache, and so on - 14 sibling methods,
all defensive). clearCaches was the lone outlier.
Adds a regression test in src/Interpreters/tests/gtest_context_clear_caches.cpp
that copies the global test Context (whose caches are all null because
gtest_global_context.cpp never sets them) and calls clearCaches twice -
both calls must complete without throwing. Verified that the test fails
against the unmodified source with "Logical error: 'Uncompressed cache
was not created yet.'" and passes once the defensive null checks are in
place.
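
The change and the regression check described above can be sketched with a toy model. This is a minimal illustration under stated assumptions, not the actual patch: `ToyCache`, `ToyContext`, `clearCachesThrowing`, `clearCachesDefensive`, and `throwsLogicError` are hypothetical names, `std::logic_error` stands in for the `LOGICAL_ERROR` exception, and the real `Context` holds many more caches than the two shown here.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a cache; only clear() matters for this sketch.
struct ToyCache
{
    int entries = 0;
    void clear() { entries = 0; }
};

// Hypothetical minimal context with two optional caches (illustrative names).
struct ToyContext
{
    std::shared_ptr<ToyCache> uncompressed_cache;
    std::shared_ptr<ToyCache> mark_cache;

    // Old shape: the first null cache aborts the whole cleanup with an exception.
    void clearCachesThrowing() const
    {
        if (!uncompressed_cache)
            throw std::logic_error("Uncompressed cache was not created yet.");
        uncompressed_cache->clear();
        if (!mark_cache)
            throw std::logic_error("Mark cache was not created yet.");
        mark_cache->clear();
    }

    // New shape: a cache that was never created is skipped, the rest are cleared.
    void clearCachesDefensive() const
    {
        if (uncompressed_cache)
            uncompressed_cache->clear();
        if (mark_cache)
            mark_cache->clear();
    }
};

// Returns true if the throwing variant raised std::logic_error for this context.
bool throwsLogicError(const ToyContext & ctx)
{
    try
    {
        ctx.clearCachesThrowing();
    }
    catch (const std::logic_error &)
    {
        return true;
    }
    return false;
}

// Mirrors the regression test's shape: a context with no caches must survive
// two consecutive clears, and an initialized cache must still be cleared.
void checkDefensiveClear()
{
    ToyContext empty;
    assert(throwsLogicError(empty));
    empty.clearCachesDefensive();
    empty.clearCachesDefensive();

    ToyContext partial;
    partial.mark_cache = std::make_shared<ToyCache>();
    partial.mark_cache->entries = 3;
    partial.clearCachesDefensive();
    assert(partial.mark_cache->entries == 0);
}
```

The defensive variant keeps cleanup paths usable from harnesses that never initialize caches, at the cost of no longer flagging a missing cache as an error.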
See ClickHouse#104551 (comment)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add a regression test for `NOT_FOUND_COLUMN_IN_BLOCK` on distributed table queries under the analyzer, reported in #70356. The bug no longer reproduces on master.
Closes #70356.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
...
Documentation entry for user-facing changes
Version info
26.5.1.646