devcrafter · 2024-10-24T14:24:19Z

Changelog category (leave one):

Not for changelog (changelog entry is not required)

Details

Randomizing this setting exercises code paths that can trigger hidden bugs, in particular in GLOBAL JOINs with parallel replicas. I discovered one such bug while working on #70658, in the 02967_parallel_replicas_joins_and_analyzer test.

clickhouse-gh · 2024-12-31T13:08:31Z

clickhouse-gh · 2025-02-11T13:13:21Z

Dear @antaljanosbenjamin, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

clickhouse-gh · 2025-07-27T04:10:07Z

Workflow [PR], commit [62da6d9]

Summary: ❌

job_name	test_name	status	info
Stateless tests (arm_asan_ubsan, targeted)		FAIL
	03452_array_join_global_right_join_parallel_replicas	FAIL	cidb
	01344_min_bytes_to_use_mmap_io_index	FAIL	cidb
	03279_pr_3_way_joins_right_first	FAIL	cidb
	03560_parallel_replicas_projection	FAIL	cidb
	03452_array_join_global_right_join_parallel_replicas	FAIL	cidb
	04071_global_in_dia_no_explicit_set_elements	FAIL	cidb
	04052_distributed_index_analysis_in_subquery_no_quadratic	FAIL	cidb
	03560_parallel_replicas_projection	FAIL	cidb
	03801_autopr_input_bytes_estimation_query_with_subqueries	FAIL	cidb
	Too many test failures	FAIL	cidb
Stateless tests (amd_asan_ubsan, distributed plan, parallel, 1/2)		FAIL
	02915_input_table_function_in_subquery	FAIL	cidb
	03031_filter_float64_logical_error	FAIL	cidb
Stateless tests (amd_asan_ubsan, distributed plan, parallel, 2/2)		FAIL
	03231_pr_duplicate_announcement	FAIL	cidb
	03254_pr_join_on_dups	FAIL	cidb
Stateless tests (amd_debug, parallel)		FAIL
	04266_text_index_tokens_cardinality_order	FAIL	cidb
	01169_old_alter_partition_isolation_stress	FAIL	cidb
	04052_distributed_index_analysis_in_subquery_no_quadratic	FAIL	cidb
	03275_pr_any_join	FAIL	cidb
Stateless tests (amd_tsan, parallel, 1/2)		FAIL
	00945_bloom_filter_index	FAIL	cidb
	03275_pr_any_join	FAIL	cidb
	03560_parallel_replicas_projection	FAIL	cidb
Stateless tests (amd_tsan, parallel, 2/2)		FAIL
	01168_mutations_isolation	FAIL	cidb
	04071_global_in_dia_no_explicit_set_elements	FAIL	cidb
	03800_autopr_reuse_index_analysis	FAIL	cidb
	01169_old_alter_partition_isolation_stress	FAIL	cidb
Stateless tests (arm_binary, parallel)		FAIL
	03560_parallel_replicas_projection	FAIL	cidb
	03275_pr_any_join	FAIL	cidb
	02731_parallel_replicas_join_subquery	FAIL	cidb
	01585_use_index_for_global_in_with_null	FAIL	cidb
	03261_pr_semi_anti_join	FAIL	cidb
	03279_pr_3_way_joins_right_first	FAIL	cidb
	01169_alter_partition_isolation_stress	FAIL	cidb
	04052_distributed_index_analysis_in_subquery_no_quadratic	FAIL	cidb
	03031_filter_float64_logical_error	FAIL	cidb
	01171_mv_select_insert_isolation_long	FAIL	cidb
Fast test (arm_darwin)		DROPPED
Build (arm_release)		DROPPED
Build (arm_darwin)		DROPPED

AI Review

Summary

This PR still makes the stateless randomized test harness globally generate parallel_replicas_min_number_of_rows_per_replica = 1. In the current code that value still combines with automatic_parallel_replicas_mode = 2 to force real parallel-replicas execution in normal randomized CI, so the change remains blocked on proving that path is green for the full suite or on gating it away from normal runs.

Missing context / blind spots

⚠️ The latest Praktika PR report for commit 62da6d95 is still pending/empty on July 4, 2026, so there is no fresh full-suite evidence after the latest master merge. A completed randomized PR run on this commit (or a descendant with the same behavior) would close that gap.

Findings

❌ Blockers

[dismissed by author, see #71028 (Randomize parallel_replicas_min_number_of_rows_per_replica), discussion_r3382610973] tests/clickhouse-test:1470, tests/clickhouse-test:1528, tests/clickhouse-test:1605-1618. The randomized-test contract is still violated: adjust_settings_for_autopr forces enable_parallel_replicas, cluster_for_parallel_replicas, and parallel_replicas_local_plan whenever automatic_parallel_replicas_mode randomizes to 2, and this PR now also feeds a nonzero parallel_replicas_min_number_of_rows_per_replica into that path. That means ordinary randomized stateless runs still exercise the real parallel-replicas planner globally, but the branch has no green PR report for commit 62da6d95, and the latest July 4, 2026 triage still expects unresolved wrong-result / WITH ROLLUP / index-analysis / max_rows_to_read failures from exactly this combination. I still consider this real because the code path is unchanged and the required proof of suite-green behavior is still missing.
Suggested fix: keep this generator at 0 for normal randomized CI, or gate it behind a dedicated opt-in / targeted exclusions until the remaining parallel-replicas correctness gaps are fixed and a full randomized PR run is green.

Final Verdict

Status: ⚠️ Request changes
Minimum required actions: either gate this randomization away from normal randomized CI, or land the remaining parallel-replicas fixes and show a green randomized PR report for this behavior before merge.
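
One way to read the "gate it behind a dedicated opt-in" option is sketched below. This is a minimal illustration, not the actual tests/clickhouse-test code: the environment variable name CLICKHOUSE_TEST_RANDOMIZE_PR_MIN_ROWS and all helper functions are hypothetical, and only the setting name and the random.randint(0, 1) generator come from this PR.

```python
import os
import random

# Hypothetical opt-in switch; this variable does not exist in tests/clickhouse-test.
OPT_IN_ENV = "CLICKHOUSE_TEST_RANDOMIZE_PR_MIN_ROWS"


def min_rows_per_replica_generator(environ=os.environ):
    """Return a generator for parallel_replicas_min_number_of_rows_per_replica.

    Without the opt-in the value is pinned to 0, so normal randomized runs
    are unaffected; with it, the value is randomized as in this PR.
    """
    if environ.get(OPT_IN_ENV) == "1":
        return lambda: random.randint(0, 1)
    return lambda: 0


def build_settings_randomizer(environ=os.environ):
    # Same shape as the randomizer entry in the diff: setting name -> callable.
    return {
        "parallel_replicas_min_number_of_rows_per_replica":
            min_rows_per_replica_generator(environ),
    }


def sample_settings(randomizer):
    # Draw one concrete value per setting for a single test run.
    return {name: generate() for name, generate in randomizer.items()}
```

With an empty environment, sample_settings(build_settings_randomizer({})) always yields 0 for the setting; passing {OPT_IN_ENV: "1"} restores the 0-or-1 randomization for a dedicated job.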

alexey-milovidov · 2025-07-28T03:59:06Z

@devcrafter, it shows errors.

clickhouse-gh · 2025-09-02T13:20:23Z

Dear @alexey-milovidov, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

alexey-milovidov

A very good change!

alexey-milovidov · 2025-10-18T01:16:07Z

@devcrafter, it failed.

clickhouse-gh · 2025-11-18T13:21:08Z

Dear @alexey-milovidov, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

alexey-milovidov · 2025-12-22T11:14:49Z

@devcrafter, more changes are needed.

clickhouse-gh · 2026-06-01T15:33:30Z

Dear @alexey-milovidov, you haven't been active on this PR for 30 days. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

…o-read-setting # Conflicts: # tests/clickhouse-test

alexey-milovidov · 2026-06-04T06:23:28Z

Merged current master into the branch (it was ~16k commits / 2 months behind, last updated 2026-03-28). The only conflict was a trivial one in tests/clickhouse-test (master fixed the "harmfull"→"harmful" typo on the adjacent line). Net change vs. master is unchanged in intent — the single randomizer line:

"parallel_replicas_min_number_of_rows_per_replica": lambda: random.randint(0, 1),

Pushed ace3dcab56f..28db0b7a689 to refresh CI on current master.

Note on the prior (2026-03-28) CI failures: I spot-checked two of them against a fresh master (26.6.1.1) and they still reproduce, so they are not stale:

02155_read_in_order_max_rows_to_read — with enable_parallel_replicas=1 + parallel_replicas_min_number_of_rows_per_replica=1, SELECT a FROM t_max_rows_to_read ORDER BY a LIMIT 5 SETTINGS max_rows_to_read = 12 throws TOO_MANY_ROWS (reads all 100 rows). The read-in-order + LIMIT row-count bound isn't applied on the parallel-replicas path.
03261_pr_semi_anti_join — the semi right / anti results are different data, not just reordered, when parallel_replicas_min_number_of_rows_per_replica=1 changes the number of replicas used. Looks like a real correctness issue in semi/anti joins on the parallel-replicas path.
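
The distinction between "different data" and "just reordered" can be checked by comparing two results as multisets of rows. The sketch below is illustrative only: the row values are made up and are not the actual output of 03261_pr_semi_anti_join, and the helper names are hypothetical.

```python
from collections import Counter


def same_rows_ignoring_order(left, right):
    """True if both results contain the same rows with the same multiplicities."""
    return Counter(left) == Counter(right)


def row_diff(left, right):
    """Rows (with counts) that appear only in the left or only in the right result."""
    left_counts, right_counts = Counter(left), Counter(right)
    return left_counts - right_counts, right_counts - left_counts


# Made-up rows standing in for a query result without and with parallel replicas.
baseline = [(1, "a"), (2, "b"), (2, "b"), (3, "c")]
reordered = [(3, "c"), (2, "b"), (1, "a"), (2, "b")]
different = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
```

Here same_rows_ignoring_order(baseline, reordered) holds, while row_diff(baseline, different) reports one missing (2, "b") row and one extra (4, "d") row, which is the kind of mismatch that sorting the output cannot explain.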

The randomizer line behaves as intended: a value of 1 forces real parallel-replica execution (overriding the statistics-only behavior of automatic_parallel_replicas_mode=2), which is exactly what surfaces these latent bugs. I deliberately did not tag/blacklist the affected tests, since that would hide the very bugs this change is meant to expose. The remaining failures look like genuine product issues for the parallel-replicas path rather than something to fix in this PR — flagging for your call on sequencing (fix-then-randomize vs. land-and-track).

…o-read-setting

alexey-milovidov · 2026-06-06T14:58:45Z

Merged master (the branch was only ~2 days behind, but red) to get a fresh CI signal on today's master.

The remaining CI red is not caused by the one-line diff itself — it is the parallel-replicas bugs this PR is meant to surface. The CI randomized-settings diagnosis points at parallel_replicas_min_number_of_rows_per_replica 1 (frequently together with the already-randomized automatic_parallel_replicas_mode 2). The failures split into two kinds:

A. Genuine server bugs (block merge, outside the scope of this test-only change):

03031_filter_float64_logical_error — WITH ROLLUP over an empty filtered set loses the totals row with parallel replicas (one 0\t7 row is dropped).
03279_pr_3_way_joins_right_first — a 3-way RIGHT … INNER JOIN returns different results with enable_parallel_replicas = 1 vs 0 (the test EXCEPTs the two and expects them equal).
considerEnablingParallelReplicas.cpp:359 — chassert(local_replica_plan_reading_step->getAnalyzedResult() == nullptr) fires → server abort ("Server died") in the debug build when the manual min-rows path and the automatic-parallel-replicas optimizer both apply to the same plan.
Several GLOBAL JOIN / array-join / projection cases in the tsan/arm reports (02731_parallel_replicas_join_subquery, 03452_array_join_global_right_join_parallel_replicas, 03560_parallel_replicas_projection, 04071_global_in_dia_no_explicit_set_elements, 01585_use_index_for_global_in, …) — exactly the class described in the PR motivation.

B. Single-node "measurement" tests that just need parallel replicas pinned off (contained test fixes):

04051_pk_analysis_stats and 04052_distributed_index_analysis_in_subquery_no_quadratic assert mark / index-analysis accounting that only holds for single-node reads; with parallel replicas the work is split across replicas and the accounting differs.

The (B) tests can be made robust by pinning enable_parallel_replicas = 0 in them, but that alone will not turn CI green — the (A) correctness/stability bugs remain and need fixing in the parallel-replicas code itself before this randomization can be enabled. Leaving those for the parallel-replicas owners since they are the purpose of this bug-finding PR rather than something to mask here.

…o-read-setting

alexey-milovidov · 2026-06-08T08:02:06Z

Re-merged current master into the branch (it was 392 commits behind, last pushed 2026-06-06) to refresh the CI signal on today's master. The diff vs. master is still the single randomizer line — no functional change.

I re-triaged the latest CI red and it matches the prior categorization; nothing new is regressed by this PR, the failures are exactly the latent parallel-replicas bugs that parallel_replicas_min_number_of_rows_per_replica 1 (forcing real parallel-replica execution) is meant to surface:

Genuine server bugs (block merge, owned by the parallel-replicas team, out of scope for this test-only change):

03031_filter_float64_logical_error — WITH ROLLUP over an empty filtered set loses the totals row (one 0\t7 row dropped). Culprit minimized to --parallel_replicas_min_number_of_rows_per_replica 1.
03279_pr_3_way_joins_right_first, 03275_pr_any_join, 03254_pr_join_on_dups — different results with parallel replicas (correctness in semi/anti/any/right joins on the PR path).
considerEnablingParallelReplicas.cpp:359 chassert(local_replica_plan_reading_step->getAnalyzedResult() == nullptr) → server abort in debug/tsan ("Server died").
03560_parallel_replicas_projection, 03452_array_join_global_right_join_parallel_replicas — the GLOBAL JOIN / array-join / projection class described in the PR motivation.
The transaction-stress tests (01169/01171/01173/01174) fail as a cascade of the server aborts above.

Index-analysis / "measurement" tests — divergence is itself likely a real behavioral gap, so NOT safe to pin:

04052_distributed_index_analysis_in_subquery_no_quadratic — "Expected 3-4 queries, got 5". With parallel_replicas_min_number_of_rows_per_replica 1 the nested index-analysis subquery (mergeTreeAnalyzeIndexesUUID(..., in(key, (SELECT ...)))) itself goes distributed, which distributed_index_analysis_only_on_coordinator is supposed to suppress. This looks like the coordinator-only restriction not fully disabling parallel replicas for nested index analysis — i.e. a real finding, not a test artifact.
03800_autopr_reuse_index_analysis, 03801_autopr_input_bytes_estimation_query_with_subqueries — same family; forcing the PR path perturbs the index-analysis-reuse / byte-estimation assertions.

Deliberately did not tag/blacklist or pin parallel replicas off in any of these — pinning the index-analysis tests would mask exactly the kind of gap this PR is designed to expose. None of the remaining red is fixable inside this one-line test change; it needs the underlying parallel-replicas bugs fixed first. Flagging for your sequencing call (fix-then-randomize vs. land-and-track) — the failures are product bugs for the parallel-replicas owners rather than something to mask here.

…o-read-setting

clickhouse-gh · 2026-06-09T17:26:46Z

        "max_parsing_threads": lambda: random.choice([0, 1, 10]),
        "optimize_functions_to_subcolumns": lambda: random.randint(0, 1),
        "parallel_replicas_local_plan": lambda: random.randint(0, 1),
+        "parallel_replicas_min_number_of_rows_per_replica": lambda: random.randint(0, 1),


Randomizing this globally violates the randomized-test invariant that generated settings must still keep the stateless suite green. With automatic_parallel_replicas_mode = 2, adjust_settings_for_autopr forces enable_parallel_replicas, cluster_for_parallel_replicas, and parallel_replicas_local_plan; when this new setting is 1, the analyzer still builds the real parallel-replicas plan because buildQueryPlanForAutomaticParallelReplicas keeps this setting and only clears automatic_parallel_replicas_mode inside the candidate plan. The PR discussion already lists failures from that combination, including wrong results and the considerEnablingParallelReplicas assertion. Please keep this at 0 until those parallel-replica bugs are fixed, or gate the randomization behind a dedicated opt-in or affected-test exclusions so normal randomized CI stays green.

Acknowledged — this is exactly the open sequencing question discussed in the PR-level comments (fix-then-randomize vs. land-and-track). The red CI here is deliberate: the failures are the latent parallel-replicas bugs this randomization is designed to surface (see the triage comments for the categorized list, e.g. lost WITH ROLLUP totals, wrong semi/anti/any join results, the considerEnablingParallelReplicas.cpp:359 assertion). Gating or excluding the affected tests would mask the very signal the PR exists to produce, so the deliberate choice so far has been not to do that. The PR will not be merged while these product bugs are unfixed — it stays open as the tracking/bug-finding vehicle until the parallel-replicas owners fix them, or until we decide to gate it.
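
The interaction discussed in this thread can be modelled in a few lines. This is a simplified sketch based only on the description in the review comment, not a copy of tests/clickhouse-test: the function names, the forced values, and the cluster name are assumptions made for illustration.

```python
def adjust_settings_for_autopr_sketch(settings):
    """Model of the adjustment described in the review comment.

    When automatic_parallel_replicas_mode is 2, the parallel-replicas
    settings are forced on. The concrete values below are assumptions.
    """
    adjusted = dict(settings)
    if adjusted.get("automatic_parallel_replicas_mode") == 2:
        adjusted["enable_parallel_replicas"] = 1
        adjusted["cluster_for_parallel_replicas"] = "example_cluster"  # hypothetical name
        adjusted["parallel_replicas_local_plan"] = 1
    return adjusted


def forces_real_parallel_replicas(settings):
    """Per the discussion: mode 2 plus a nonzero min-rows value runs the real plan."""
    adjusted = adjust_settings_for_autopr_sketch(settings)
    return (
        adjusted.get("enable_parallel_replicas") == 1
        and adjusted.get("parallel_replicas_min_number_of_rows_per_replica", 0) > 0
    )
```

Under this model, only the combination of automatic_parallel_replicas_mode = 2 and a min-rows value of 1 reaches the real parallel-replicas path, which matches the combination the CI diagnosis keeps pointing at.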

alexey-milovidov · 2026-06-11T11:26:54Z

@groeneai, investigate the failure: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=71028&sha=849e3ff013f70628c8daad3a4fda355753bc580b&name_0=PR&name_1=Stateless%20tests%20%28arm_binary%2C%20parallel%29 and provide a fix in a separate PR. If the fix is already in progress, link it here.

The failure is Logical error: Expected 3 to 10 arguments in table function azureBlobStorage, got 1 (STID: 3574-4812, also 2508-4994 in the amd_asan_ubsan run). It is not caused by this PR: CIDB shows it failing across dozens of unrelated PRs and on master itself every day for at least the last week. The stack goes through buildQueryPlanForAutomaticParallelReplicas → considerEnablingParallelReplicas, i.e. the already-default-randomized automatic_parallel_replicas_mode path — it looks like table-function argument rewriting (likely secret masking) corrupts the azureBlobStorage AST when the automatic-parallel-replicas candidate plan is built. I did not find an open issue tracking it.

alexey-milovidov · 2026-06-11T11:27:12Z

Re-triaged the latest CI red (commit 849e3ff, run of 2026-06-09). No re-merge of master this time: the branch is only 2 days behind, and none of the underlying parallel-replicas bugs have been fixed on master since (no changes to considerEnablingParallelReplicas.cpp or the related planner code), so a refresh would reproduce the same red. The diff vs. master is still the single randomizer line.

The failures match the prior categorization, with three new members of the same classes (all reproduce only with parallel_replicas_min_number_of_rows_per_replica 1 in the randomized settings):

00945_bloom_filter_index — TOO_MANY_ROWS (max rows: 7, current rows: 8): the max_rows_to_read bound is not honored on the parallel-replicas path. Same class as the previously reported 02155_read_in_order_max_rows_to_read.
04065_direct_join_column_const_mismatch — NOT_IMPLEMENTED "Can't execute any of specified algorithms for specified strictness/kind and right storage type" from JoinStepLogicalLookup::optimize when parallel replicas are forced onto a direct join.
04098_asterisk_include_virtual_columns_mergetree: NOT_FOUND_COLUMN_IN_BLOCK for _part, already tracked in #106561 (NOT_FOUND_COLUMN_IN_BLOCK with asterisk_include_virtual_columns and parallel replicas).

Independent confirmation that these are latent product bugs rather than an artifact of this diff: #106734 ("More settings to randomize") hits the exact same set of tests (03031, 03275_pr_any_join, 02155, 04065, 04098, 01171, 03801) in its CI.

Still present from the previous rounds: 03031_filter_float64_logical_error (lost WITH ROLLUP totals row), 03275_pr_any_join (wrong results), 04052 / 03801 (index-analysis divergence — likely a real behavioral gap, not pinned on purpose), the Server died aborts (consistent with the considerEnablingParallelReplicas.cpp:359 chassert), and the 01171 transaction-test cascade.

One failure is unrelated to this PR — the azureBlobStorage "Expected 3 to 10 arguments" logical error fails across many PRs and on master daily; pinged @groeneai above.

Status unchanged: the remaining red is the latent parallel-replicas bugs this PR is designed to surface, and the sequencing call (fix-then-randomize vs. land-and-track) is still open.

groeneai · 2026-06-11T13:45:58Z

Investigated. This Logical error: Expected 3 to 10 arguments in table function azureBlobStorage, got 1 (STID 3574-4812 / 2508-4994) is a CI scanner false positive, not a server bug, and it is not caused by this PR. Fix in a separate PR: #107220.

Root cause: the stateless test 04119_object_storage_bad_arg_count_no_logical_error (added by #103544, merged 2026-06-09) is a regression test whose queries throw on purpose. On a thrown query, executeQuery logs the full query text at Error level via toOneLineQuery, which keeps comment tokens verbatim, so the test's leading comment lands in clickhouse-server.err.log. That comment quoted the original pre-fix crash message word for word. ci/jobs/scripts/log_parser.py greps the server log with rg --text -A 10 -o 'Logical error.*', matches the comment, takes its first line as the failure title and a nearby unrelated stack for the STID (hence the buildQueryPlanForAutomaticParallelReplicas stack you saw, which has nothing to do with azureBlobStorage arg parsing).

It started exactly on 2026-06-09 because #103544 made the server throw NUMBER_OF_ARGUMENTS_DOESNT_MATCH instead of LOGICAL_ERROR, so the comment became the only remaining source of the string, on master and across every PR that runs the test. So your secret-masking hypothesis isn't it; the AST is fine. (log_parser.py already documents this SQL-comment false-match class in a comment near the grep.)

Reproduced locally: feeding the comment + SELECT * FROM azureBlobStorage();, the query correctly throws Code 42 NUMBER_OF_ARGUMENTS_DOESNT_MATCH, yet the scanner grep matches the comment. After rewording the comment to drop the verbatim message, the grep no longer matches and the test still passes. #107220 is test-only.
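
The false match can be illustrated with the same pattern the scanner greps for. This is a minimal sketch, not the code of ci/jobs/scripts/log_parser.py: the scan_log helper and both sample query texts are hypothetical, and the comment wording is not the test's actual text.

```python
import re

# Same pattern as the scanner's grep quoted above.
LOGICAL_ERROR_RE = re.compile(r"Logical error.*")


def scan_log(lines):
    """Return the matched text for every line the pattern hits."""
    return [m.group(0) for line in lines if (m := LOGICAL_ERROR_RE.search(line))]


# Hypothetical one-line query texts as they might appear in the error log;
# the comment wording is illustrative, not the test's actual text.
verbatim_comment = (
    "/* Used to fail with Logical error: Expected 3 to 10 arguments in table "
    "function azureBlobStorage, got 1 */ SELECT * FROM azureBlobStorage();"
)
reworded_comment = (
    "/* Used to fail with an internal error about the argument count */ "
    "SELECT * FROM azureBlobStorage();"
)
```

scan_log([verbatim_comment]) returns one hit even though the query itself only throws NUMBER_OF_ARGUMENTS_DOESNT_MATCH, while scan_log([reworded_comment]) returns nothing, which is why rewording the comment is enough.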

…o-read-setting

alexey-milovidov · 2026-06-13T18:35:24Z

Re-merged current master into the branch (it was 1333 commits behind, last pushed 2026-06-09 / red). The diff vs. master is still the single randomizer line — no functional change.

The reason to re-merge now (vs. the 2026-06-11/12 triage, where re-merging would have just re-rolled the same red): the parallel-replicas owners have started landing fixes for the bugs this randomizer surfaces. Two directly relevant commits authored by @devcrafter are now on master and in this branch:

39d95a60cea — Test + fix for parallel_replicas_min_number_of_rows_per_replica > 0 (the exact setting this PR randomizes), plus regression test 04327_pr_view_union_empty_branch.
817d10bbadd — Fix: PR over view with UNION with empty table.

Re-running CI on today's master will refresh the signal and show which of the previously-red tests these fixes have cleared, narrowing the remaining set of genuine product bugs that still block merge. The PR remains intentionally open as the bug-finding / tracking vehicle until the remaining parallel-replicas bugs are fixed or a sequencing decision is made — no pinning/blacklisting.

…o-read-setting

alexey-milovidov · 2026-06-20T09:30:08Z

Re-merged current master into the branch (it was 2374 commits behind, last pushed 2026-06-13 / red). The diff vs. master is still the single randomizer line — no functional change.

The reason to re-merge now (vs. the 2026-06-14..18 triage, where re-merging would have just re-rolled the same red): another directly relevant parallel-replicas JOIN fix has landed on master and is now in this branch:

#106338 — Restrict parallel-replicas join absorption to the leftmost leaf (merged 2026-06-19). It reworks the parallel-replicas + JOIN planner contract in PlannerJoinTree.cpp (the same file as @devcrafter's earlier fixes) so the leftmost leaf explicitly drives join absorption, fixing the empty-left-leaf case that produced LOGICAL_ERROR: Left and right columns have same names.

Re-running CI on today's master will refresh the signal and show which of the previously-red join tests (03254_pr_join_on_dups, 03261_pr_semi_anti_join, 03275_pr_any_join, …) this fix has cleared, narrowing the remaining set of genuine product bugs that still block merge.

The server-abort path (considerEnablingParallelReplicas.cpp chassert(local_replica_plan_reading_step->getAnalyzedResult() == nullptr) → "Server died") is unchanged on master, so that one and the index-analysis / WITH ROLLUP divergences are expected to remain red.

The PR remains intentionally open as the bug-finding / tracking vehicle until the remaining parallel-replicas bugs are fixed or a sequencing decision is made — no pinning/blacklisting.

…o-read-setting

The branch was ~11.7k commits behind (last merge 2026-06-20) and red. The diff vs. `master` is still the single randomizer line, with no functional change:

"parallel_replicas_min_number_of_rows_per_replica": lambda: random.randint(0, 1),

Reason to re-merge now: the parallel-replicas owners have landed several directly relevant fixes since 2026-06-20, so refreshing CI on today's `master` will narrow the remaining set of genuine product bugs this PR is meant to surface:

- #108451 (`Fix NOT_FOUND_COLUMN_IN_BLOCK for virtual columns under parallel replicas`, closes #106561): should clear the tracked `04098_asterisk_include_virtual_columns_mergetree` failure.
- #101434 (`Reimplement reading in order for parallel replicas`): bears directly on the `max_rows_to_read`-not-honored class (`02155_read_in_order_max_rows_to_read`, `00945_bloom_filter_index`).
- #109003 (`Fix server abort on GROUPING SETS in a set operation with parallel replicas`): a "Server died" class fix.
- Flaky-test fix for `04051_pk_analysis_stats`.

Conflicts were all in files the branch does not intentionally change (its only intended change is the one `tests/clickhouse-test` line); they were resolved by taking `master`'s version. No pinning/blacklisting of the affected parallel-replicas tests, since that would mask the very signal this PR exists to produce.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

alexey-milovidov · 2026-07-02T23:46:23Z

Re-merged current master into the branch (it was ~11.7k commits behind, last pushed 2026-06-20 / red). The diff vs. master is still the single randomizer line — no functional change.

Reason to re-merge now (vs. just re-rolling the same red): the parallel-replicas owners have landed several directly relevant fixes since 2026-06-20, so refreshing CI on today's master will narrow the remaining set of genuine product bugs that still block merge:

#108451 — Fix NOT_FOUND_COLUMN_IN_BLOCK for virtual columns under parallel replicas (Closes #106561). This is the exact failure previously triaged as 04098_asterisk_include_virtual_columns_mergetree (_part virtual column), so that one should now clear.
#101434 — Reimplement reading in order for parallel replicas (nickitat). Directly reworks the read-in-order + parallel-replicas path, i.e. the max_rows_to_read-not-honored class previously reported as 02155_read_in_order_max_rows_to_read and 00945_bloom_filter_index (TOO_MANY_ROWS) — the refreshed run will show whether the row-count bound is now applied.
#109003 — Fix server abort on GROUPING SETS in a set operation with parallel replicas. Clears that "Server died" class.
Flaky-test fix for 04051_pk_analysis_stats.

Expected to remain red (no relevant change on master):

The considerEnablingParallelReplicas.cpp:359 chassert(local_replica_plan_reading_step->getAnalyzedResult() == nullptr) → "Server died" path is unchanged on master (the assertion is still there), so that abort and its transaction-stress cascade (01169/01171/01173/01174) should persist.
03031_filter_float64_logical_error (lost WITH ROLLUP totals row) and the semi/anti/any/right-join correctness cases (03254_pr_join_on_dups, 03261_pr_semi_anti_join, 03275_pr_any_join, 03279_pr_3_way_joins_right_first).
The index-analysis divergences (04052_distributed_index_analysis_in_subquery_no_quadratic, 03801_autopr_input_bytes_estimation_query_with_subqueries) — likely a real behavioral gap, deliberately not pinned.

As before, I did not tag/blacklist or pin parallel replicas off in any test — pinning would mask exactly the signal this PR is designed to expose. The PR remains intentionally open as the bug-finding / tracking vehicle until the remaining parallel-replicas bugs are fixed or a sequencing decision is made.

…rows-to-read-setting

alexey-milovidov · 2026-07-04T21:29:46Z

Re-merged current master into the branch (it was ~638 commits behind, last pushed 2026-07-02 / red). The diff vs. master is still the single randomizer line — no functional change.

Reason to re-merge now (vs. just re-rolling the same red): the exact server-abort path I flagged in the 2026-07-02 triage as "unchanged on master (the assertion is still there)" has now been fixed on master:

#109011: Fix logical error in automatic parallel replicas with parallel_replicas_min_number_of_rows_per_replica (merged 2026-07-04, authored by @groeneai; Related: #71028). It removes the considerEnablingParallelReplicas.cpp chassert(local_replica_plan_reading_step->getAnalyzedResult() == nullptr) abort. The assertion only held for parallel_replicas_min_number_of_rows_per_replica == 0; with the value 1 that this randomizer forces (together with the already-randomized automatic_parallel_replicas_mode), canUseParallelReplicasOnInitiator() in PlannerJoinTree.cpp runs index analysis on the reading step to estimate the replica count, so that step already carries an analyzed result when it is transplanted into the parallel-replicas plan → abort. This is exactly the class this PR was surfacing.

Refreshing CI on today's master should therefore clear the "Server died" aborts on the considerEnablingParallelReplicas.cpp path and their transaction-stress cascade (01169/01171/01173/01174).

Expected to remain red (no relevant change on master):

The semi/anti/any/right-join correctness cases (03254_pr_join_on_dups, 03261_pr_semi_anti_join, 03275_pr_any_join, 03279_pr_3_way_joins_right_first) and the GLOBAL JOIN / array-join class (03452_array_join_global_right_join_parallel_replicas, 04071_global_in_dia_no_explicit_set_elements): these are gated on #106073 (Support JOIN queries with AutoPR), still open.
03031_filter_float64_logical_error (lost WITH ROLLUP totals row).
The index-analysis / estimation divergences (04052_distributed_index_analysis_in_subquery_no_quadratic, 03801_autopr_input_bytes_estimation_query_...) and the exact max_rows_to_read accounting cases (00945_bloom_filter_index, 01585_use_index_for_global_in, 02155_read_in_order_max_rows_to_read).

The PR remains intentionally open as the bug-finding / tracking vehicle until the remaining parallel-replicas bugs are fixed or a sequencing decision is made — no pinning/blacklisting.

Randomize parallel_replicas_min_number_of_rows_per_replica

f08f229

robot-ch-test-poll4 added the pr-not-for-changelog label Oct 24, 2024

Automatic style fix

4eefa53

antaljanosbenjamin self-assigned this Oct 25, 2024

antaljanosbenjamin reviewed Oct 25, 2024

Comment thread tests/clickhouse-test Outdated

antaljanosbenjamin approved these changes Oct 28, 2024

Merge branch 'master' into pr-randomize-rows-to-read-setting

510154e

clickhouse-gh Bot unassigned antaljanosbenjamin Dec 31, 2024

antaljanosbenjamin self-assigned this Jan 6, 2025

devcrafter added 2 commits January 7, 2025 22:21

Merge branch 'master' into pr-randomize-rows-to-read-setting

18c86dc

Merge branch 'master' into pr-randomize-rows-to-read-setting

fbcb53e

clickhouse-gh Bot unassigned antaljanosbenjamin Feb 11, 2025

alexey-milovidov approved these changes Jul 27, 2025

Merge branch 'master' into pr-randomize-rows-to-read-setting

639d43d

alexey-milovidov self-assigned this Jul 27, 2025

Merge branch 'master' into pr-randomize-rows-to-read-setting

8c20de0

clickhouse-gh Bot unassigned alexey-milovidov Sep 2, 2025

Merge branch 'master' into pr-randomize-rows-to-read-setting

8956bce

alexey-milovidov self-assigned this Oct 17, 2025

alexey-milovidov approved these changes Oct 17, 2025

Merge branch 'master' into pr-randomize-rows-to-read-setting

76032c8

ClickHouse deleted a comment from robot-clickhouse-ci-2 Dec 22, 2025

devcrafter and others added 2 commits January 7, 2026 12:26

Merge branch 'master' into pr-randomize-rows-to-read-setting

07c3fe0

Merge branch 'master' into pr-randomize-rows-to-read-setting

ace3dca

clickhouse-gh Bot assigned antaljanosbenjamin and unassigned alexey-milovidov Jun 1, 2026

Merge remote-tracking branch 'origin/master' into pr-randomize-rows-t…

28db0b7

…o-read-setting # Conflicts: # tests/clickhouse-test

Merge remote-tracking branch 'origin/master' into pr-randomize-rows-t…

d7b4a63

…o-read-setting

Merge remote-tracking branch 'origin/master' into pr-randomize-rows-t…

ddb8c9b

…o-read-setting

Merge remote-tracking branch 'origin/master' into pr-randomize-rows-t…

849e3ff

…o-read-setting

clickhouse-gh Bot reviewed Jun 9, 2026

groeneai mentioned this pull request Jun 11, 2026

Stop test 04119 from tripping the CI logical-error log scanner #107220

Merged

groeneai mentioned this pull request Jun 12, 2026

Fix flaky test 04051_pk_analysis_stats under parallel replicas #107306

Open

Merge remote-tracking branch 'origin/master' into pr-randomize-rows-t…

33236dd

…o-read-setting

Merge remote-tracking branch 'origin/master' into pr-randomize-rows-t…

8e1dbc2

…o-read-setting

groeneai mentioned this pull request Jun 30, 2026

Fix exception in correlated subquery with group_by_use_nulls + ROLLUP/CUBE #100365

Draft

1 task

alexey-milovidov mentioned this pull request Jul 1, 2026

Enable snappy compression in HTTP interface #100752

Open

1 task

groeneai mentioned this pull request Jul 1, 2026

Fix logical error in automatic parallel replicas with min_number_of_rows_per_replica #109011

Merged

Merge remote-tracking branch 'origin-https/master' into pr-randomize-…

62da6d9

…rows-to-read-setting
