Cherry pick #108488 to 25.8: Do not cache query condition results for sampled reads by robot-ch-test-poll3 · Pull Request #109146 · ClickHouse/ClickHouse · GitHub

Cherry pick #108488 to 25.8: Do not cache query condition results for sampled reads #109146

Closed
robot-ch-test-poll3 wants to merge 3 commits into backport/25.8/108488 from cherrypick/25.8/108488

Conversation

@robot-ch-test-poll3

Contributor

Original pull-request #108488

Do not merge this PR manually

This pull request is the first step of an automated backport.
It contains changes equivalent to running git cherry-pick locally.
If you intend to continue backporting the changes, then resolve all conflicts if any.
Otherwise, if you do not want to backport them, then just close this pull-request.

The check results do not matter at this step - you can safely ignore them.

Troubleshooting

If the conflicts were resolved incorrectly

If this cherry-pick PR has been broken by an incorrect conflict resolution and you want to recreate it:

  • delete the pr-cherrypick label from the PR
  • delete this branch from the repository

You also need to check the original pull request for the pr-backports-created label, and delete it if it is present there.

The PR source

The PR is created in the CI job

groeneai and others added 3 commits July 1, 2026 21:21
The query condition cache key encodes only the WHERE/PREWHERE predicate,
not the SAMPLE clause. A `SELECT ... SAMPLE x WHERE cond` query reads only
the marks the sampling key selects, then records that sampling-narrowed
mark mask under the predicate hash. A later non-sampled query with the same
predicate reuses the under-counted mask, skips the marks SAMPLE excluded,
and silently returns too few rows.
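
The failure mode described above can be reproduced in a toy model. The sketch below is not ClickHouse code: `read_buggy`, the list-of-lists representation of marks, and the use of a set of mark indexes to stand in for the SAMPLE clause are all assumptions made for illustration. It only shows how a cache keyed by the predicate alone ends up serving a sampling-narrowed mask to a non-sampled query.

```python
from typing import Callable, Dict, List, Optional, Set

Mark = List[int]  # hypothetical: one mark (granule) modelled as a list of rows


def read_buggy(
    marks: List[Mark],
    pred_key: str,
    pred: Callable[[int], bool],
    cache: Dict[str, List[bool]],
    sample_marks: Optional[Set[int]] = None,
) -> List[int]:
    """Toy read path. The cache key is the predicate only, not the sampling."""
    cached = cache.get(pred_key)
    rows: List[int] = []
    mask: List[bool] = []
    for i, mark in enumerate(marks):
        if cached is not None and not cached[i]:
            mask.append(False)  # cache says "no match here": skip the mark
            continue
        if sample_marks is not None and i not in sample_marks:
            mask.append(False)  # never read, yet recorded as "no match"
            continue
        hits = [r for r in mark if pred(r)]
        mask.append(bool(hits))
        rows.extend(hits)
    cache[pred_key] = mask  # the bug: written even when sampling narrowed the read
    return rows


marks = [[1, 2], [3, 4], [5, 6], [7, 8]]
cache: Dict[str, List[bool]] = {}
pred = lambda r: r > 2

sampled_rows = read_buggy(marks, "r > 2", pred, cache, sample_marks={1})
full_rows = read_buggy(marks, "r > 2", pred, cache)  # same predicate, no sampling
```

In this model the second call returns only the rows of mark 1, because the mask left behind by the sampled read marks every other granule as predicate-false.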

Disable the query condition cache (both the index-analysis write and the
runtime PREWHERE/WHERE write) whenever the read uses sampling, mirroring
the existing FINAL / bucket_id guards in the disable-cascade.
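
As a rough illustration of what such a disable-cascade might look like, here is a minimal sketch. The `ReadSettings` fields and the function `may_write_condition_cache` are hypothetical names chosen for this example and do not correspond to identifiers in the ClickHouse source; the only point is that sampling becomes one more condition that turns cache writes off, next to the FINAL and bucket_id checks.

```python
from dataclasses import dataclass


@dataclass
class ReadSettings:
    # All field names are assumptions for illustration.
    cache_enabled: bool = True
    is_final: bool = False
    has_bucket_id: bool = False
    uses_sampling: bool = False


def may_write_condition_cache(s: ReadSettings) -> bool:
    """Return True only if no guard in the cascade disables cache writes."""
    if not s.cache_enabled:
        return False
    if s.is_final:
        return False
    if s.has_bucket_id:
        return False
    if s.uses_sampling:  # the new guard: sampled reads never write
        return False
    return True


plain_ok = may_write_condition_cache(ReadSettings())
sampled_ok = may_write_condition_cache(ReadSettings(uses_sampling=True))
final_ok = may_write_condition_cache(ReadSettings(is_final=True))
```

In the actual change the same decision would have to gate both write sites named above (index analysis and the runtime PREWHERE/WHERE filter); the sketch collapses them into a single predicate.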

The consult side is left unchanged: once writes are blocked during
sampling the cache only ever holds full-predicate masks, and a
predicate-false mark stays predicate-false under any sample subset, so
sampled reads still benefit from previously cached full-scan verdicts.
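
The soundness argument for the unchanged consult side can be checked in the same kind of toy model. Again, this is a sketch under assumptions: `read_guarded`, the list-of-lists marks, and the set of mark indexes standing in for SAMPLE are hypothetical. Writes happen only for non-sampled reads, while every read may consult the cache.

```python
from typing import Callable, Dict, List, Optional, Set

Mark = List[int]  # hypothetical: one mark (granule) modelled as a list of rows


def read_guarded(
    marks: List[Mark],
    pred_key: str,
    pred: Callable[[int], bool],
    cache: Dict[str, List[bool]],
    sample_marks: Optional[Set[int]] = None,
) -> List[int]:
    """Toy read path: all reads consult the cache, only full reads write it."""
    cached = cache.get(pred_key)
    sampled = sample_marks is not None
    rows: List[int] = []
    mask: List[bool] = []
    for i, mark in enumerate(marks):
        if cached is not None and not cached[i]:
            mask.append(False)  # predicate-false for the whole mark: safe to skip
            continue
        if sampled and i not in sample_marks:
            mask.append(False)  # excluded by sampling, not by the predicate
            continue
        hits = [r for r in mark if pred(r)]
        mask.append(bool(hits))
        rows.extend(hits)
    if not sampled:
        cache[pred_key] = mask  # only full-predicate masks are ever stored
    return rows


marks = [[1, 2], [3, 4], [5, 6], [7, 8]]
pred = lambda r: r >= 5
cache: Dict[str, List[bool]] = {}

full_rows = read_guarded(marks, "r >= 5", pred, cache)
sampled_cached = read_guarded(marks, "r >= 5", pred, cache, sample_marks={1, 2})
sampled_uncached = read_guarded(marks, "r >= 5", pred, {}, sample_marks={1, 2})
full_again = read_guarded(marks, "r >= 5", pred, cache)
```

Here the sampled read returns the same rows with or without the cache, and it skips mark 1 without evaluating the predicate because the earlier full scan already recorded it as predicate-false. The later full read is unaffected, since the sampled read left the cached mask untouched.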

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Do not cache query condition results for sampled reads
robot-ch-test-poll3 added the pr-cherrypick, do not test, and pr-bugfix labels Jul 2, 2026
@rschu1ze rschu1ze closed this Jul 3, 2026

Labels

  • do not test: disable testing on pull request
  • pr-bugfix: Pull request with bugfix, not backported by default
  • pr-cherrypick: Cherry-pick of merge-commit before backporting. Do not use manually - automated use only!

3 participants