Revert "Apply exact row-positioning for vector search rescoring queries" (#105591) by groeneai · Pull Request #108812 · ClickHouse/ClickHouse · GitHub
Skip to content

Revert "Apply exact row-positioning for vector search rescoring queries" (#105591)#108812

Merged
Algunenano merged 1 commit into
ClickHouse:masterfrom
groeneai:groeneai/revert-105591-vector-rescoring-distance-conflict
Jun 29, 2026
Merged

Revert "Apply exact row-positioning for vector search rescoring queries" (#105591)#108812
Algunenano merged 1 commit into
ClickHouse:masterfrom
groeneai:groeneai/revert-105591-vector-rescoring-distance-conflict

Conversation

@groeneai

@groeneai groeneai commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

Description

Reverts the merge of #105591 (commit d56213874eb), reversing changes made to 6519f67a9eb.

#105591 broke 02354_vector_search_rescoring on master across all build configs (amd_debug, amd_asan_ubsan, amd_tsan, amd_llvm_coverage, arm_binary) with a deterministic Code: 44. ILLEGAL_COLUMN ("The _distance column is an internal virtual column of vector search and cannot be referenced directly in queries"). CI randomized-settings diagnosis: "Runs: 350, Failed: 350, Passed: 0. Test also fails without randomized settings (not a randomization issue)."

Report (amd_debug, parallel): https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=d56213874eb9a8630fb1dd4862525efcf564d30f&name_0=MasterCI&name_1=Stateless%20tests%20%28amd_debug%2C%20parallel%29

Root cause is a base-skew semantic conflict: #105591 was developed and CI-tested on a base that predated the _distance guard, then merged onto a master that already contained it.

The guard from #108423 pre-empts #105591's new code path at useVectorSearch.cpp:236, so #105591's own added queries throw ILLEGAL_COLUMN once both changes are on master. Master bisect confirms: 6519f67a9eb (pre-#105591, guard already present) was green for the existing tests; d56213874eb (#105591) is red across all configs.

Reverting #105591 (the later, feature PR) restores master to green while preserving the customer-incident #1654 guard. The product decision of whether _distance should be user-referenceable is left to the vector-search owners to reconcile and re-land #105591 on top of the #108423 guard.

Local verification on the reverted branch (build id 16173c3a04...):

  • 02354_vector_search_rescoring output matches reference exactly
  • 02354_vector_search_rescoring_and_prewhere passes
  • 02354_vector_search_incident1654 guard still fires ILLEGAL_COLUMN for both queries

Note: an earlier version of this description attributed the _distance guard to #107985; that was incorrect. #107985 ("Add ProfileEvents to observe distributed query plan execution") is unrelated. The guard was added by #108423, as pointed out by @Algunenano below.

Version info

  • Merged into: 26.7.1.244

…es" (ClickHouse#105591)

This reverts the merge of PR ClickHouse#105591 (commit d562138), reversing changes
made to 6519f67.

ClickHouse#105591 broke 02354_vector_search_rescoring on master across all build configs
with a deterministic Code 44 ILLEGAL_COLUMN ("The _distance column is an
internal virtual column of vector search and cannot be referenced directly").

Root cause is a base-skew semantic conflict between two PRs merged 65 minutes
apart, neither of which saw the other in CI:
- ClickHouse#107985 (merged 10:05 UTC) added a hard ILLEGAL_COLUMN guard in both passes
  of useVectorSearch.cpp, making _distance internal-only. It closed customer
  incident clickhouse-core-incidents#1654 and added test
  02354_vector_search_incident1654 asserting the guard.
- ClickHouse#105591 (merged 11:10 UTC) added a feature that deliberately supports
  referencing _distance directly, with test queries and reference output that
  expect it to succeed.

The guard from ClickHouse#107985 pre-empts ClickHouse#105591's new code path, so ClickHouse#105591's own
tests fail on master. Reverting ClickHouse#105591 (the later, feature PR) restores master
to green while preserving the customer-incident ClickHouse#1654 guard. The product
decision of whether _distance should be user-referenceable is left to the
vector-search owners to reconcile and re-land.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

@groeneai

groeneai commented Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

cc @shankar-iyer @rschu1ze: master is currently red on 02354_vector_search_rescoring across all configs (deterministic Code: 44 ILLEGAL_COLUMN). It's a base-skew conflict: #108423 (merged 2026-06-28) made _distance internal-only to close incident #1654, then #105591 (merged 2026-06-29 11:10 UTC; branch tip last updated 2026-06-26) added a feature + tests that reference _distance directly and expect it to work. #105591's CI ran on a guard-free base and passed; the guard pre-empts the feature, so #105591's own tests fail once both are on master. This reverts #105591 to restore green while preserving the #1654 guard; the feature can be re-landed on top of the guard. Could you confirm the revert is the right call vs. another reconciliation?

[Edit: corrected the guard PR from #107985 to #108423 per Algunenano below; #107985 ("Add ProfileEvents to observe distributed query plan execution") is unrelated.]

@Algunenano Algunenano self-assigned this Jun 29, 2026
@Algunenano Algunenano added the can be tested Allows running workflows for external contributors label Jun 29, 2026
@clickhouse-gh

clickhouse-gh Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [e531152]


AI Review

Summary

This PR cleanly reverts #105591, restoring all changed source, docs, and stateless test files to the pre-feature parent 6519f67a9eb while preserving the _distance direct-reference guard added by #107985. I did not find any new correctness, compatibility, or test-coverage findings in the current diff.

Final Verdict

Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-ci label Jun 29, 2026
@Algunenano

Copy link
Copy Markdown
Member

@groeneai The PR that added the guards was #108423, not #107985 (which is unrelated)

@Algunenano Algunenano left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified locally

@Algunenano Algunenano added this pull request to the merge queue Jun 29, 2026
Merged via the queue into ClickHouse:master with commit 1e41bf3 Jun 29, 2026
46 of 70 checks passed
@groeneai

Copy link
Copy Markdown
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants