Add LOOKUP INDEX support for MergeTree by dunda18 · Pull Request #101401 · ClickHouse/ClickHouse · GitHub
Skip to content

Add LOOKUP INDEX support for MergeTree#101401

Open
dunda18 wants to merge 68 commits into
ClickHouse:masterfrom
dunda18:set-and-join-tables-as-indices
Open

Add LOOKUP INDEX support for MergeTree#101401
dunda18 wants to merge 68 commits into
ClickHouse:masterfrom
dunda18:set-and-join-tables-as-indices

Conversation

@dunda18

@dunda18 dunda18 commented Mar 31, 2026

Copy link
Copy Markdown

This change adds table-wide LOOKUP INDEX support for MergeTree tables and implements the Set and Join tables as indices task from #87836. Unlike secondary data-skipping indices, these indices build reusable in-memory lookup structures for repeated key-value style queries against the right-hand side table.

It introduces:

  • LOOKUP INDEX ... TYPE table_set for IN table lookups and trivial key-only subqueries
  • LOOKUP INDEX ... TYPE table_join for direct-compatible equality joins with join_algorithm = 'direct'
  • parser, metadata, ALTER, and ReplicatedMergeTree support
  • planner integration, runtime cache invalidation for both data and metadata changes, stateless tests, performance benchmarks, and minimal user documentation

Current limitations:

  • lookup keys must be plain columns
  • table_join requires non-nullable key columns
  • table_join is only used for direct-compatible equality joins
  • table_join is not always faster than hash; the included benchmarks show win, near-parity, and lose cases depending on the relative sizes of the left and right tables

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Add LOOKUP INDEX support for MergeTree tables with table_set and table_join table-wide lookup indexes.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Note

High Risk
High risk because it adds a new experimental index type that touches SQL parsing/formatting, ALTER/replication metadata, MergeTree read paths and caching, and join/set planner optimizations (including direct join behavior), which can affect query correctness and performance.

Overview
Adds experimental LOOKUP INDEX support to MergeTree-family tables (gated by allow_experimental_lookup_index) with two new index types: table_set to accelerate IN table/trivial key-only subqueries and table_join to accelerate direct-compatible equality joins.

Extends SQL/metadata plumbing end-to-end: new AST fields and parser/formatter support for LOOKUP INDEX (including ALTER TABLE ... ADD/DROP LOOKUP INDEX), metadata storage and replication via ReplicatedMergeTreeTableMetadata, and validation to prevent misuse (e.g., no GRANULARITY, no expressions/args, and non-nullable keys for table_join).

Implements runtime lookup caches in MergeTreeData (with invalidation on metadata/part changes) and integrates planner fast paths for sets (CollectSets) and joins (PlannerJoins/DirectKeyValueJoin) while preserving correctness by disabling the optimization when row policies or unsupported modifiers/parallel replicas are in play. Adds documentation plus new stateless/performance tests for both lookup modes.

Reviewed by Cursor Bugbot for commit 6d94b00. Bugbot is set up for automated code reviews on this repo. Configure here.

@UnamedRus

Copy link
Copy Markdown
Contributor

@dunda18

dunda18 commented Mar 31, 2026

Copy link
Copy Markdown
Author

No, ANY/ALL is controlled by the JOIN strictness in the query itself, not by a separate LOOKUP INDEX option.

@CurtizJ CurtizJ added the can be tested Allows running workflows for external contributors label Mar 31, 2026
@clickhouse-gh

clickhouse-gh Bot commented Mar 31, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [eca1052]

Summary:

job_name test_name status info comment
Performance Comparison (arm_release, master_head, 4/6) FAIL Performance dashboard
Check Results FAIL
Performance Comparison (arm_release, master_head, 5/6) FAIL Performance dashboard
decimal_aggregates #17::old FAIL query history
decimal_aggregates #17::new FAIL query history
encrypt_decrypt_empty_string #0::old FAIL query history
encrypt_decrypt_empty_string #0::new FAIL query history
encrypt_decrypt_empty_string #10::old FAIL query history
encrypt_decrypt_empty_string #10::new FAIL query history

AI Review

Summary

This PR adds experimental table-wide LOOKUP INDEX support for MergeTree, covering table_set membership lookups and table_join direct joins. The current implementation still has one unresolved access-policy issue in the table_join fast path: it can read and cache physical columns outside the query's checked column set.

Findings
  • ⚠️ Majors
    • [src/Planner/PlannerJoins.cpp:1139] table_join lookup cache construction can read and cache physical columns the user was not granted SELECT on. The normal table-expression planning checks only getSelectedColumnsNames, but MergeTreeData::tryGetLookupJoin reads every physical column via getAllPhysicalColumnsForLookupJoin, so a join that only needs the key and one payload column can still populate a shared entity containing protected columns.
    • Suggested fix: either build the entity from the required right-side columns plus lookup keys, or perform SELECT access validation on the full lookup_columns set before tryGetLookupJoin can read and cache them.
Tests
  • ⚠️ Add a focused RBAC stateless test for table_join: grant a user SELECT only on the join key and one payload column, deny another physical column, enable allow_experimental_lookup_index, and verify the lookup path is not allowed to read or cache the denied column.
Final Verdict

Status: ⚠️ Request changes

Minimum required action: fix the table_join column-access gap and cover it with the RBAC regression above.

@clickhouse-gh clickhouse-gh Bot added the pr-feature Pull request with new product feature label Mar 31, 2026
@CLAassistant

CLAassistant commented Mar 31, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

Comment thread src/Storages/MergeTree/MergeTreeData.cpp
Comment thread src/Storages/MergeTree/MergeTreeData.cpp
Comment thread src/Planner/CollectSets.cpp Outdated
Comment thread src/Planner/PlannerJoinTree.cpp Outdated
Comment thread src/Planner/PlannerJoins.cpp
Comment thread src/Interpreters/InterpreterCreateQuery.cpp
Comment thread src/Storages/MergeTree/MergeTreeData.cpp

@clickhouse-gh clickhouse-gh Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitting pending review to unblock inline comments

Comment thread src/Planner/CollectSets.cpp
Comment thread src/Planner/PlannerJoinTree.cpp Outdated
Comment thread src/Interpreters/MergeTreeTableJoinEntity.cpp
alexey-milovidov and others added 3 commits June 18, 2026 19:57
Merging `master` pulled in commit `5188f66f7bf` ("Reject CREATE INDEX with
GRANULARITY 0"), which made `IndexDescription::getIndexFromAST` reject any
index declaration with `granularity == 0` and made
`ASTIndexDeclaration::formatImpl` always emit the `GRANULARITY` clause.

Both changes break `LOOKUP INDEX`. A lookup index is a table-wide in-memory
structure with no granularity, so the parser stores granularity 0 and rejects
an explicit `GRANULARITY` clause. The new check therefore rejected every
`CREATE TABLE ... LOOKUP INDEX ...` with `Index GRANULARITY must be a positive
integer (BAD_ARGUMENTS)`, and the unconditional format would have emitted
`GRANULARITY 0` for a lookup index, breaking the format/parse round-trip.

Fixes:
- `getIndexFromAST` skips the granularity check for lookup indexes
  (`is_lookup_index`); it still guards granule-based skip indexes.
- `formatImpl` emits `GRANULARITY` only for non-lookup indexes.
- `buildLookupIndices` marks the declaration as a lookup index (and resets
  granularity) before calling `getIndexFromAST`, so the exemption also applies
  to the metadata re-parse path (`parseLookupIndices`), which does not set the
  flag in the parser.

This restored all `LOOKUP INDEX` stateless tests (`04063`, `04064`, `04068`,
`04069`, `04244`, `04312`) that failed in the Fast test, while keeping master's
`04326_reject_create_index_granularity_zero` passing.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101401&sha=c2f4395d6e9707ee4b48418b750ed39c4f84aaac&name_0=PR&name_1=Fast%20test

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A `table_join` lookup index serves an `ANY` join through a key-value direct
join (`MergeTreeTableJoinEntity` via `DirectKeyValueJoin`), which always returns
the first stored right row for a key. `join_any_take_last_row = 1` instead asks
an `ANY` join to keep the last duplicate-key right row, which only the regular
`HashJoin` honors. Without a guard, enabling a lookup index would silently
change `ANY` join results for duplicate keys when that setting is on.

Decline the lookup fast path in `tryExtractLookupJoinRightKeys` for `ANY`
strictness when `join_any_take_last_row` is enabled. Since
`tryMakeDirectJoinWithMergeTree` already declines `ANY`, the join then falls
back to `HashJoin`, which keeps the configured semantics. The setting has no
effect on `ALL`/`SEMI`/`ANTI` joins, so the lookup index is still used there.

Adds `04360_lookup_index_table_join_any_take_last_row`, which asserts via
`EXPLAIN` that the lookup (`DirectKeyValueJoin`) is used for `ANY` joins without
the setting and declined with it (while `ALL` keeps it), and that enabling the
index does not change `ANY` join results.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Planner/PlannerJoins.cpp
`tryGetLookupJoinStorage` could prepare a `table_join` lookup direct join
even when the right `MergeTree` table expression reads `ALIAS` columns. The
lookup entity `MergeTreeTableJoinEntity` is built from physical columns only
(`getAllPhysicalColumnsForLookupJoin`), but `DirectKeyValueJoin` later requests
the full right-side header, including the computed alias columns. For a query
like `SELECT d.val_alias FROM f JOIN d USING (id)` where
`val_alias ALIAS concat(val, '!')`, the optimized path asked
`MergeTreeTableJoinEntity::getSampleBlock` for `val_alias`, which is not in the
cache, raising `Cannot find column 'val_alias' in table lookup cache`; the
regular plan would have computed the alias.

Fall back to the regular join path when the right table expression reads alias
columns, so the alias is computed. Add `04401_lookup_index_alias_column` and a
documentation note.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Picked up the PR to address the open AI review finding.

Fix (192b203410a)

The single ⚠️ Major / unresolved thread was the table_join lookup fast path being selected when the right table expression reads ALIAS columns. The lookup entity MergeTreeTableJoinEntity is built from physical columns only (getAllPhysicalColumnsForLookupJoin), so DirectKeyValueJoin requesting a computed alias from it raised Cannot find column ... in table lookup cache instead of falling back.

tryGetLookupJoinStorage now falls back to the regular join path when the right table expression reads alias columns (getAliasColumnExpressions() non-empty). Added 04401_lookup_index_alias_column (alias select computes concat(val, '!'); mixed physical+alias select; EXPLAIN confirms no DirectKeyValueJoin when an alias is read but the fast path is kept for physical-only selects) plus a table_join limitations note in lookupindexes.md.

Verification

Built locally and ran all lookup-index stateless tests against a server — 04063, 04064, 04068, 04069, 04244, 04312, 04360, and the new 04401 all pass. The thread is resolved.

CI

The red checks on the reviewed commit are all unrelated master-wide flakes, confirmed against CIDB:

  • Stress (amd_tsan): Cannot start clickhouse-server (on master and 16+ PRs on the same day) and Unexpected exception in refresh scheduling (Logical error: 'Unexpected exception in refresh scheduling' (STID: 2508-3e7b) #106737, 129 failures / 100 PRs in 14 days);
  • AST fuzzer (arm_asan_ubsan): Block structure mismatch in A stream in UnionStep::updateOutputHeader (occurs on master PR=0 and ~19 PRs/day);
  • Performance Comparison (arm_release): unstable/slower noise across unrelated tests (number_formatting_formats, point_in_polygon_*, replaceRegexp_*, …); the PR's own lookup_index_* perf tests are not among the failures.

Not done (left for you)

  • I did not re-merge master (merge base is ~2 days old, well under a week, and the reds are flakes; the branch is MERGEABLE). Re-merging the full 659-commit delta into the heavily-churned join planner code seemed riskier than helpful.
  • The 2026-06-18 master merge left several stateless tests with duplicate number prefixes (04063, 04064, 04068, 04069, 04244, 04312 each now collide with unrelated master tests at the same number). This does not break CI (tests run by full name) but should be renumbered as part of the final master merge before merge.
  • No self-approve / merge.

Comment thread src/Planner/PlannerJoins.cpp
Comment thread src/Storages/MergeTree/MergeTreeData.cpp
alexey-milovidov and others added 4 commits June 22, 2026 17:20
`getInMemoryMetadataPtr(ContextPtr, bool)` now returns a
`StorageMetadataHandle` whose rvalue `operator->` is explicitly deleted to
prevent dereferencing a temporary handle. `hasLookupSetIndex` and
`hasLookupJoinIndex` called `->getLookupIndices()` directly on the returned
temporary, selecting the deleted overload and failing to compile. Bind the
handle to a named local before dereferencing it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ze limits

Addresses two AI review findings on `tryGetLookupJoinStorage`:

- The lookup-join entity (`MergeTreeTableJoinEntity`) is built from the
  top-level physical columns only (`getAllPhysicalColumnsForLookupJoin`). The
  previous guard only rejected `ALIAS` payload columns, but the analyzer can
  still map a right-side column identifier to a subcolumn (e.g. `m.keys`) or a
  virtual column (e.g. `_part`). Those are neither alias expressions nor present
  in the entity, so the fast path built an entity and `DirectKeyValueJoin` later
  failed with `Cannot find column ... in table lookup cache` instead of falling
  back to the regular plan that materializes the column. Now fall back unless
  every mapped right-side column is served by the lookup entity.

- The lookup build path never applied the per-query right-side join size limits
  (`max_rows_in_join` / `max_bytes_in_join`) that the regular join build
  enforces: a `table_join` lookup index reads and caches the whole right table,
  and later cache hits bypass the limits entirely. Decline the fast path when
  either limit is active so the regular join enforces them.

Adds `04408_lookup_index_table_join_subcolumn` and
`04409_lookup_index_table_join_size_limits`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The lookup fast path is disabled when parallel replicas can run on the
initiator (`!canUseParallelReplicasOnInitiator()`), so when the test harness
randomly enabled `enable_parallel_replicas` the physical-only join no longer
used `DirectKeyValueJoin` and the test failed intermittently. Pin
`enable_parallel_replicas = 0`, `max_parallel_replicas = 1` and
`serialize_query_plan = 0`, matching the other lookup index tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Picked up the PR to address the two open AI review findings on the table_join lookup fast path, plus a build break from the latest master merge.

Build fix (c679f483cc4)

getInMemoryMetadataPtr(ContextPtr, bool) now returns a StorageMetadataHandle whose rvalue operator-> is deleted to prevent dereferencing a temporary. hasLookupSetIndex / hasLookupJoinIndex called ->getLookupIndices() directly on the returned temporary, selecting the deleted overload and failing to compile. Bound the handle to a named local first.

Review fixes (68a76542eb2)

  • Subcolumns / virtual columns — the previous fallback only rejected ALIAS payload columns, but the analyzer can map a right-side identifier to a subcolumn (e.g. m.keys) or a virtual column (e.g. _part), which the physical-only lookup entity cannot serve, so DirectKeyValueJoin raised Cannot find column ... in table lookup cache. tryGetLookupJoinStorage now falls back unless every mapped right-side column is in the lookup entity column set (getColumns(AllPhysical), the same set getAllPhysicalColumnsForLookupJoin uses).
  • Join size limits — the lookup build path never honored max_rows_in_join / max_bytes_in_join. The fast path now declines when either limit is active so the regular join enforces them.

New tests 04408_lookup_index_table_join_subcolumn and 04409_lookup_index_table_join_size_limits.

Flaky test fix (18a4f6abb07)

04401_lookup_index_alias_column did not pin parallel replicas, so when the harness randomly enabled enable_parallel_replicas the lookup fast path was disabled (!canUseParallelReplicasOnInitiator()) and the physical-only assertion failed intermittently. Pinned enable_parallel_replicas = 0 / max_parallel_replicas = 1 / serialize_query_plan = 0, matching the other lookup index tests.

Verification

Built locally and ran the full lookup index suite (04063, 04064, 04068, 04069, 04244, 04312, 04360, 04401, 04408, 04409) against a server — all pass, repeatedly under the randomized settings. Both review threads are resolved.

Comment thread src/Storages/MergeTree/MergeTreeIndexTableLookup.cpp
`validateTableLookupKeys` accepted every `ASTIdentifier` as a plain lookup
key, but an unquoted subcolumn such as `m.keys` is also an `ASTIdentifier`
with `compound() == true`. Since `IndexDescription::initExpressionInfo`
analyzes with subcolumns enabled, `LOOKUP INDEX idx (m.keys) TYPE table_join`
was accepted even though the lookup entity is built only from top-level
physical columns and cannot serve a subcolumn, leaving an accepted but
unusable index. Reject such compound identifiers at validation time. A quoted
top-level column name containing a dot stays a single-part identifier
(`compound()` is false) and is still accepted.

Add `04411_lookup_index_compound_key_rejected` covering `CREATE` and
`ALTER ... ADD LOOKUP INDEX` for both `table_set` and `table_join`, plus the
quoted-dotted-name acceptance case.
…nfig

The `table_join` lookup fast path declines whenever `max_rows_in_join` or
`max_bytes_in_join` is non-zero (the build does not enforce these limits, so it
falls back to a regular join that does). The standard stateless config
(`tests/config/users.d/limits.yaml`) sets both to a high ceiling (`10G`) "so it
will not limit anything, but we test code around it", which made the fast path
decline universally under Fast test and the lookup tests' `DirectKeyValueJoin`
assertions failed (expected `1`, got `0`).

Reset `max_rows_in_join` / `max_bytes_in_join` to `0` at session level in the
tests that assert the fast path fires, so they exercise it regardless of the
profile ceiling. `04409_lookup_index_table_join_size_limits` keeps its
per-query non-zero limits to assert the decline path; those override the
session reset.
…istory

`master` advanced the development version to `26.7` and `02995_new_settings_history`
now requires every setting that is new relative to the `26.6.1` baseline to be
recorded in `SettingsChangesHistory.cpp` for a version greater than `26.6`. Move
`allow_experimental_lookup_index` from the `26.6` section to the `26.7` section
so the check passes.
@alexey-milovidov

Copy link
Copy Markdown
Member

Pushed 18a4f6abb07..96ae2957a0f (3 commits) fixing the Fast-test failures from the last manual approve run and the remaining validation thread.

What failed and why

The previous run went red on six table_join lookup tests (04064, 04068, 04360, 04401, 04408, 04409) — all result differs with reference, DirectKeyValueJoin expected 1, got 0 — plus 02995_new_settings_history.

Root cause of the six: the earlier fix that declines the table_join lookup fast path when max_rows_in_join != 0 || max_bytes_in_join != 0. The standard stateless config tests/config/users.d/limits.yaml sets both to 10G ("so it will not limit anything, but we test code around it"), so under the Fast-test config the fast path declined universally and DirectKeyValueJoin was absent everywhere.

Commits

  1. Reject compound subcolumn keys in LOOKUP INDEX validationvalidateTableLookupKeys now rejects identifier->compound(), so LOOKUP INDEX idx (m.keys) TYPE table_join is rejected instead of being accepted as an unusable index built from top-level physical columns. A quoted dotted name (`a.b`) is a single-part identifier and is still accepted. New test 04411_lookup_index_compound_key_rejected. (Resolves the open MergeTreeIndexTableLookup.cpp thread.)
  2. Reset join size limits in the table_join lookup testsSET max_rows_in_join = 0, max_bytes_in_join = 0 at session level in the tests that assert the fast path fires, so they exercise it regardless of the profile ceiling. 04409 keeps its per-query non-zero limits for the decline assertions.
  3. Record allow_experimental_lookup_index under 26.7 in SettingsChangesHistory.cppmaster advanced to 26.7, and 02995 requires new settings (vs the 26.6.1 baseline) to be recorded for a version > 26.6.

Verification (local, real server)

Reproduced the Fast-test condition by mirroring limits.yaml (all four max_rows/bytes_in_join/set = 10G) in the default profile and adding a parallel_replicas cluster:

  • Full 11-test lookup suite passes with randomization off and 5/5 with randomization on.
  • 02995 verified manually (no-cpu-aarch64, can't run via the harness on ARM): allow_experimental_lookup_index is absent from the 26.6.1 baseline and recorded for 26.7, so the new-setting check is empty.

Design question to confirm (not changed here)

The decline-on-non-zero behavior means the table_join fast path is disabled whenever any join size ceiling is set, including the standard 10G test profile and any production profile that sets max_rows_in_join / max_bytes_in_join — the same applies to table_set with max_rows_in_set / max_bytes_in_set. That is fail-safe (it never silently ignores a user-set limit), but it also means the feature is off under most non-trivial profiles. The alternative is to enforce the limit during the lookup build and keep the fast path enabled. I've kept the current fail-safe behavior; whether to switch to enforce-in-build is a design call for you to make.

Comment thread src/Planner/CollectSets.cpp Outdated
The `IN table` / `IN (SELECT key FROM table)` lookup fast path in
`tryGetLookupSetFromTableExpression` builds the `Set` directly from storage
(`MergeTreeData::tryGetLookupSet`) and registers it with `sets.addFromStorage`,
so the right-hand table is never planned. The regular path plans that table
(or its subquery) and performs a `SELECT` access check on the columns it reads
(`checkAccessRights` / `checkAccessRightsForSubquery` in `PlannerJoinTree`),
which the fast path skipped. A user holding `SELECT` only on the left table
could therefore probe the keys of a dimension table it has no access to through
`IN table`, but only when the fast path actually fires (no active
`max_rows_in_set` / `max_bytes_in_set` limit).

Perform the same `SELECT` access check on the lookup key columns before using
the lookup `Set`, in both the direct `IN table` and the trivial
`IN (SELECT key FROM table)` branches, matching the regular path's behaviour.

Add `04412_lookup_index_table_set_access` covering both forms with and without
access (the test resets `max_rows_in_set` / `max_bytes_in_set` to 0 so the fast
path fires under the standard config).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Picked up the PR after the manual approve CI run on 96ae2957a0f.

CI run result

The Fast-test / stateless suite is green — all lookup-index tests pass, confirming the size-limit and validation fixes from the previous run landed correctly. The remaining red checks are all unrelated master-wide flakes (confirmed against CIDB):

  • Stress (amd_tsan)Hung check failed, possible deadlock found (Hung check failed, possible deadlock found #107941; the log tail is an unrelated 00534_functions_bad_arguments10.sh).
  • Stress (arm_ubsan)Coordination mode mismatch for stream ... (STID 6117-68e9), a recurring parallel-replicas stress flake that hits master itself and ~30 unrelated PRs over the last month.
  • Performance Comparison (arm_release)sort_patterns unstable / noise, unrelated to the lookup index.
  • CH Inc sync, Mergeable Check, PR — internal sync / meta checks.

Fix this run (0fde27aa63a)

The CI re-review raised a new SELECT-access finding on the table_set lookup fast path, and it is real. tryGetLookupSetFromTableExpression builds the Set directly from storage and registers it with sets.addFromStorage, so the right-hand table is never planned and the regular path's SELECT access check (checkAccessRights / checkAccessRightsForSubquery in PlannerJoinTree) was skipped. A user holding SELECT only on the left table could probe a dimension table's keys through IN table — but only when the fast path actually fires, i.e. with no active max_rows_in_set / max_bytes_in_set limit (the standard config's high ceiling makes the fast path decline, which is why the existing tests did not surface it).

The fast path now performs the same SELECT access check on the lookup key columns before using the Set, in both the direct IN table and the trivial IN (SELECT key FROM table) branches.

Verification (server with RBAC, key columns granted vs. not): with the fast path firing (max_rows_in_set = 0), an unauthorized user is now denied — the denial originates from CollectSets.cpp itself, confirming the fast path enforces it — while an authorized user gets correct results; without the fast path the behaviour is unchanged. New test 04412_lookup_index_table_set_access covers both IN forms with and without access; existing 04063 / 04069 / 04244 / 04312 still pass. The review thread is resolved.

Did not re-merge master: the branch is ~2 days behind and the red checks are flakes, not staleness.

Comment thread docs/en/engines/table-engines/mergetree-family/lookupindexes.md
alexey-milovidov and others added 2 commits June 28, 2026 10:38
The `table_set` lookup fast path in `tryGetLookupSetFromTableExpression`
performed the `SELECT` access check on the lookup key columns *after*
calling `storage->tryGetLookupSet`, which reads the right-hand table and
may populate the shared lookup `Set` cache. An unauthorized user was
denied the final result, but could still force a full read and cache
build of a table they cannot `SELECT`.

Move the `checkAccess(AccessType::SELECT, storage_id, key_names)` call
before `tryGetLookupSet` in both the direct `IN table` branch and the
trivial `IN (SELECT key FROM table)` subquery branch, so access is
enforced before any read happens. Access enforcement is already covered
by `04412_lookup_index_table_set_access`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The `Using table_set` example created `dim` with three columns
(`id`, `subid`, `value`) but used `(id, subid) IN dim`. The direct
`IN table` form builds the right-hand set from the table's columns, so
this both fails with a column-count mismatch against the two-element
left-hand tuple and does not match the two-column `LOOKUP INDEX idx_set
(id, subid)`, so it never exercises the lookup index.

Use a key-only `dim_keys` table for the direct `IN table` example so it
is executable and actually uses the index, and show the trivial-subquery
form `IN (SELECT id, subid FROM dim)` for dimension tables that also
store non-key columns.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Addressed the two ⚠️ Request changes required actions from the AI review on f07b5390c56:

  1. table_set access check ordering (aa73fffc0cd): in tryGetLookupSetFromTableExpression, the SELECT access check on the lookup key columns now runs before storage->tryGetLookupSet(...) in both branches (direct IN table and the trivial IN (SELECT key FROM table) subquery). Previously the check ran after the read, so an unauthorized user — while denied the final result — could still force a read and lookup-Set cache build of a table they cannot SELECT. The check is cheap and idempotent: when no matching table_set index exists, tryGetLookupSet returns early anyway and the regular IN plan performs the same check, so there is no behavior change on the fallback path. Access enforcement stays covered by 04412_lookup_index_table_set_access.

  2. Docs example (eca1052f527): the Using table_set example built dim with id, subid, value but used (id, subid) IN dim. The direct IN table form builds the right-hand set from all of the table's columns, so that both failed with a column-count mismatch and never matched the two-column LOOKUP INDEX idx_set (id, subid). The direct example now uses a key-only dim_keys table so it is executable and actually uses the index; the trivial-subquery form IN (SELECT id, subid FROM dim) is shown for dimension tables that also store non-key columns.

CollectSets.cpp was verified to compile cleanly against the current (post-merge) headers. Both review threads are resolved.

The remaining red checks are unrelated master-wide flakes, not caused by this PR (which is SELECT-path only): Stress arm_asan_ubsan Hung check failed (#107941); Upgrade check Error message in clickhouse-server.log = a leftover background mutation Cannot parse string 'x' as UInt64 in MutatePlainMergeTreeTask (unrelated to lookup indexes); and Performance Comparison entries that the perf report classifies as unstable/noise (No significant performance changes detected). All Fast/stateless tests, including the lookup tests, pass.

storage_key_names.push_back(table_column_name_it->second);
}

result.storage_key_value = storage->tryGetLookupJoin(storage_key_names, planner_context->getQueryContext());

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lookup join path can read and cache columns the query did not have SELECT access to. prepareBuildQueryPlanForTableExpression only checks table_expression_data.getSelectedColumnsNames(), but MergeTreeData::tryGetLookupJoin rebuilds the entity from getAllPhysicalColumnsForLookupJoin, i.e. every physical column. A user granted only the join key and requested payload columns can therefore run a join that calls this line and forces a read and cache build containing protected physical columns that the normal join plan would never read.

Please either build the table_join entity from the columns required by this join plus keys, or check SELECT on the full lookup_columns set before tryGetLookupJoin can read and cache them.

@clickhouse-gh

clickhouse-gh Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.30% 85.30% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.50% 77.60% +0.10%

Changed lines: Changed C/C++ lines covered by tests: 941/1112 (84.62%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 44 line(s) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors manual approve Manual approve required to run CI pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants