Fast single-user look-up in system.users by alistairjevans · Pull Request #96699 · ClickHouse/ClickHouse · GitHub
Skip to content

Fast single-user look-up in system.users#96699

Merged
alexey-milovidov merged 33 commits into
ClickHouse:masterfrom
alistairjevans:alistairjevans/users-predicate-pushdown
Jun 25, 2026
Merged

Fast single-user look-up in system.users#96699
alexey-milovidov merged 33 commits into
ClickHouse:masterfrom
alistairjevans:alistairjevans/users-predicate-pushdown

Conversation

@alistairjevans

@alistairjevans alistairjevans commented Feb 11, 2026

Copy link
Copy Markdown
Contributor

Make look-up of a single user O(1) when using system.users with a literal name to do an existence check.

  • Make existence checks (e.g. SELECT 1 FROM system.users WHERE name = 'user1') fast on servers with 1000s of users
  • Without relying on clickhouse errors that generate server logs, e.g. SHOW GRANTS FOR, which is O(1) but generates a lot of log noise when the user doesn't exist.

This change looks for the literal name predicate in the incoming query, and if it finds one, short-circuits the "load the entire list of all users", and fetches just the user we care about.

Changelog category:

  • Performance Improvement

Changelog entry:

Improves the performance of fetching a single user record from system.users when the server has many users.


Note

Medium Risk
Medium risk because it changes system.users filtering behavior by parsing ActionsDAG predicates (including IN sets) and could miss/incorrectly handle edge predicate forms, though it falls back to full scans when extraction fails.

Overview
system.users now detects simple name = 'literal' and name IN (...) predicates and performs direct AccessControl lookups instead of iterating findAll<User>(), improving existence-check and small-name-filter query performance.

This introduces predicate parsing helpers (handling AND, aliases, and explicit IN sets), wires fillData() to receive the predicate, and adds getFilterSampleBlock() for the name column; a new stateless test covers fast-path and fallback behavior.

Reviewed by Cursor Bugbot for commit f750328. Bugbot is set up for automated code reviews on this repo. Configure here.

Version info

  • Merged into: 26.7.1.77

@qoega qoega added the can be tested Allows running workflows for external contributors label Feb 11, 2026
@clickhouse-gh

clickhouse-gh Bot commented Feb 11, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-performance Pull request with some performance improvements label Feb 11, 2026
@qoega

qoega commented Feb 11, 2026

Copy link
Copy Markdown
Member

Seems nice as you don't need to store more data for this optimization. Everything is already there.

@alistairjevans

alistairjevans commented Feb 25, 2026

Copy link
Copy Markdown
Contributor Author

Any blockers on merging this, @qoega ?

@alexey-milovidov alexey-milovidov self-assigned this Mar 22, 2026

/// If the predicate is a simple `name = 'literal'`, extract the literal string.
/// Returns std::nullopt for any other predicate shape — the caller falls back to the full scan.
std::optional<String> tryExtractNameFromPredicate(const ActionsDAG::Node * predicate)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is a good change!

I also remember that we have a generic way to do this using VirtualColumnUtils. I will search for it...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked — VirtualColumnUtils::filterBlockWithPredicate works the other way around: it takes a full list and filters it down. There is no generic utility to extract specific values from a predicate without having the full list first. StorageSystemZooKeeper has its own bespoke extractPathImpl for the same purpose. So the author's approach here (custom tryExtractNameFromPredicate) is the right pattern for an O(1) lookup.

alexey-milovidov and others added 3 commits March 22, 2026 11:47
Use the same approach as `StorageSystemZooKeeper::extractPathImpl`:
handle `name = 'literal'`, `name IN ('a', 'b')`, and `AND` conjunctions
for O(1) user lookups, instead of only bare equality.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
It was needed for `VirtualColumnUtils` filtering, which is not used here —
predicate extraction handles everything directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@alexey-milovidov alexey-milovidov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread src/Storages/System/StorageSystemUsers.h
alexey-milovidov and others added 2 commits March 25, 2026 01:32
…llData`

Without this override, `IStorageSystemOneBlock::applyFilters` never builds
the filter DAG for this table, so `predicate` is always nullptr and the
fast O(1) lookup path is never reached.

Also remove fragile `query_log` read_rows assertions from the test — the
`read_rows` count for system table queries depends on too many factors
(randomized settings, parallel replicas, etc.) to be reliably checked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This is a deterministic way to assert the optimization works: the query
will fail if more rows are read than expected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread src/Storages/System/StorageSystemUsers.cpp
Use clickhouse-local for a fully isolated environment with no pre-existing
users (except default). This removes the need for no-parallel tag and
makes the test deterministic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread src/Storages/System/StorageSystemUsers.cpp Outdated
@alistairjevans

Copy link
Copy Markdown
Contributor Author

Thanks for pushing this @alexey-milovidov ! 🙏

@alistairjevans

alistairjevans commented Apr 15, 2026

Copy link
Copy Markdown
Contributor Author

@alexey-milovidov , fuzzer came out with:

`[src/Storages/System/StorageSystemUsers.cpp:64-68]` `extractNamesFromPredicateImpl` treats `and` as “union of supported children” instead of all-or-nothing extraction. 
For predicates like `name = 'alice' AND 0` (or contradictory conjuncts), the fast path
may still fetch users, and with strict limits like `max_rows_to_read`, 
this can raise an exception although full predicate evaluation would produce zero rows without reading user rows.

Suggested fix: make `AND` extraction strict; if any conjunct is not
a supported `name = const` / `name IN const-set` form, 
abort extraction and fall back to full scan (`std::nullopt`).

Do you concur, that we should be more strict about the optimization?

Required because the gh OAuth token used to push to the fork lacks
the `workflow` scope to update `.github/workflows/*` files brought
in by the master merge.
@clickhouse-gh clickhouse-gh Bot added the manual approve Manual approve required to run CI label May 20, 2026
Workflow files in this branch were stale or had been deleted as a
workaround for GitHub's workflow-scope check. Restoring them to match
`master` so the PR diff no longer contains those unrelated changes.
Comment thread src/Storages/System/StorageSystemUsers.cpp Outdated
@alexey-milovidov alexey-milovidov added this pull request to the merge queue Jun 8, 2026
@alexey-milovidov alexey-milovidov removed this pull request from the merge queue due to a manual request Jun 8, 2026
@alexey-milovidov

Copy link
Copy Markdown
Member

The review is on point. Let's first construct a test to demonstrate the incorrect behavior, then find a way to fix it.

@alexey-milovidov alexey-milovidov removed the manual approve Manual approve required to run CI label Jun 8, 2026
`isNameNode` accepted any node whose `result_name` is `name` after
stripping aliases. Tighten it to also require
`ActionsDAG::ActionType::INPUT`, so the fast path keys off the genuine
`name` column from `getFilterSampleBlock` rather than anything that merely
shares the name. This is a precision/robustness hardening: the filter
sample block exposes `name` as an `INPUT` node, so the legitimate fast
path is unaffected, and predicates over a column named `name` that is not
the input fall back to the full scan.

Also add a regression query exercising a constant alias named `name` that
shadows the column (`SELECT 'alice' AS name FROM system.users WHERE
name = 'alice'`), which must return all users like the full scan.

Addresses a review comment by clickhouse-gh.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Pushed 9ac59f182ee addressing the remaining clickhouse-gh thread on isNameNode:

  • isNameNode now requires ActionsDAG::ActionType::INPUT in addition to result_name == "name", so the fast path only triggers on the genuine name column from getFilterSampleBlock.
  • Added a regression query for a constant alias shadowing the column (SELECT 'alice' AS name FROM system.users WHERE name = 'alice').

Honest caveat: I couldn't reproduce an actual behavior divergence for that query against the pre-change code (the ALIAS-strip already lands on a constant whose result_name isn't name, so it fell back correctly). The change is a precision/robustness hardening with no impact on the legitimate fast path — verified by building both with and without it; the new query returns all users in both cases, and the existing stateless + integration tests still pass. CI was already green; branch trial-merges cleanly with master.

Comment thread src/Storages/System/StorageSystemUsers.cpp
The fast path for `SELECT ... FROM system.users WHERE name = 'x'` /
`name IN (...)` looks up each requested name in every access storage. For a
single user or a short `IN` list this is far cheaper than `findAll<User>`, but
a very large requested set (a huge literal list or a subquery result) could
perform more lookups than a single full scan, making the "fast" path slower
than the fallback.

Cap the fast path at a deliberate limit of names; above it we fall back to the
full scan, which is exactly the behavior before this optimization, so the fast
path can never become slower than the original full scan.

Addresses a review comment on the pull request.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Addressed the Bugbot review comment about the unbounded IN fast path in 3bdfb9725f5.

The fast path looked up each requested name in every access storage, so a very large name IN (...) set (a huge literal list or a subquery result) could perform more lookups than a single full scan — the opposite of the intended contract. The fast path is now capped at a deliberate limit (max_names_for_fast_path = 1000); above the limit we fall back to the full findAll<User>() scan, which is exactly the pre-optimization behavior, so the fast path can never become slower than the full scan. Single-user lookups and short IN lists (the target case) are unchanged.

The limit value is a heuristic — there is no cheap way to count users without findAll, so a fixed bound is the pragmatic choice — and is easy to tune if you prefer a different number.

Extended 03914_system_users_predicate_pushdown.sh with a 1500-element IN set (above the limit) that must still return the correct result via the fallback. All cases pass locally.

@alexey-milovidov your own VirtualColumnUtils note is left as-is (it concluded the custom tryExtractNameFromPredicate is the right pattern, like StorageSystemZooKeeper::extractPathImpl).

Comment thread src/Storages/System/StorageSystemUsers.cpp Outdated
The `name = 'literal'` fast path was guarded by `value->column->size() != 1`.
But `ActionsDAG::addColumn` normalizes a constant node to a `ColumnConst` of
size 0 (see the comment on `ActionsDAG::Node::column`), so the guard rejected
every literal and `extractNamesFromPredicateImpl` returned `nullopt` — sending
the headline `name = 'literal'` lookup back to `findAll<User>`. The optimization
never actually ran for equality. Read the underlying value directly without a
size guard, exactly like `StorageSystemZooKeeper` does for its `path` predicate.

Also bound the `IN` fast path before materializing a large set: as soon as the
set exceeds `max_names_for_fast_path`, return `nullopt` and fall back to the full
scan instead of copying every element into `names` first. This keeps the promise
that the fast path is never more expensive than the original behavior.
`max_names_for_fast_path` is hoisted to the anonymous namespace so extraction and
`fillData` share the same limit.

The previous stateless test could not catch the equality regression: its output
is identical whichever path runs, and the `max_rows_to_read` assertions it relied
on are vacuous because that limit is not enforced for the single-chunk
`ReadFromSystemOneBlock` source. Rework it to run against the server and witness
which path ran via `system.query_log` `read_rows` (the fast path materializes only
the matching users, the full scan materializes every user). Names are uniquely
prefixed for parallel safety.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Pushed 6f64502 addressing the two automated review findings:

1. Equality fast path was broken (the headline feature). The name = 'literal' path was guarded by value->column->size() != 1, but ActionsDAG::addColumn normalizes a constant node to a ColumnConst of size 0 (see the comment on ActionsDAG::Node::column), so the guard rejected every literal and extractNamesFromPredicateImpl returned nullopt — sending the lookup back to findAll<User>. The fast path never ran for equality. Fixed by reading the underlying value directly without a size guard, exactly like StorageSystemZooKeeper does for its path predicate.

2. IN fast path now bails before materializing a large set. As soon as the set exceeds max_names_for_fast_path it returns nullopt and falls back to the full scan, instead of copying every element first.

Test reworked to actually witness the fast path. The previous test could not catch issue (1): the output is identical whichever path runs, and its max_rows_to_read assertions are vacuous because that limit is not enforced for the single-chunk ReadFromSystemOneBlock source. It now runs against the server and checks system.query_log.read_rows — the fast path materializes only the matching users (verified locally: read_rows = 1 for equality, 2 for a 2-element IN, 0 for a missing user; the LIKE and large-IN fallbacks read all users). The integration test for duplicate names across storages also becomes meaningful now that the equality fast path actually runs.

The only red CI check is Stress test (arm_asan_ubsan), which fails to start the server with Too many marks in file ... skp_idx_idx.cmrk4 ... (CANNOT_READ_ALL_DATA) on a randomized-settings text index over an encrypted S3 cache. That is unrelated to system.users predicate pushdown.

alexey-milovidov and others added 2 commits June 14, 2026 23:22
The style check requires that any test querying `system.query_log` or
`system.query_thread_log` includes a `current_database = currentDatabase()`
condition, so it does not read entries written by concurrently running
tests. The `read_rows` helper in `03914_system_users_predicate_pushdown.sh`
filtered only by a unique `query_id`, which was effectively isolated but did
not satisfy the textual style check.

The measured queries run in `$CLICKHOUSE_DATABASE`, so the added condition
does not change the result.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Storages/System/StorageSystemUsers.cpp
alexey-milovidov and others added 2 commits June 20, 2026 04:46
…em.users fast path

Addresses the remaining automated review findings on the `system.users`
name-predicate fast path:

1. NULL string constants no longer raise an exception. A `Nullable(String)`
   constant passes the type check, but `name = CAST(NULL, 'Nullable(String)')`
   would reach `ColumnNullable::getDataAt`, which throws on a `NULL`. The
   equality path now treats a `NULL` constant as an unsatisfiable predicate
   (empty candidate set, so the fast path emits nothing), and the `IN` path
   skips `NULL` elements. Both match the full-scan semantics of returning no
   rows for such predicates.

2. A large literal `name IN (...)` no longer materializes the whole
   right-hand side before falling back. The set size is now checked via the
   already-built set's `getTotalRowCount` before calling
   `buildOrderedSetInplace`, which (with the default
   `use_index_for_in_with_subqueries_max_values = 0`) would otherwise
   duplicate every element only for the cap at `elements->size()` to send it
   to the full scan. The post-build size check is kept for set kinds whose
   size is known only after building (e.g. subquery sets).

Extended `03914_system_users_predicate_pushdown` with focused coverage for
`name = NULL` and `name IN (NULL, ...)` proving the fast path preserves
full-scan semantics without raising an exception.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Pushed f9e6039c404 addressing the two remaining automated review findings (the ⚠️ Request changes verdict), then merged the latest master.

1. NULL string constants no longer raise an exception. A Nullable(String) constant passes the type check, but name = CAST(NULL, 'Nullable(String)') would reach ColumnNullable::getDataAt, which throws on a NULL. The equality path now returns an empty candidate set when the constant isNullAt(0) (the fast path emits nothing, matching the full scan), and the IN loop skips NULL elements via isNullAt(i). ColumnConst::isNullAt forwards to the underlying value, so this is correct even though ActionsDAG normalizes the constant to a ColumnConst of size 0.

2. A large literal name IN (...) no longer materializes the right-hand side before falling back. The set size is now checked via the already-built set's getTotalRowCount() before calling buildOrderedSetInplace. For a literal tuple, FutureSetFromTuple fills its Set at construction, so the row count is available without copying anything; above max_names_for_fast_path we fall back to the full scan. This stops a large literal list from being duplicated/materialized just to reach the existing cap (the default use_index_for_in_with_subqueries_max_values = 0 does not bound it). The post-build elements->size() guard is kept for set kinds whose size is known only after building (e.g. subquery sets).

Tests. Extended 03914_system_users_predicate_pushdown with focused coverage for name = NULL and name IN (NULL, '<user>'), proving the fast path preserves full-scan semantics without raising an exception (correctness-only, so the assertions hold regardless of whether the fast path or the fallback runs).

The edited translation unit compiles cleanly; the change touches only stable Set/PreparedSets/Column APIs, and all pre-existing test outputs are preserved (the 1500-element large-IN case still falls back and returns the same result, now without materializing first).

The remaining red CI checks are unrelated: Stress test (arm_tsan) fails to start the server with Logical error: 'Unexpected exception in refresh scheduling' (STID 2508-3e7b, tracked by #106737, in BackgroundSchedulePool refresh scheduling, nothing to do with system.users), and the Performance Comparison (master_head) deltas are run-to-run noise (the cloud-benchmark verdict was no_change).

Comment thread src/Storages/System/StorageSystemUsers.cpp Outdated
alexey-milovidov and others added 2 commits June 24, 2026 10:10
Addresses the remaining automated review finding on the `system.users`
name-predicate fast path.

The large-`IN` guard checked the *deduplicated* row count
(`Set::getTotalRowCount`), but building the explicit set elements
(`buildOrderedSetInplace` -> `Set::appendSetElements`) filters the *original*
right-hand side columns, so its cost is proportional to the original list
length, not the distinct count. A duplicate-heavy or mostly-`NULL` literal such
as `name IN ('x', 'x', ..., 'y')` could therefore stay under the limit by
distinct count yet still force a scan of the whole right-hand side, making the
fast path slower than the full scan it was meant to beat.

`FutureSetFromTuple` now exposes `getInputRowCount` (the original right-hand
side length, available in O(1) from the deduplication filter built at
construction). The `IN` fast path bounds on it before `buildOrderedSetInplace`
and falls back to the full scan above `max_names_for_fast_path`, like the
existing distinct-count and post-build guards.

Extended `03914_system_users_predicate_pushdown` with a duplicate-heavy `IN`
(few distinct values, long list) - asserting via `read_rows` that it falls back
to the full scan - and a mostly-`NULL` `IN`, both proving correctness is
preserved.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Addressed the automated review's remaining finding (the ⚠️ Request changes verdict on the large-IN cap) in deaeed3de3a.

The large-IN guard checked the deduplicated row count (Set::getTotalRowCount), but building the explicit set elements (buildOrderedSetInplaceSet::appendSetElements) filters the original right-hand side columns, so a duplicate-heavy or mostly-NULL list such as name IN ('x', 'x', ..., 'y') could stay under the limit by distinct count yet still force a scan of the whole right-hand side. FutureSetFromTuple now exposes getInputRowCount() (the original list length, O(1) from the construction-time deduplication filter), and the IN fast path bounds on it before buildOrderedSetInplace.

03914_system_users_predicate_pushdown.sh gains a duplicate-heavy IN case (asserting via read_rows that it falls back to the full scan) and a mostly-NULL IN case.

@clickhouse-gh

clickhouse-gh Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.40% 85.40% +0.00%
Functions 92.50% 92.60% +0.10%
Branches 77.60% 77.60% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 97/116 (83.62%) | Lost baseline coverage: none · Uncovered code

Full report · Diff report

@alexey-milovidov alexey-milovidov merged commit 6076f2d into ClickHouse:master Jun 25, 2026
487 of 493 checks passed
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-performance Pull request with some performance improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants