Add experimental cuckoo_filter and binary_fuse_filter skip indexes by zex-hyd · Pull Request #101796 · ClickHouse/ClickHouse · GitHub
Skip to content

Add experimental cuckoo_filter and binary_fuse_filter skip indexes#101796

Open
zex-hyd wants to merge 20 commits into
ClickHouse:masterfrom
zex-hyd:feature/cuckoo-filter-skip-index
Open

Add experimental cuckoo_filter and binary_fuse_filter skip indexes#101796
zex-hyd wants to merge 20 commits into
ClickHouse:masterfrom
zex-hyd:feature/cuckoo-filter-skip-index

Conversation

@zex-hyd

@zex-hyd zex-hyd commented Apr 4, 2026

Copy link
Copy Markdown

Changelog category (leave one):

  • Experimental Feature

Related Issue:

#101795 (Cuckoo Filter)
#104466 (Binary Fuse Filter)

Changelog entry:

Added experimental cuckoo_filter and binary_fuse_filter skip index types for MergeTree tables. Both are gated by experimental settings and documented as workload-dependent alternatives to bloom_filter, with stateless size-benchmark coverage; the binary-fuse CI benchmark also includes a tiny pruning sanity check.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Summary

This PR implements and benchmarks two experimental probabilistic filter structures for MergeTree skip indexes: cuckoo_filter and binary_fuse_filter.

Both structures are based on published research on compact approximate membership filters for large static sets. Benchmark results in the stateless tests show that they are not general replacements for bloom_filter in ClickHouse's current skip-index model. In particular, at skip-index GRANULARITY = 1 and a common target FPR such as 0.025, both filters are often larger than bloom_filter because of fixed structural overhead on small per-index-granule key sets.

The feature remains experimental and gated. bloom_filter remains the recommended default skip index unless workload-specific benchmarks show an advantage for one of these filters.

Implemented Index Types

1. cuckoo_filter

Based on Fan et al., "Cuckoo Filter: Practically Better Than Bloom" (ACM CoNEXT 2014), with secondary-index motivation from Kipf et al., "Cuckoo Index" (PVLDB 2020).

Parameters: Single argument — target false positive rate, for example 0.01.

Experimental setting: Requires allow_experimental_cuckoo_filter_index = 1.

2. binary_fuse_filter

Based on Graf & Lemire, "Binary Fuse Filters: Fast and Smaller Than Xor Filters" (arXiv:2201.01174, 2022).

Parameters: Single argument — target false positive rate, for example 0.025.

Experimental setting: Requires allow_experimental_binary_fuse_filter_index = 1.

Benchmark Results

Both filters were evaluated against bloom_filter in 04104_cuckoo_filter_benchmark.sql and 04204_binary_fuse_filter_benchmark.sql.

Methodology: 10,000 rows per table, table index_granularity = 8192, skip-index GRANULARITY = 1, OPTIMIZE FINAL, then sum(secondary_indices_uncompressed_bytes) from system.parts. Each file covers 18 core scenarios: 6 cardinality regimes × 3 FPR values. 04204 adds representative CI-friendly extensions (lower FPR snapshot, skip-index GRANULARITY sensitivity, distribution-sensitive cases, and a tiny pruning sanity check).

Cuckoo Filter vs Bloom

At skip-index GRANULARITY = 1 and fpr = 0.025, cuckoo_filter was larger than bloom_filter in all six tested cardinality regimes (04104 reference).

Scenario FPR Bloom Cuckoo Ratio
single value 0.025 2 B 15 B 7.5x
1% distinct 0.025 101 B 139 B 1.4x
10% distinct 0.025 1,002 B 2,061 B 2.1x
50% distinct 0.025 5,002 B 8,205 B 1.6x
90% distinct 0.025 9,002 B 16,397 B 1.8x
all distinct 0.025 10,002 B 16,397 B 1.6x

Observed range at fpr = 0.025: 1.4x–7.5x larger than Bloom.

At the lower fpr = 0.001 (where the requested FPR is rounded up to the nearest supported fingerprint width), cuckoo_filter remains larger than bloom_filter in every tested cardinality regime as well (04104 reference):

Scenario FPR Bloom Cuckoo Ratio
1% distinct 0.001 189 B 267 B 1.4x
10% distinct 0.001 1,877 B 4,109 B 2.2x
50% distinct 0.001 9,377 B 16,397 B 1.7x
90% distinct 0.001 16,877 B 32,781 B 1.9x
all distinct 0.001 18,752 B 32,781 B 1.7x

Binary Fuse Filter vs Bloom

At skip-index GRANULARITY = 1 and fpr = 0.025, binary_fuse_filter was also larger than bloom_filter in all six core cardinality regimes (04204 reference).

Scenario FPR Bloom Binary Fuse Ratio
single value 0.025 2 B 12 B 6.0x
1% distinct 0.025 101 B 187 B 1.9x
10% distinct 0.025 1,002 B 1,396 B 1.4x
50% distinct 0.025 5,002 B 6,293 B 1.3x
90% distinct 0.025 9,002 B 11,029 B 1.2x
all distinct 0.025 10,002 B 12,309 B 1.2x

Observed range at fpr = 0.025: 1.2x–6.0x larger than Bloom.

However, binary_fuse_filter can be smaller in specific FPR regimes where its fingerprint width is favorable. For example, at fpr = 0.001:

Scenario FPR Bloom Binary Fuse Ratio
1% distinct 0.001 189 B 187 B 0.99x
10% distinct 0.001 1,877 B 1,396 B 0.74x
50% distinct 0.001 9,377 B 6,293 B 0.67x
90% distinct 0.001 16,877 B 11,029 B 0.65x
all distinct 0.001 18,752 B 12,309 B 0.66x

Additional representative CI-friendly coverage in 04204_binary_fuse_filter_benchmark.sql: lower-FPR snapshot (fpr = 0.005), skip-index GRANULARITY = 1 / 8 / 64, distribution-sensitive scenarios, and a tiny hit/miss pruning sanity check. 04104 covers the full 18-scenario size matrix for cuckoo but does not include those extensions.

Root Cause Analysis

The common issue is structural overhead in small-set regimes:

  • cuckoo_filter: per-index-granule bucket count rounds up to a power of two; fixed overhead from slot structure, load-factor requirements, and fingerprint-width choices cannot amortize well for small distinct-key counts.
  • binary_fuse_filter: whole-byte fingerprints and power-of-two segment alignment introduce step changes in size. This can be favorable in some FPR regimes, but unfavorable at common settings such as fpr = 0.025.

Both structures are designed for larger static sets where structural overhead is easier to amortize. ClickHouse skip indexes often operate on much smaller per-index-granule key sets, so the benefit is workload- and parameter-dependent.

Why keep this experimental implementation?

Although the MergeTree skip-index results are workload-dependent, these implementations provide reusable probabilistic filter structures that may be useful beyond the current skip-index model, for example for runtime filters or future index layouts with larger key sets.

The user-facing skip-index integrations remain experimental and gated, and the documentation describes the measured limitations instead of presenting these filters as general bloom_filter replacements.

Test Files

  • 04077_cuckoo_filter_skip_index.sql — focused cuckoo predicate coverage vs bloom_filter: scalar equality/!=/NOT IN, scalar IN (including a forced miss set), hasAny/hasAll, tuple expression indexes, and map key paths; EXPLAIN sanity for scalar equality
  • 04299_binary_fuse_filter_skip_index.sql — same focused predicate coverage for binary_fuse_filter
  • 04319_experimental_skip_index_json_isnotnull_nested.sql — JSON path-presence correctness: bare isNotNull(json.path) uses the index; nested forms such as isNotNull(json.path) = 0 do not
  • 04078_cuckoo_filter_skip_index_bad_args.sql — cuckoo bad-argument validation, experimental-setting gates, and unchanged-index ALTER behavior
  • 04205_binary_fuse_filter_bad_args.sql — same for binary_fuse_filter
  • 04104_cuckoo_filter_benchmark.sql — full 18-scenario Cuckoo vs Bloom size benchmark (04104 reference)
  • 04204_binary_fuse_filter_benchmark.sql — 18 core scenarios plus representative CI-friendly extensions for Binary Fuse vs Bloom
  • 04103_cuckoo_filter_vs_bloom_filter_equivalence.sql — PREWHERE scalar-equality count equivalence vs bloom_filter
  • 04203_binary_fuse_filter_vs_bloom_filter_equivalence.sql — same for binary_fuse_filter

Both benchmark files use the same core methodology: same schema/data pattern, OPTIMIZE FINAL, then reading secondary_indices_uncompressed_bytes from system.parts.

Predicates Covered

The implemented index-condition paths follow existing bloom_filter skip-index semantics where applicable, while unsafe or unsupported forms fall back conservatively to avoid false negatives.

Covered paths exercised by the stateless tests include:

  • scalar equality (=)
  • scalar IN (including a forced miss set in 04077 / 04299)
  • array predicates hasAny and hasAll
  • tuple expression indexes (x = … AND y = … on a tuple index)
  • map key access paths (m['key'] = … with a mapKeys(m) index)
  • bare JSON path-presence checks such as isNotNull(json.path) (04319)

Not claimed as index-backed in tests:

  • nested boolean forms such as isNotNull(json.path) = 0
  • negation / inversion paths such as !=, NOT IN — compared against bloom_filter for result parity, but not forced through the skip index

More complex boolean forms should fall back to normal scanning rather than being forced through the skip index.

Example Usage

SET allow_experimental_cuckoo_filter_index = 1;
SET allow_experimental_binary_fuse_filter_index = 1;

CREATE TABLE t
(
    k UInt64,
    INDEX idx_cuckoo k TYPE cuckoo_filter(0.01) GRANULARITY 1,
    INDEX idx_bfuse k TYPE binary_fuse_filter(0.025) GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY k;

@zex-hyd

zex-hyd commented Apr 4, 2026

Copy link
Copy Markdown
Author

Hi @alexey-milovidov @rschu1ze,

This PR is related to my feature request #101795 (from Intern Tasks 2025/2026).

Would appreciate feedback on the approach! Happy to adjust based on your suggestions.

@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Apr 5, 2026
@clickhouse-gh

clickhouse-gh Bot commented Apr 5, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [05515d1]

Summary:

job_name test_name status info comment
Fast test FAIL
02995_new_settings_history FAIL cidb
Code Review DROPPED
Docs check DROPPED
Fast test (arm_darwin) DROPPED
Build (amd_debug) DROPPED
Build (amd_asan_ubsan) DROPPED
Build (amd_tsan) DROPPED
Build (amd_msan) DROPPED
Build (amd_binary) DROPPED
Build (arm_debug) DROPPED

@clickhouse-gh clickhouse-gh Bot added the pr-feature Pull request with new product feature label Apr 5, 2026
Comment thread src/Storages/MergeTree/MergeTreeIndexCuckooFilter.cpp Outdated
Comment thread tests/queries/0_stateless/04077_cuckoo_filter_skip_index.sql Outdated
Comment thread src/Storages/MergeTree/MergeTreeIndexCuckooFilter.cpp Outdated
Comment thread src/Storages/MergeTree/MergeTreeIndexCuckooFilter.cpp Outdated
@zex-hyd zex-hyd marked this pull request as draft April 5, 2026 17:09
Comment thread src/Storages/MergeTree/MergeTreeIndices.cpp Outdated
Comment thread src/Interpreters/CuckooFilter.cpp Outdated
Comment thread src/Storages/MergeTree/MergeTreeIndexCuckooFilter.cpp Outdated
Comment thread src/Interpreters/CuckooFilter.cpp Outdated
Comment thread src/Storages/MergeTree/MergeTreeIndexCuckooFilter.cpp Outdated
@alexey-milovidov

Copy link
Copy Markdown
Member

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

Comment thread src/Interpreters/CuckooFilter.cpp Outdated
@zex-hyd zex-hyd force-pushed the feature/cuckoo-filter-skip-index branch from 79e4994 to a513b9c Compare April 8, 2026 22:51
@zex-hyd zex-hyd marked this pull request as ready for review April 8, 2026 22:51
@zex-hyd

zex-hyd commented Apr 9, 2026

Copy link
Copy Markdown
Author

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

@alexey-milovidov The stress test is failing again, I have verified that this PR is up to date with master!

So it looks like a separate MSan issue (or flake) uncovered by stress, unrelated to the cuckoo_filter changes on our side!

@alexey-milovidov

Copy link
Copy Markdown
Member

The MSan stress test failure (MemorySanitizer: use-of-uninitialized-value, STID 4179-5154 or 4148-3044) is a known pre-existing issue unrelated to this PR. Fix: #102158

@zex-hyd

zex-hyd commented Apr 9, 2026

Copy link
Copy Markdown
Author

The MSan stress test failure (MemorySanitizer: use-of-uninitialized-value, STID 4179-5154 or 4148-3044) is a known pre-existing issue unrelated to this PR. Fix: #102158

@alexey-milovidov Thanks again for the clarafication! This PR is also ready for review!

Comment thread src/Storages/MergeTree/MergeTreeIndexCuckooFilter.cpp Outdated
@alexey-milovidov

Copy link
Copy Markdown
Member

The Hung check failure is fixed by #102008 and #102010, let's update the branch

@zex-hyd

zex-hyd commented Apr 10, 2026

Copy link
Copy Markdown
Author

The Hung check failure is fixed by #102008 and #102010, let's update the branch

Thanks for the help! I also make some fix on the comments with previous typo.

Comment thread src/Core/Settings.cpp
@zex-hyd

zex-hyd commented Apr 18, 2026

Copy link
Copy Markdown
Author

@alexey-milovidov @rschu1ze This PR has passed all the checks and is ready to be review! Please feel free to add comments on the PR! Thank you!

@rschu1ze

rschu1ze commented Apr 20, 2026

Copy link
Copy Markdown
Member

@zex-hyd We must only introduce new index types to ClickHouse if they are strictly better than the current ones in terms of performance and storage space. Please evaluate the Cuckoo filter indexes vs. Bloom filter indexes (ideally as systematically as possible) along these dimensions, you can use functional tests (tests/queries/0_stateless), SQL performance tests (tests/performance) or C++ Google Benchmark tests for this.

@zex-hyd

zex-hyd commented May 1, 2026

Copy link
Copy Markdown
Author

Hi @alexey-milovidov @rschu1ze! Could you add the pr-performance label to this PR?

Without the pr-performance label we don’t get the amd performance report / all-query-metrics.tsv. We need that to compare our new cuckoo_filter_.xml runs against the existing bloom_filter_.xml and to document the time + secondary index size follow-up to the earlier review!

@zex-hyd

zex-hyd commented May 2, 2026

Copy link
Copy Markdown
Author

@zex-hyd We must only introduce new index types to ClickHouse if they are strictly better than the current ones in terms of performance and storage space. Please evaluate the Cuckoo filter indexes vs. Bloom filter indexes (ideally as systematically as possible) along these dimensions, you can use functional tests (tests/queries/0_stateless), SQL performance tests (tests/performance) or C++ Google Benchmark tests for this.

@alexey-milovidov @rschu1ze Could you add the pr-performance label to this PR? Without the pr-performance label we don’t get the amd performance report / all-query-metrics.tsv.

@rschu1ze rschu1ze added the pr-performance Pull request with some performance improvements label May 3, 2026
Comment thread docs/en/engines/table-engines/mergetree-family/mergetree.md Outdated
Comment thread src/Interpreters/BinaryFuseFilter.cpp Outdated
Comment thread src/Interpreters/BinaryFuseFilter.cpp
Comment thread CHANGELOG.md
@clickhouse-gh clickhouse-gh Bot added pr-experimental Experimental Feature and removed pr-feature Pull request with new product feature labels Jun 13, 2026
Comment thread tests/queries/0_stateless/04104_cuckoo_filter_benchmark.sql Outdated
@zex-hyd zex-hyd force-pushed the feature/cuckoo-filter-skip-index branch from b2a3847 to 01abec6 Compare June 14, 2026 20:27
Comment thread tests/queries/0_stateless/04103_cuckoo_filter_vs_bloom_filter_equivalence.sql Outdated
Comment thread src/Interpreters/BinaryFuseFilter.cpp
@clickhouse-gh

clickhouse-gh Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.30% 85.30% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.50% 77.40% -0.10%

Changed lines: Changed C/C++ lines covered by tests: 1278/1854 (68.93%) | Lost baseline coverage: none · Uncovered code

Full report · Diff report

alexey-milovidov and others added 4 commits June 29, 2026 11:51
…y_fuse_filter

Addresses the review request that the `JSONAllPaths` predicate contract be
covered for the two new experimental skip-index types. The existing
`04319_experimental_skip_index_json_isnotnull_nested` test only exercises the
`isNotNull(json.path)` path-presence predicate; this adds the equality
predicate (`json.path = value`) that the docs promise has the same coverage as
`bloom_filter`.

The test pins `index_granularity = 1` and `GRANULARITY 1` and, for each of
`cuckoo_filter` and `binary_fuse_filter`, checks that a `force_data_skipping_indices`
result equals the corresponding full scan for:
- a present path with a non-default value (index is applicable),
- an absent path (all granules can be skipped),
- a present path via CAST to a non-nullable type, and
- the default-value case `json.path::Int64 = 0`, where a missing path yields the
  type default and the index must NOT prune the granule.

This is a false-negative-sensitive branch: a wrong decision would make
`force_data_skipping_indices` drop granules where the path is present or absent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

alexey-milovidov commented Jul 3, 2026

Copy link
Copy Markdown
Member

Merged current master into the branch to clear the merge conflict, and added 04520_experimental_skip_index_json_equality — the JSONAllPaths equality coverage the review asked for. The existing 04319_experimental_skip_index_json_isnotnull_nested only exercises the isNotNull(json.path) path-presence predicate; this adds the equality predicate (json.path = value). For both cuckoo_filter and binary_fuse_filter it pins index_granularity = 1 / GRANULARITY 1 and checks that a force_data_skipping_indices scan equals the full scan for:

  • a present path with a non-default value,
  • an absent path,
  • a present path via CAST to a non-nullable type, and
  • the default-value case json.path::Int64 = 0, where a missing path yields the type default so the index must not prune the granule.

The only remaining non-green check, Stress test (amd_msan)Logical error: Block structure mismatch in A stream: different columns (STID: 0993-27f0), is a pre-existing AST-fuzzer flake on master (it fires on pull_request_number = 0 and many unrelated PRs), unrelated to this change (tracked in #108142).

The other AI-review item — refreshing the cuckoo_filter benchmark numbers in the description — is best left to you, since those come from your benchmark runs.

…ndex interface

The recent merge of `master` into this branch introduced a build failure:
`master` added a `StorageMetadataPtr metadata_snapshot` parameter as the first
argument of both the `IMergeTreeIndex` constructor and the index factory
`Creator`, but the new experimental `cuckoo_filter` and `binary_fuse_filter`
index types still used the old signatures, so `MergeTreeIndexCuckooFilter.cpp`,
`MergeTreeIndexBinaryFuseFilter.cpp` and `MergeTreeIndices.cpp` failed to
compile (`no matching constructor for initialization of 'IMergeTreeIndex'` and
`no viable conversion ... to 'Creator'`).

Thread `metadata_snapshot` through the constructors and creators of both index
types, mirroring the existing `MergeTreeIndexBloomFilter` / `bloomFilterIndexCreator`.

This is a purely mechanical adaptation to the changed interface; no behavior change.

Build failure:
https://s3.amazonaws.com/clickhouse-test-reports/PRs/101796/1d4cea41ba2bfba9c879ac0cc56c74637741a01c/fast_test/build_clickhouse/build_clickhouse.log

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@zex-hyd

zex-hyd commented Jul 3, 2026

Copy link
Copy Markdown
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors manual approve Manual approve required to run CI pr-experimental Experimental Feature pr-performance Pull request with some performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants