Fix LOGICAL_ERROR in HashJoin::buildAdditionalFilter with join_use_nulls and a non-equi JOIN condition by groeneai · Pull Request #107957 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix LOGICAL_ERROR in HashJoin::buildAdditionalFilter with join_use_nulls and a non-equi JOIN condition#107957

Open
groeneai wants to merge 4 commits into
ClickHouse:masterfrom
groeneai:fix-hashjoin-additional-filter-nullable-2508-30f6
Open

Fix LOGICAL_ERROR in HashJoin::buildAdditionalFilter with join_use_nulls and a non-equi JOIN condition#107957
groeneai wants to merge 4 commits into
ClickHouse:masterfrom
groeneai:fix-hashjoin-additional-filter-nullable-2508-30f6

Conversation

@groeneai

@groeneai groeneai commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix a LOGICAL_ERROR exception (IColumn::assertTypeEquality) when a JOIN with a non-equi condition in the ON clause (a residual/mixed filter, e.g. l.a = r.a AND l.b != r.b) is executed with join_use_nulls = 1 against a key-value-storage right table (EmbeddedRocksDB or a dictionary).

Description

Found by the stress fuzzer (CIDB STID 2508-30f6), reproduced deterministically locally.

No related open issue found (the long-standing assertTypeEquality STID-2508 issues cover other code paths and are closed; this HashJoin::buildAdditionalFilter path was not covered by any of them).

CI reports:

(Both are unrelated PRs that merely hit the pre-existing trunk bug; 0 occurrences on master.)

Reproducer

The crash requires a key-value-storage table on the right side of the join (EmbeddedRocksDB or a COMPLEX_KEY_HASHED dictionary). Memory/MergeTree right sides do not reproduce it (see root cause).

CREATE TABLE rl (k1 String, k2 String, val UInt32) ENGINE = EmbeddedRocksDB PRIMARY KEY (k1, k2);
CREATE TABLE rr (k1 String, k2 String, val UInt32) ENGINE = EmbeddedRocksDB PRIMARY KEY (k1, k2);
INSERT INTO rl VALUES ('foo','bar',1),('foo','baz',2),('qux','bar',3);
INSERT INTO rr VALUES ('foo','bar',10),('foo','baz',20),('xyz','abc',30);
SELECT COUNT(DISTINCT *) FROM rl AS l LEFT JOIN rr AS r ON (l.k1 = r.k1) AND (l.k2 != r.k2)
SETTINGS join_use_nulls = 1, join_algorithm = 'direct,hash', enable_analyzer = 1;

Aborts in debug/sanitizer builds with:
Logical error: '... typeid(*this) == typeid(rhs)' in IColumn::assertTypeEquality, via HashJoin::buildAdditionalFilter (HashJoinMethodsImpl.h).

Root cause

A non-equi ON conjunct (l.k2 != r.k2) becomes a residual (mixed) filter whose required-column types reference the raw right-table inputs: in JoinStepLogical the residual DAG folds away the toNullable(x) wrappers and keeps the bare non-Nullable Input node. With a key-value-storage right side, JoinStepLogicalLookup promotes that same right column to Nullable in the saved join block under join_use_nulls = 1 and an outer join. (Memory/MergeTree right sides have no lookup step, so the right header is never promoted and the bug does not trigger.)

In buildAdditionalFilter, the destination column is created from the residual filter's required type (ColumnString), then filled by insertFrom from the saved-block column (ColumnNullable). The typeid differs, so IColumn::assertTypeEquality aborts.

Fix

Reconcile the two nullability representations when filling the residual-filter input column, mirroring the existing join-output builders (AddedColumns::buildJoinGetOutput, AddedColumns::checkColumns): if the destination is Nullable and the source is not, use insertFromNotNullable; if the destination is not Nullable and the source is, insert from the source's nested column. Matched rows reaching the residual filter are never null (output nullability only applies to non-matched outer rows, which the residual filter does not evaluate), so inserting the nested value is correct.

Regression tests: 04338_hash_join_additional_filter_nullable (COMPLEX_KEY_HASHED dictionary, fasttest-compatible) and 03720_rocksdb_multicolumn_primary_key (EmbeddedRocksDB).

@groeneai

Copy link
Copy Markdown
Contributor Author

@groeneai

Copy link
Copy Markdown
Contributor Author

cc @vdimir @KochetovNicolai — could you review this? It fixes an IColumn::assertTypeEquality abort in HashJoin::buildAdditionalFilter: a non-equi ON conjunct (residual/mixed filter) reaches the saved join block as Nullable under join_use_nulls=1, while the residual-filter expression references the column as non-Nullable (the toNullable wrapper is folded away in JoinStepLogical), so the dest/src column typeids diverge. The fix reconciles the nullability the same way the sibling builders AddedColumns::buildJoinGetOutput/checkColumns already do.

@nikitamikhaylov nikitamikhaylov added the can be tested Allows running workflows for external contributors label Jun 19, 2026
@clickhouse-gh

clickhouse-gh Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [1b5b608]

Summary:

job_name test_name status info comment
Bugfix validation (functional tests, amd64) FAIL
03720_rocksdb_multicolumn_primary_key ERROR cidb
Server died FAIL cidb, issue
Bugfix validation (functional tests, aarch64) FAIL
03720_rocksdb_multicolumn_primary_key ERROR cidb
Server died FAIL cidb, issue
Upgrade check (amd_release) FAIL
Error message in clickhouse-server.log (see upgrade_error_messages.txt) FAIL cidb
Finish Workflow FAIL
python3 ./ci/jobs/scripts/workflow_hooks/new_tests_check.py FAIL

AI Review

Summary

This PR fixes the LOGICAL_ERROR in HashJoin::buildAdditionalFilter for the reported key-value storage + non-equi ON + join_use_nulls = 1 reproducer, and the added regressions cover that exception path well. I do not think the change is ready to approve as-is, because it fixes one runtime sink while leaving the underlying key-header invariant broken for key-value joins, which still changes hash-key representation and method selection under join_use_nulls.

Findings

⚠️ Majors

  • [src/Interpreters/HashJoin/HashJoinMethodsImpl.h:867-877] The new nullability bridge in HashJoin::buildAdditionalFilter suppresses the exception locally, but it does not restore the preexisting invariant that join_use_nulls only affects emitted outer-row columns, not the hash-join key header itself. HashJoin still builds right_table_keys from the right_sample_block header (HashJoin.cpp:252-269) and chooses its key map from those types (HashJoin.cpp:313-318). For a single LowCardinality(String) key, that matters immediately: tryGetLowCardinalityMethod rejects LowCardinality(Nullable(T)) (HashJoin.cpp:469-481), so a key-value join that should stay on the dictionary-aware hash map now falls back to the generic materialized path purely because JoinStepLogicalLookup has already baked join_use_nulls into the right header. That is the same logical/physical contract break called out in the open maintainer thread: the exception is gone, but the planner/header divergence is still observable in HashJoin behavior. Suggested fix: keep key columns in raw storage types through JoinStepLogical -> physical conversion and apply nullable wrappers only when materializing join output columns; then keep the local bridge only if a residual-filter sink still genuinely needs it.
Final Verdict
  • Status: ⚠️ Request changes
  • Minimum required actions: move the join_use_nulls wrapping for key-value joins out of the hash-key header / right_sample_block path so HashJoin sees the original key types again, then keep or drop the buildAdditionalFilter bridge based on whether any residual-filter type mismatch remains after that root fix.

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Jun 19, 2026
Comment thread tests/queries/0_stateless/03720_rocksdb_multicolumn_primary_key.sql
Comment thread tests/queries/0_stateless/04338_hash_join_additional_filter_nullable.sql Outdated
@groeneai

Copy link
Copy Markdown
Contributor Author

Follow-up commit 4884b84 fixes a CI failure on the Stateless tests (..., old analyzer, ...) variant.

The regression queries use a non-equi ON conjunct (l.k2 != r.k2), which only the new analyzer supports; the old analyzer rejects it with INVALID_JOIN_ON_EXPRESSION before the query runs. Pinned enable_analyzer = 1 on those queries (per-query in 03720, session-level in 04338) so the coverage runs under the new analyzer regardless of the runner's analyzer mode. The engine fix itself is analyzer-independent; only the non-equi test syntax requires the new analyzer.

Verified with the test runner under --client-option allow_experimental_analyzer=0: both tests fail (INVALID_JOIN_ON_EXPRESSION) before this commit and pass after; the default new-analyzer path is unchanged.

Session id: cron:clickhouse-worker-slot-0:20260619-210000

@groeneai

Copy link
Copy Markdown
Contributor Author

CI finished on 4884b848f4c. All build, functional, integration, fuzzer and stateless jobs pass; the old-analyzer Stateless failure from the prior commit is resolved by the enable_analyzer = 1 pin.

The remaining red is expected, not a defect:

  • Bugfix validation (functional tests) "Server died" on the master-HEAD run: the regression tests (03720, 04338) reproduce the LOGICAL_ERROR (IColumn::assertTypeEquality) on unfixed master, which aborts the debug server. That is the crash-repro working as designed; both tests pass on this PR's build.
  • CH Inc sync: internal mirror-sync infra job, not actionable.

Session id: cron:our-pr-ci-monitor:20260620-020000

@rschu1ze

Copy link
Copy Markdown
Member

@groeneai Your repro

CREATE TABLE l (k1 String, k2 String, val UInt32) ENGINE = Memory;
CREATE TABLE r (k1 String, k2 String, val UInt32) ENGINE = Memory;

INSERT INTO l VALUES ('foo','bar',1),('foo','baz',2),('qux','bar',3);
INSERT INTO r VALUES ('foo','bar',10),('foo','baz',20),('xyz','abc',30);

SELECT COUNT(DISTINCT *)
FROM l AS l LEFT JOIN r AS r
  ON (l.k1 = r.k1) AND (l.k2 != r.k2)
SETTINGS join_use_nulls = 1;

goes through without error. Please find an actual repro first.

@groeneai

Copy link
Copy Markdown
Contributor Author

@rschu1ze You are right, that Memory example does not reproduce. The PR description was misleading and I have corrected it.

The crash needs a key-value-storage table on the right side of the join (EmbeddedRocksDB or a COMPLEX_KEY_HASHED dictionary), not Memory/MergeTree. With a KV right side, JoinStepLogicalLookup promotes the right-side columns to Nullable in the saved join block under join_use_nulls = 1, while the residual-filter expression keeps the bare non-Nullable input type. buildAdditionalFilter then builds the destination column from the non-Nullable type and fills it from the Nullable saved-block column, so assertTypeEquality aborts. Memory/MergeTree right sides have no lookup step, so the right header is never promoted and the types never diverge.

Working repro (aborts on an unfixed debug/sanitizer build, returns 2 with the fix):

CREATE TABLE rl (k1 String, k2 String, val UInt32) ENGINE = EmbeddedRocksDB PRIMARY KEY (k1, k2);
CREATE TABLE rr (k1 String, k2 String, val UInt32) ENGINE = EmbeddedRocksDB PRIMARY KEY (k1, k2);
INSERT INTO rl VALUES ('foo','bar',1),('foo','baz',2),('qux','bar',3);
INSERT INTO rr VALUES ('foo','bar',10),('foo','baz',20),('xyz','abc',30);
SELECT COUNT(DISTINCT *) FROM rl AS l LEFT JOIN rr AS r ON (l.k1 = r.k1) AND (l.k2 != r.k2)
SETTINGS join_use_nulls = 1, join_algorithm = 'direct,hash', enable_analyzer = 1;

The fasttest-compatible regression test 04338_hash_join_additional_filter_nullable uses a COMPLEX_KEY_HASHED dictionary on the right side to cover the same path without EmbeddedRocksDB. Both that test and the EmbeddedRocksDB query in 03720_rocksdb_multicolumn_primary_key abort on the unfixed build and pass with the fix.

Comment on lines +868 to +873
/// The destination column type comes from the residual filter expression's required
/// columns, which reference the raw (non-promoted) right-table inputs. The saved
/// right block, however, may carry join-output nullability (join_use_nulls + outer
/// join). Reconcile the two nullability representations, like the sibling join-output
/// builders do (AddedColumns::buildJoinGetOutput, AddedColumns::checkColumns).
if (auto * nullable_dest = typeid_cast<ColumnNullable *>(col.get()); nullable_dest && !src_col->isNullable())

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally we keep original keys non nullable for join_use_nulls as join keys and output caster columns, it should be handled on planning stage or when we convert join logical step to physical

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vdimir Agreed, this belongs upstream. The seam:

For a regular (Memory/MergeTree) right side the saved right block stays non-nullable; join_use_nulls is applied only to the OUTPUT columns (AddedColumns builds them Nullable, reconciling via insertFromNotNullable). The residual/mixed-filter required columns also stay non-nullable, because JoinStepLogical folds away the toNullable wrappers (JoinStepLogical.cpp:1230-1249). Saved block and residual required types agree, so there is no divergence.

The key-value lookup path is the anomaly. JoinStepLogical.cpp:1336-1343:

if (logical_lookup && prepared_join_storage.storage_key_value)
    right_dag.mergeInplace(... actions_after_join, fromRight ...);

merges the toNullable-wrapped right columns into right_dag, so for KV joins only, right_sample_block (and the HashJoin saved block data->sample_block) becomes Nullable. The residual required types are still non-nullable, so buildAdditionalFilter builds the dest from the non-nullable type and fills from the Nullable saved block, tripping assertTypeEquality.

Planning-stage fix: keep the KV-storage right_sample_block non-nullable like regular joins and apply join_use_nulls on output, instead of baking it into the block the physical join consumes. That removes the divergence at the source and drops the buildAdditionalFilter reconciliation.

Before I rewrite: is keeping right_sample_block non-nullable in that KV branch the direction you want (this is also the logical->physical conversion you mentioned), or would you prefer it handled elsewhere? Confirming first since it touches the KV/direct-join output path. The current fix and the 03720/04338 regression tests stay on the branch until then.

groeneai and others added 4 commits July 4, 2026 11:27
A non-equi ON conjunct (e.g. l.k2 != r.k2) becomes a residual (mixed)
filter. Its required-column types reference the raw right-table inputs
(non-Nullable), because the residual filter DAG folds away the toNullable
wrappers in JoinStepLogical. Under join_use_nulls with an outer join,
however, the right column is stored in the saved join block as Nullable.

buildAdditionalFilter built the destination column from the residual
filter's required type (e.g. ColumnString) and then inserted from the
saved-block column (ColumnNullable), tripping IColumn::assertTypeEquality
and aborting in debug/sanitizer builds.

Reconcile the two nullability representations when filling the residual
filter input column, mirroring the existing join-output builders
(AddedColumns::buildJoinGetOutput and AddedColumns::checkColumns).
Matched rows reaching the residual filter are never null, so inserting
the nested value is correct.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ilter

The existing regression lives in 03720_rocksdb_multicolumn_primary_key, which
is tagged no-fasttest (EmbeddedRocksDB needs ENABLE_LIBRARIES), so it does not
run in fasttest/normal stateless builds.

The crash is not specific to EmbeddedRocksDB: it is triggered by any key-value
storage join (JoinStepLogicalLookup), where under join_use_nulls the right side
reaches the saved join block already promoted to Nullable while the residual
filter expression references the raw non-Nullable input. A complex-key
dictionary is also a key-value entity and reproduces the exact same
buildAdditionalFilter abort, but runs in fasttest. Plain Memory/MergeTree tables
do not promote the right header before the join, so they cannot reproduce it.

The new test uses a COMPLEX_KEY_HASHED dictionary on the right side and forces
join_algorithm = 'hash'. It is parallel-safe (objects scoped to the test
database). Verified to abort without the fix and pass with it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ents

A LOGICAL_ERROR is a thrown exception that release builds do not terminate
on; only debug/sanitizer builds abort on the assertion. Use exception wording
in the regression-test comments per ClickHouse convention.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The buildAdditionalFilter regression queries use a non-equi ON conjunct
(l.k2 != r.k2), which only the new analyzer supports. The old analyzer
rejects it with INVALID_JOIN_ON_EXPRESSION, so the test failed on the
old-analyzer CI variant. Pin enable_analyzer=1 on those queries
(03720) and the dictionary test (04338) so coverage runs under the new
analyzer regardless of the runner's analyzer mode.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai groeneai force-pushed the fix-hashjoin-additional-filter-nullable-2508-30f6 branch from 4884b84 to 1b5b608 Compare July 4, 2026 11:35
@clickhouse-gh

clickhouse-gh Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.60% 85.60% +0.00%
Functions 92.70% 92.70% +0.00%
Branches 77.80% 77.70% -0.10%

Changed lines: Changed C/C++ lines covered: 8/9 (88.89%) · Uncovered code

Full report · Diff report

@groeneai

groeneai commented Jul 4, 2026

Copy link
Copy Markdown
Contributor Author

CI finish ledger — 1b5b608

Every failure below has an owner: a fixing PR (ours or external), or a full-effort fix task whose fixing-PR link will be posted here when it opens. Only CH Inc sync is exempt.

Check / test Reason Owner / fixing PR
Bugfix validation (functional tests, amd64) / 03720_rocksdb_multicolumn_primary_key + Server died expected master-side crash-repro — the regression test this PR adds reproduces the IColumn::assertTypeEquality LOGICAL_ERROR on the UNFIXED master binary (bugfix-validation runs the new test against master, which lacks this fix, so the server aborts by design). The PR's own binary does not crash. this PR (the fix); resolves on merge
Bugfix validation (functional tests, aarch64) / 03720_rocksdb_multicolumn_primary_key + Server died same as above (aarch64 variant) this PR (the fix); resolves on merge
Upgrade check (amd_release) / Error message in clickhouse-server.log flaky (benign DDLWorker(rdb_test_*) ... Mapping for table with UUID=... already exists (TABLE_ALREADY_EXISTS) Replicated-DB recovery log-noise on the upgrade restart; not caused by this HashJoin change) #107126 (ours, open)
Finish Workflow / Post Hooks rollup of the above covered by the lines above

CH Inc sync = SUCCESS. No failure is caused by this PR's HashJoinMethodsImpl.h change; the Bugfix-validation "Server died" is the intended demonstration that the new regression test catches the fixed LOGICAL_ERROR on unfixed master.

Session id: cron:our-pr-ci-monitor:20260704-150000

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants