Fix Bad cast for Const-wrapped Variant with LowCardinality member in set index analysis#107373
Conversation
|
cc @Algunenano @Avogar — could you review this? It completes the |
|
Workflow [PR], commit [333de7f] Summary: ✅
AI ReviewSummaryThis PR fixes Final VerdictStatus: ✅ Approve |
d37269f to
41bc5b5
Compare
|
Re-pushed The C++ fix in |
…ant with LowCardinality member A constant of a Variant type that has a LowCardinality member, used as a prepared-set element for an IN predicate over a MergeTree skip index, crashed the server with "Bad cast from type DB::ColumnString to DB::ColumnLowCardinality" in KeyCondition::applyDeterministicDagToColumn (via tryPrepareSetIndexForIn -> tryPrepareSetColumnsForIndex). The set element column is materialized with IColumn::convertToFullIfNeeded. For a Const(Variant(..., LowCardinality(String))) it first strips the outer Const, then iterates the unwrapped ColumnVariant's members via forEachSubcolumn and converts each to full, stripping the inner LowCardinality. That bypasses ColumnVariant::convertToFullIfNeeded, which is overridden to a no-op precisely to keep the column consistent with DataTypeVariant (whose members are sorted by name and cannot follow recursive stripping). The set element type keeps the inner LowCardinality, so the Variant->String cast wrapper expected a ColumnLowCardinality member but received a ColumnString and aborted. Fix: in the default IColumn::convertToFullIfNeeded, when an outer wrapper is stripped, delegate to the unwrapped column's own convertToFullIfNeeded so its override is honored, instead of recursing into its subcolumns here. For columns without an override the delegate's default implementation performs the same subcolumn recursion, so behavior is unchanged. This completes the fix from issue ClickHouse#97854, which only protected a top-level ColumnVariant. The delegation is gated on the column actually being an outer wrapper (isConst/isReplicated/isSparse/lowCardinality), not on a pointer difference after the strip chain. ColumnArray::convertToFullColumnIfConst always allocates a fresh ColumnArray even for a plain array, so a pointer-difference guard made any array column delegate to an equivalent fresh array forever (stack overflow on array IN-sets via Set::insertFromColumns). Container columns now fall through to the normal subcolumn recursion as before. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
41bc5b5 to
4e46276
Compare
Pre-PR validation gate (follow-up: ColumnArray recursion fix, commit 4e46276)
Session id: cron:clickhouse-worker-slot-3:20260612-224000 |
|
CI finished on 4e46276. All red checks are pre-existing failures unrelated to this change (this PR only edits
The |
…-lowcardinality-set-index
LLVM Coverage Report
Changed lines: Changed C/C++ lines covered by tests: 9/9 (100.00%) | Lost baseline coverage: none · Uncovered code |
|
CI re-validated on HEAD The |
|
I don't like this solution. And current implementation of |

Related: #97854
Related: #107111
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Fix a
Bad cast from type DB::ColumnString to DB::ColumnLowCardinalitylogical error (server crash in debug builds) when a constant of aVarianttype that has aLowCardinalitymember is used as a set element for anINpredicate over aMergeTreeskip index.Description
Found by the server-side AST fuzzer on master (
Stress test (experimental, serverfuzz, arm_ubsan), commitc20a6032517588c99638787af78a476c8c40b3d0, 2026-06-12; first seen onamd_tsan2026-05-24). STID 4054-600e. Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=c20a6032517588c99638787af78a476c8c40b3d0&name_0=MasterCI&name_1=Stress%20test%20%28experimental%2C%20serverfuzz%2C%20arm_ubsan%29Minimal reproducer (debug build, aborts the server):
Root cause: the set element column is materialized with
IColumn::convertToFullIfNeeded. For aConst(Variant(..., LowCardinality(String)))the default implementation first strips the outerConst, then iterates the unwrappedColumnVariant's members viaforEachSubcolumnand converts each to a full column, stripping the innerLowCardinality. That bypassesColumnVariant::convertToFullIfNeeded, which is overridden to a no-op (issue #97854) precisely to keep the column consistent withDataTypeVariant(whose members are sorted by name and cannot follow recursive stripping). The set element type, processed byrecursiveRemoveLowCardinality, keeps the innerLowCardinality. The resulting column/type mismatch makes theVariant -> Stringcast wrapper expect aColumnLowCardinalitymember but receive aColumnString, aborting inKeyCondition::applyDeterministicDagToColumn(viatryPrepareSetIndexForIn->tryPrepareSetColumnsForIndex).This is the prepared-set sibling of the path fixed in #107111 (which handled the
equals/tuple-literal path inKeyConditionbut could not fix an input that is already column/type-asymmetric on arrival). It also completes the fix for issue #97854, which added theColumnVariant/ColumnDynamicoverride but only protected a top-levelColumnVariant, not one reached after unwrapping an outerConst(orTuple/Array/etc.).Fix: in the default
IColumn::convertToFullIfNeeded, when an outer wrapper (Const/Replicated/Sparse/LowCardinality) is stripped, delegate to the unwrapped column's ownconvertToFullIfNeededso its override is honored, instead of recursing into its subcolumns here. For columns without an override the delegate's default implementation performs the same subcolumn recursion, so behavior is unchanged.Regression test
06673_set_index_variant_low_cardinalityadded: crashes the server on the pre-fix binary, passes with the fix. Verified the existingVariant/LowCardinality/set-index test families (including04329,03991,03987,04141) and a broader sweep ofconvertToFullIfNeededconsumers still pass with no behavioral change.