Fix abort casting Array(Dynamic)/Array(Variant) to QBit with accurateCastOrNull by groeneai · Pull Request #108288 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix abort casting Array(Dynamic)/Array(Variant) to QBit with accurateCastOrNull#108288

Merged
alexey-milovidov merged 2 commits into
ClickHouse:masterfrom
groeneai:fix-qbit-dynamic-accurate-cast-null
Jun 23, 2026
Merged

Fix abort casting Array(Dynamic)/Array(Variant) to QBit with accurateCastOrNull#108288
alexey-milovidov merged 2 commits into
ClickHouse:masterfrom
groeneai:fix-qbit-dynamic-accurate-cast-null

Conversation

@groeneai

@groeneai groeneai commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Related: #107951 (AST fuzzer, site 13)

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix a server abort (Logical error in IColumn::insertFrom) when casting an Array(Dynamic) or Array(Variant) to QBit with accurateCastOrNull, e.g. accurateCastOrNull(CAST(range(114), 'Array(Dynamic)'), 'QBit(Float32, 114)').

Description

accurateCastOrNull(arr, 'QBit(...)') aborted in IColumn::insertFrom (assertTypeEquality) when the source array element type was Dynamic or Variant.

Root cause: under accurateOrNull, the per-variant element conversion from Dynamic/Variant returns a Nullable column (overflow is reported as NULL), but createArrayToQBitWrapper drove that conversion with a non-nullable QBit element target. ConvertImplFromDynamicToColumn::execute then built a non-nullable result column and called res->insertFrom() on the Nullable per-variant column, failing the structural type-equality assertion. QBit is not rejected by validateNestedTypesForAccurateCastOrNull (it reports canBeInsideNullable() == true, unlike Array/Map), so the cast reaches the wrapper instead of erroring out early.

Fix: when the source nested type is Dynamic/Variant and cast_type is accurateOrNull, request a Nullable nested target so the conversion assembles a consistent column; the existing removeNullable() then strips it before bit transposition (QBit elements are non-nullable). Covers both Dynamic and Variant sources.

Found by the AST fuzzer (issue #107951, site 13). The crash reproduces via plain SQL on master HEAD; the analyzer constant-folds the cast and aborts during executeDryRunImpl.

CI report (STID 2508-50d4): https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=107454&sha=223bf5ceb23779637a985737f414a20dbbae7a7a&name_0=PR&name_1=Stress%20test%20%28arm_asan_ubsan%29

Version info

  • Merged into: 26.6.1.1188

…CastOrNull

accurateCastOrNull(arr, 'QBit(...)') aborted (assertTypeEquality in
IColumn::insertFrom) when the source array element type was Dynamic or
Variant, e.g.

    SELECT accurateCastOrNull(CAST(range(114), 'Array(Dynamic)'), 'QBit(Float32, 114)')

Under accurateOrNull, the per-variant element conversion from Dynamic/Variant
returns a Nullable column (overflow is reported as NULL), but
createArrayToQBitWrapper drove that conversion with a non-nullable QBit element
target. ConvertImplFromDynamicToColumn::execute then built a non-nullable result
column and called res->insertFrom() on the Nullable per-variant column, failing
the structural type-equality assertion.

QBit is not rejected by validateNestedTypesForAccurateCastOrNull (it reports
canBeInsideNullable() == true), unlike Array/Map, so the cast reaches the
wrapper instead of erroring early.

Fix: when the source nested type is Dynamic/Variant and cast_type is
accurateOrNull, request a Nullable nested target so the conversion assembles a
consistent column; the existing removeNullable() then strips it before bit
transposition (QBit elements are non-nullable). Covers both Dynamic and Variant
sources.

Found by the AST fuzzer (issue ClickHouse#107951, site 13).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

Copy link
Copy Markdown
Contributor Author

@groeneai

Copy link
Copy Markdown
Contributor Author

cc @rschu1ze @Avogar for review. Small fix in the CAST machinery (createArrayToQBitWrapper): accurateCastOrNull(Array(Dynamic|Variant), 'QBit(...)') aborted because the per-variant element conversion returns Nullable under accurateOrNull while the QBit element target was non-nullable. Found by the AST fuzzer (#107951, site 13); reproduces via plain SQL on master. @Avogar since the root cause is the Dynamic/Variant element conversion path.

@clickhouse-gh

clickhouse-gh Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [82126ec]

Summary:


AI Review

Summary

This PR fixes accurateCastOrNull from Array(Dynamic) / Array(Variant) to QBit by requesting a nullable nested conversion result, preserving element conversion failures as a per-row null map before transposing to QBit, and adding focused stateless coverage for the original exception and the null-propagation edge cases. I did not find remaining correctness issues in the current diff.

Missing context / blind spots
  • ⚠️ I did not run the stateless test locally. The S3 PR report available from the bot comment currently contains no finished test results; a passing 03380_accurate_cast_or_null_qbit run or the PR functional test job would close this gap.
Final Verdict

Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Jun 23, 2026
Comment thread src/Functions/FunctionsConversion.cpp
The previous fix added a Nullable nested target so the Dynamic/Variant to
QBit element conversion assembles a consistent column, then stripped it with
removeNullable() before bit transposition (QBit elements are non-nullable).
That dropped the per-element null map: a nested element that fails accurate
conversion was marked NULL inside the nested column, but removeNullable()
exposed its default value and the QBit row came back non-NULL. That violates
accurateCastOrNull's contract (an inaccurate element must make the whole outer
value NULL, as createTupleWrapper does for tuple elements).

Aggregate the nested element null map per array row (any NULL element makes the
row NULL) before stripping it, and wrap the returned QBit column with that
row-level null map. wrapInNullable in prepareRemoveNullable then merges it with
the source null map. QBit elements cannot hold NULL, so this mirrors
createTupleWrapper's non-Nullable-target branch. Covers both Dynamic and Variant
sources.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

Copy link
Copy Markdown
Contributor Author

Pre-PR validation gate

# Question Answer
a Deterministic repro? Yes. accurateCastOrNull(['abc','1','2']::Array(Dynamic), 'QBit(Float32, 3)') returned [0,1,2] (non-null) on the prior PR head; same for Array(Variant(String,Int64)), a source-NULL element, and a Float32-precision-overflowing Int64.
b Root cause explained? The prior fix used a Nullable nested target, then removeNullable(converted_nested) stripped the per-element null map before bit transposition. An element that failed accurate conversion was flagged NULL in the nested column, but after stripping its default value was exposed and convertArrayToQBit returned a non-null QBit. The outer wrapInNullable found no source null map (Array source is not Nullable), so it wrapped with an all-false map. The per-element failure was silently lost.
c Fix matches root cause? Yes. Before removeNullable, the inner element null map is aggregated per array row (any NULL element in a row -> row NULL); the QBit column is then wrapped in ColumnNullable with that row null map. wrapInNullable merges it with the source null map. QBit elements are non-nullable, so this mirrors createTupleWrapper's non-Nullable-target branch (any failed element makes the whole value NULL). No symptom band-aid.
d Test intent preserved / new tests added? Existing assertions (abort-free conversion, happy-path output, SIZES_OF_ARRAYS_DONT_MATCH) are unchanged. Added 6 regression cases to 03380_accurate_cast_or_null_qbit asserting NULL for an inaccurate/NULL/overflowing element (Dynamic and Variant), non-NULL for a fully convertible array, and per-row correctness.
e Both directions demonstrated? Yes. Prior head: bug cases returned [0,1,2] (IS NULL = 0). With the fix: all return NULL; the fully-convertible array stays [1,2,3]; multi-row marks only the failing rows NULL. The full test passes via clickhouse-test (20/20 with randomization).
f Fix is general, not a narrow patch? Covers both sibling source paths (Dynamic and Variant) and all three QBit element sizes (BFloat16/Float32/Float64, all routed through this one templated wrapper). The fix is at the point where the null map is produced and lost, not a guard at the crash site. Tuple targets already handle this via createTupleWrapper; Array/Map targets are rejected earlier by validateNestedTypesForAccurateCastOrNull (cannot be inside Nullable), so QBit was the remaining nested-target path.

Session id: cron:clickhouse-worker-slot-4:20260623-124200

@clickhouse-gh

clickhouse-gh Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

📊 Cloud Performance Report

✅ AI verdict: no_change — no significant changes across 38 queries analysed

no significant changes detected. K_source=6 K_base=30 flagged=0/65

clickbench

🟢 No significant changes

tpch_adapted_1_official

🟢 No significant changes

Debug info
  • StressHouse run: 3da39477-73d4-48b3-93df-2fa0594351e3
  • MIRAI run: b12c2131-168c-45ec-9144-1542f42f0223
  • PR check IDs:
    • clickbench_151011_1782231555
    • clickbench_151017_1782231555
    • clickbench_151028_1782231556
    • tpch_adapted_1_official_151035_1782231555
    • tpch_adapted_1_official_151046_1782231556
    • tpch_adapted_1_official_151059_1782231556

@clickhouse-gh

clickhouse-gh Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.40% 85.30% -0.10%
Functions 92.50% 92.50% +0.00%
Branches 77.60% 77.60% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 37/37 (100.00%) | Lost baseline coverage: none · Uncovered code

Full report · Diff report

@alexey-milovidov alexey-milovidov self-assigned this Jun 23, 2026
@alexey-milovidov alexey-milovidov added this pull request to the merge queue Jun 23, 2026
Merged via the queue into ClickHouse:master with commit 419eea3 Jun 23, 2026
328 of 330 checks passed
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 24, 2026
alexey-milovidov added a commit that referenced this pull request Jun 24, 2026
Add changelog entries for PRs that landed on the 26.6 release line after
the first pass: #82414, #108042 (New Feature), #107428 (Improvement),
and #108128, #108288, #100205, #108029 (Bug Fix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants