Respect date_time_overflow_behavior for numeric casts to Date/Date32/DateTime/Time#109299
Respect date_time_overflow_behavior for numeric casts to Date/Date32/DateTime/Time#109299groeneai wants to merge 10 commits into
Conversation
…DateTime/Time The numeric->Date/Date32/DateTime/Time transforms in ConvertImpl were instantiated with the compile-time constant default_date_time_overflow_behavior (Ignore) instead of the runtime date_time_overflow_behavior template parameter that FunctionConvert::executeImpl already threads in via its overflow-mode switch. This made the throw and saturate paths dead code: out-of-range values were silently clamped/passed through even with date_time_overflow_behavior='throw'. Thread the runtime parameter through the nine affected transform instantiations. Enabling the previously-uninstantiated Throw/Saturate template paths surfaced latent -Werror issues in their overflow checks (float-vs-integer-constant comparisons rounding the bound, and an unsigned-vs-signed comparison); the comparisons are now done in the double domain (the 32-bit-target bounds are exact as double) and in the FromType domain respectively. Default behavior (ignore) is unchanged; only the opt-in throw/saturate modes start working. The numeric->DateTime64/Time64 casts are handled separately in PR ClickHouse#101512. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
cc @Algunenano (assignee of #101131, requested by @PedroTadim) and @rschu1ze for review. This fixes the numeric-> |
|
Workflow [PR], commit [34a2cd5] Summary: ⏳
AI ReviewSummaryThis PR correctly wires Findings❌ Blockers
Tests
Final VerdictStatus:
|
…rflow Making date_time_overflow_behavior='throw' reachable for numeric casts also made the exception path reachable for float sources. That path formatted the rejected value with static_cast<Int64>(from), which is undefined behavior when from is a huge Float64, Inf, or NaN (it produced INT64_MIN garbage in the message). Add a small formatOutOfBoundsValue helper that widens floating sources to double and keeps integral sources as their own type, and use it in the Date and Date32 throw paths. NaN also bypasses every range comparison (all comparisons are false for NaN), so in throw mode it fell through to a narrowing static_cast and silently returned a garbage result instead of raising. Add an explicit throw-mode NaN guard for Date, DateTime and Time (Date32 already had one). Only the throw path for floating sources changes; ignore and saturate modes and integral sources are unchanged. Extend 04401 with huge Float64, Float32, Inf and NaN inputs under date_time_overflow_behavior='throw' for Date, Date32, DateTime and Time. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Follow-up commit 27fad8c addresses the bot's UB flag on the newly-reachable Pre-PR validation gate (click to expand)
Session id: cron:clickhouse-worker-slot-3:20260703-151400 |
The saturate/ignore paths for numeric casts to Date/Date32/DateTime/Time clamped with std::min AFTER narrowing the source to time_t. A source above INT64_MAX (UInt64/UInt128/UInt256) wrapped to a small or negative time_t (e.g. UInt64::max became -1), so CAST(18446744073709551615, 'Time') SETTINGS date_time_overflow_behavior='saturate' returned -00:00:01 and CAST(9223372036854775813::UInt64, 'DateTime') returned 1970 instead of the saturated maximum. A huge or infinite float hit the same narrowing, which is undefined behavior. Add a saturateToRange helper that compares the source against the bounds with accurate::greaterOp/lessOp (full precision across signedness and float) and only performs static_cast<time_t> once the value is proven to be in range. Apply it to every newly reachable saturate/ignore path: UInt64 -> DateTime/Time, the signed Int64/float -> DateTime/Time transforms, and the numeric -> Date/Date32 transforms (which also take wide unsigned sources). Mirrors the Time64 fix in 04050_time64_cast_uint64_no_wrap. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pre-PR validation gate (click to expand)
Session id: cron:clickhouse-worker-slot-3:20260703-163800 |
The Float* -> Date branch (ToDateTransformFromSecondsOrDays) skipped the day-num/timestamp split for NaN, because both `from < 0` and `from > DATE_LUT_MAX_DAY_NUM` are false for NaN, then fell through to `static_cast<UInt16>(from)`, which is undefined behavior for NaN. The sibling Date32 / DateTime / Time paths already handle this (Date32 has an explicit NaN guard, DateTime/Time clamp NaN through saturateToRange). Extend the existing throw-only NaN guard on the Date path to all modes: throw raises, saturate/ignore clamp to the minimum (1970-01-01), matching the siblings. Extends 04401 with NaN -> Date/Date32/DateTime/Time regressions in saturate and ignore modes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pre-PR validation gate (click to expand)
Session id: cron:clickhouse-worker-slot-1:20260703-190500 |
ToTimeTransformSigned (Int8/Int16/Int32 -> Time) stored the raw value and ignored the overflow setting, so an out-of-range Int32 stayed verbatim (toInt32(CAST(4000000::Int32, 'Time')) == 4000000) while the Int64/UInt64/float paths threw or clamped at 3599999. Give it the same bound handling as ToTimeTransform64Signed: throw raises VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE, saturate/ignore clamp to [-MAX_TIME_TIMESTAMP, MAX_TIME_TIMESTAMP]. Int8/Int16 always fit, so the range check is compiled out for them. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Addressed the Pre-PR validation gate (click to expand)
Session id: cron:clickhouse-worker-slot-2:20260703-195400 |
…me casts UInt32 and the wide integer types (Int128/UInt128/Int256/UInt256, plus BFloat16) were missing from the numeric to DateTime/Time dispatch block, so they fell through to convertNumericGeneral, which ignores date_time_overflow_behavior and truncates the value. As a result CAST(4000000::UInt32, 'Time') SETTINGS date_time_overflow_behavior='throw' kept 4000000 instead of raising, and a wide value such as UInt128::max truncated or wrapped negative on the generic path instead of saturating or throwing. Add these widths to the unsigned branch (ToDateTimeTransform64/ToTimeTransform64) and the signed/float branch (ToDateTimeTransform64Signed/ToTimeTransform64Signed). UInt8/UInt16 always fit both targets and stay on the generic path; UInt32 only overflows Time (its max equals the DateTime max exactly). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pre-PR validation gate (click to expand)
Session id: cron:clickhouse-worker-slot-1:20260703-204900 |
The 'throw' section relied on the old buggy behavior where a numeric cast toTime(9999999) with date_time_overflow_behavior='throw' silently stored the raw out-of-range value instead of raising. ClickHouse#109299 makes numeric->Time casts respect the setting, so that cast now throws (in throw mode) or clamps to the max visible Time (in saturate/ignore mode). Rework the section so it still verifies the original intent: a Time column can internally hold a value beyond the visible range, and Date+Time uses that internal value, so two identically-printed Times can produce different DateTime results. The demonstration now uses a Time column (which still carries the internal value) instead of the numeric cast, and the numeric-cast overflow is asserted under the new correct semantics (throw raises; saturate/ignore clamp to 3599999). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Pushed The Pre-PR validation gate (click to expand)
Session id: cron:clickhouse-worker-slot-3:20260703-222500 |
…eTime/Time The VALUES/INSERT expression path (and IN-clause set construction and string-comparison coercion) goes through convertFieldToType, whose DateTime/Time branches stored the raw numeric value regardless of date_time_overflow_behavior. This made numeric coercion inconsistent with toDateTime/toTime/CAST, which already honour the setting. Route the DateTime (UInt64/Int64 -> [0, 0xFFFFFFFF]) and Time (UInt64/Int64 -> [-3599999, 3599999]) Field branches through a helper that throws in 'throw' mode and clamps in 'saturate'/'ignore' mode, mirroring the cast transforms. Date/Date32 already range-check via convertNumericType. Plain integer literals in VALUES take the text serializer path instead (shared by all input formats, pre-existing), which is left unchanged here. Update 04092_date_plus_time so it no longer relies on the raw-store behaviour and extend 04401 with numeric VALUES expression regressions. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pre-PR validation gate (click to expand)
Scope note: plain integer literals in Session id: cron:clickhouse-worker-slot-1:20260703-233700 |
…ric-date-casts Resolve conflict in src/Interpreters/convertFieldToType.cpp between this PR's date_time_overflow_behavior-aware numeric Field coercion and master's fix for STID 3993-36c1 (LOGICAL_ERROR on OPTIMIZE PARTITION for a Time-typed key, commits bb7f73c / 3701b73). For the Time branch, route the value through convertNumericType<Int32> first, so the result is the canonical Int64 Field that Time part loading produces (keeps getPartitionIDFromQuery from raising a LOGICAL_ERROR, and rejects values outside the Int32 storage range with ARGUMENT_OUT_OF_BOUND), then apply date_time_overflow_behavior against the visible Time range. Make coerceNumericFieldToDateTimeOrTime return the target's canonical Field representation (Int64 for Time, UInt64 for DateTime) for both in-range and clamped values so a clamped partition value still compares equal on read-back. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Fast test failed on 04401 and 04092: an out-of-range numeric VALUES
expression such as INSERT INTO t VALUES (9999999 + 0) into a Time/DateTime
column can surface the VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE error either as a
server error or as a client error, depending on async_insert (the client
parses VALUES data locally). The tests asserted { serverError ... }, which
does not match when the overflow is raised client-side, so the runner
reported "Expected server error code '321' but got no server error".
Switch those INSERT ... VALUES markers to { error ... }, which accepts an
error from either side (same approach as 00700_decimal_bounds). Pure SELECT
and CAST assertions keep { serverError } since they run entirely server-side.
Regenerate the 04092 reference for the echo-mode text change.
Also drop the const on the local time_field in convertFieldToType so it can
be moved on return, fixing the arm_tidy performance-no-automatic-move error
introduced by the master merge resolution.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Follow-up commit Fast test root cause: an out-of-range numeric VALUES expression such as arm_tidy root cause: the master-merge resolution introduced Pre-PR validation gate (click to expand)
Session id: cron:clickhouse-worker-slot-0:20260704-101700 |
There was a problem hiding this comment.
This helper only fixes the Int64/UInt64 slice of convertFieldToType, so the shared VALUES/IN coercion path is still inconsistent with the overflow-aware CAST path. evaluateConstantExpression can hand this function Float64, UInt128/UInt256, or Int128/Int256 fields, and those still miss both branches here and end in TYPE_MISMATCH instead of clamping or raising VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE. The sibling Date/Date32 branches below are unchanged as well, so INSERT INTO d VALUES (99999999999 + 0) still rejects and INSERT INTO d32 VALUES (999999999999 + 0) can still narrow through ColumnVector<Int32>::insert instead of respecting date_time_overflow_behavior.
Because ValuesBlockInputFormat, IN-set construction, and string comparison all share convertFieldToType, this PR still leaves a real split between numeric CAST and numeric constant coercion. These branches need the same overflow-aware conversion for all four targets, plus 04401 coverage for Date/Date32 and non-Int64/UInt64 DateTime/Time expressions.
There was a problem hiding this comment.
Thanks. I checked each case on the current head (34a2cd5); it splits three ways.
VALUES expressions (your ... + 0 reproducers) are already correct in all modes because they go through the analyzer's _CAST, which this PR made overflow-aware:
INSERT INTO d VALUES (99999999999 + 0)->2106-02-07(ignore/saturate), raisesVALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE(throw)INSERT INTO d32 VALUES (999999999999 + 0)->2299-12-31(ignore/saturate), raises (throw)Float64/UInt128expr ->DateTime/Timelikewise raise (throw) / clamp (saturate/ignore)
So the Date/Date32 cases here do not reject or narrow; they clamp/throw via the cast path.
The real gap is the IN-set / constant-coercion path (SetUtils, strict=true) for the non-Int64/UInt64 fields you list (Float64, UInt128/UInt256, Int128/Int256) which end in TYPE_MISMATCH. That is real, but it is pre-existing and orthogonal to this PR's contract: these branches only ever accepted UInt64/Int64, so even an in-range SELECT toDateTime(5) IN (5.0::Float64) throws TYPE_MISMATCH on master too. This PR neither introduced nor widened it.
Extending it is not a mechanical mirror of the Int64 path, because IN-set construction runs with strict=true, whose purpose is to exclude lossy/out-of-range members (see the Decimal handling). A clamp-based extension produces false positives: SELECT toDateTime(4294967295) IN (99999999999) already returns 1 on this branch, because the out-of-range literal clamps onto the boundary and compares equal (before this PR the branch returned the raw value, so that comparison was false). Doing float/wide-int "right" means choosing clamp-vs-exclude under strict, and ideally revisiting the same for the Int64 path, which shifts IN-membership semantics more broadly.
I would prefer not to change strict IN-membership semantics inside this overflow-behavior PR without a maintainer's call, since that is a separate, backward-compat-sensitive behavior. Happy to implement it here or as a follow-up per the preferred scope. This PR's stated scope is the numeric CAST/toX path (the DateTime64/Time64 half is #101512); the convertFieldToType change already added here closes the VALUES-expression side that the test would otherwise codify.

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Fixed
date_time_overflow_behaviorbeing silently ignored when casting integer and float values toDate,Date32,DateTimeandTime. Out-of-range values now raiseVALUE_IS_OUT_OF_RANGE_OF_DATA_TYPEinthrowmode and clamp to the target boundary insaturatemode, instead of always being clamped/passed through regardless of the setting.Description
Fixes part of #101131.
FunctionConvert::executeImplalready selects the overflow mode at runtime and passes it toConvertImplas thedate_time_overflow_behaviortemplate parameter (via theswitch (settings.date_time_overflow_behavior)that instantiates theThrow/Ignore/Saturatevariants). However, the numeric->Date/Date32/DateTime/Timebranches insideConvertImpl::executeinstantiated their transforms with the compile-time constantdefault_date_time_overflow_behavior(which isIgnore) instead of that template parameter. TheThrowandSaturatecode paths in those transforms were therefore never instantiated:date_time_overflow_behavior='throw'did not throw, and'saturate'matched the legacyignoreclamping.This threads the runtime parameter through the nine affected instantiations. The sibling
Date<->DateTimeandDateTime64->Date/DateTime/Timebranches already used the template parameter and are unchanged.Enabling the previously-uninstantiated
Throw/Saturatetemplate paths surfaced latent-Werrorissues in their overflow checks that had never been compiled:-Wimplicit-const-int-float-conversion: afloat/doublesource compared against the integer boundMAX_DATETIME_TIMESTAMP/MAX_DATETIME64_TIMESTAMP. These bounds are exact asdouble, so the comparison is now done in thedoubledomain.-Wsign-compare: an unsigned source compared against the signedMAX_TIME_TIMESTAMPand its negative. The comparison is now done in theFromTypedomain, and the negative bound only applies to signed sources.Backward compatibility: the default
ignoremode is byte-for-byte unchanged (verified against existing tests02900_date_time_check_overflow,04050_time64_cast_uint64_no_wrap,03271_date_to_datetime_saturation,02462_number_to_datetype,01734_datetime64_from_float); only the opt-inthrow/saturatemodes start behaving as documented. No setting default changed, soSettingsChangesHistory.cppis not touched.The numeric->
DateTime64/Time64casts (the DateTime64/Time64 half of #101131) are handled separately in #101512, and are intentionally left untouched here.CI failure provenance: reported via ClickGap automated review in #101131 (found while testing fuzzers). Reproducer:
SET date_time_overflow_behavior='throw'; SELECT CAST(99999999999::Int64, 'DateTime');returned2106-02-07 06:28:15instead of raisingVALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE.