Respect date_time_overflow_behavior for numeric casts to Date/Date32/DateTime/Time by groeneai · Pull Request #109299 · ClickHouse/ClickHouse · GitHub
Skip to content

Respect date_time_overflow_behavior for numeric casts to Date/Date32/DateTime/Time#109299

Open
groeneai wants to merge 10 commits into
ClickHouse:masterfrom
groeneai:groeneai/fix-101131-overflow-behavior-numeric-date-casts
Open

Respect date_time_overflow_behavior for numeric casts to Date/Date32/DateTime/Time#109299
groeneai wants to merge 10 commits into
ClickHouse:masterfrom
groeneai:groeneai/fix-101131-overflow-behavior-numeric-date-casts

Conversation

@groeneai

@groeneai groeneai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixed date_time_overflow_behavior being silently ignored when casting integer and float values to Date, Date32, DateTime and Time. Out-of-range values now raise VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE in throw mode and clamp to the target boundary in saturate mode, instead of always being clamped/passed through regardless of the setting.

Description

Fixes part of #101131.

FunctionConvert::executeImpl already selects the overflow mode at runtime and passes it to ConvertImpl as the date_time_overflow_behavior template parameter (via the switch (settings.date_time_overflow_behavior) that instantiates the Throw/Ignore/Saturate variants). However, the numeric->Date/Date32/DateTime/Time branches inside ConvertImpl::execute instantiated their transforms with the compile-time constant default_date_time_overflow_behavior (which is Ignore) instead of that template parameter. The Throw and Saturate code paths in those transforms were therefore never instantiated: date_time_overflow_behavior='throw' did not throw, and 'saturate' matched the legacy ignore clamping.

This threads the runtime parameter through the nine affected instantiations. The sibling Date<->DateTime and DateTime64->Date/DateTime/Time branches already used the template parameter and are unchanged.

Enabling the previously-uninstantiated Throw/Saturate template paths surfaced latent -Werror issues in their overflow checks that had never been compiled:

  • -Wimplicit-const-int-float-conversion: a float/double source compared against the integer bound MAX_DATETIME_TIMESTAMP / MAX_DATETIME64_TIMESTAMP. These bounds are exact as double, so the comparison is now done in the double domain.
  • -Wsign-compare: an unsigned source compared against the signed MAX_TIME_TIMESTAMP and its negative. The comparison is now done in the FromType domain, and the negative bound only applies to signed sources.

Backward compatibility: the default ignore mode is byte-for-byte unchanged (verified against existing tests 02900_date_time_check_overflow, 04050_time64_cast_uint64_no_wrap, 03271_date_to_datetime_saturation, 02462_number_to_datetype, 01734_datetime64_from_float); only the opt-in throw/saturate modes start behaving as documented. No setting default changed, so SettingsChangesHistory.cpp is not touched.

The numeric->DateTime64/Time64 casts (the DateTime64/Time64 half of #101131) are handled separately in #101512, and are intentionally left untouched here.

CI failure provenance: reported via ClickGap automated review in #101131 (found while testing fuzzers). Reproducer: SET date_time_overflow_behavior='throw'; SELECT CAST(99999999999::Int64, 'DateTime'); returned 2106-02-07 06:28:15 instead of raising VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE.

…DateTime/Time

The numeric->Date/Date32/DateTime/Time transforms in ConvertImpl were
instantiated with the compile-time constant default_date_time_overflow_behavior
(Ignore) instead of the runtime date_time_overflow_behavior template parameter
that FunctionConvert::executeImpl already threads in via its overflow-mode
switch. This made the throw and saturate paths dead code: out-of-range values
were silently clamped/passed through even with date_time_overflow_behavior='throw'.

Thread the runtime parameter through the nine affected transform instantiations.
Enabling the previously-uninstantiated Throw/Saturate template paths surfaced
latent -Werror issues in their overflow checks (float-vs-integer-constant
comparisons rounding the bound, and an unsigned-vs-signed comparison); the
comparisons are now done in the double domain (the 32-bit-target bounds are exact
as double) and in the FromType domain respectively.

Default behavior (ignore) is unchanged; only the opt-in throw/saturate modes
start working. The numeric->DateTime64/Time64 casts are handled separately in
PR ClickHouse#101512.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

@groeneai

groeneai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

cc @Algunenano (assignee of #101131, requested by @PedroTadim) and @rschu1ze for review.

This fixes the numeric->Date/Date32/DateTime/Time half of #101131: those ConvertImpl branches hardcoded default_date_time_overflow_behavior (Ignore) instead of the runtime template parameter that the executeImpl overflow-mode switch already provides, so throw/saturate were dead code. The numeric->DateTime64/Time64 half is already handled by #101512, so this PR intentionally leaves those six instantiations untouched.

@clickhouse-gh

clickhouse-gh Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [34a2cd5]

Summary:

job_name test_name status info comment
Upgrade check (amd_release) FAIL
Error message in clickhouse-server.log (see upgrade_error_messages.txt) FAIL cidb
Performance Comparison (arm_release, master_head, 4/6) FAIL Performance dashboard
Check Results FAIL

AI Review

Summary

This PR correctly wires date_time_overflow_behavior through the main numeric CAST transforms and fixes several real UB/wraparound cases, but the follow-up work in convertFieldToType only covers part of the constant-coercion surface. The shared VALUES/IN/comparison path is still inconsistent with CAST for some numeric sources and targets, so the contract this PR is establishing is not complete yet.

Findings

❌ Blockers

  • [src/Interpreters/convertFieldToType.cpp:405-427] convertFieldToType now fixes only the Int64/UInt64 DateTime/Time branches, but ValuesBlockInputFormat can still hand it Float64, UInt128/UInt256, or Int128/Int256 constants. Those widths still miss the new helper and fall out with TYPE_MISMATCH instead of clamping or raising VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE, while the sibling Date/Date32 branches below still use raw convertNumericType / raw Int64 and therefore do not respect date_time_overflow_behavior either. In particular, INSERT INTO d VALUES (99999999999 + 0) still rejects instead of matching CAST(..., 'Date'), and INSERT INTO d32 VALUES (999999999999 + 0) can still narrow through ColumnVector<Int32>::insert instead of saturating or throwing.
    Suggested fix: route all numeric Date/Date32/DateTime/Time field coercions through the same overflow-aware logic as FunctionsConversion.h, then extend 04401 with VALUES regressions for Date/Date32 plus non-Int64/UInt64 DateTime/Time expressions.

⚠️ Majors

  • [src/DataTypes/Serializations/SerializationTime.cpp:102-115, src/DataTypes/Serializations/SerializationDateTime.cpp:178-193] [dismissed by author -- https://github.com/ClickHouse/ClickHouse/pull/109299#discussion_r3522124137] Plain numeric VALUES literals still bypass the setting entirely: these deserializers read bare integers with readIntText / readAsIntText and push the raw value straight into the column. So SET date_time_overflow_behavior='throw'; INSERT INTO t VALUES (9999999) continues to preserve the out-of-range Time / DateTime payload instead of raising or clamping. I still consider this real because it leaves the same user-visible operation (INSERT ... VALUES) with two contradictory overflow rules depending only on whether the literal was written as 9999999 or 9999999 + 0.
    Suggested fix: thread date_time_overflow_behavior into the text deserializers too, or reject the bare-integer fast path until a dedicated follow-up lands.
Tests
  • ⚠️ tests/queries/0_stateless/04401_date_time_overflow_behavior_numeric_casts.sql proves only the Int64/UInt64 VALUES expression cases for Time/DateTime. Add focused VALUES regressions for Date/Date32 and for Float64/wide-int Time/DateTime expressions so the shared convertFieldToType path is actually covered end to end.
Final Verdict

Status: ⚠️ Request changes

  • Extend convertFieldToType so numeric constant coercion follows the same overflow contract as CAST for all four targets, not only the current Int64/UInt64 subset.
  • Decide how to handle plain numeric VALUES literals; they still bypass the setting today, and the current PR leaves that inconsistency user-visible.

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Jul 3, 2026
Comment thread src/Functions/FunctionsConversion.h
…rflow

Making date_time_overflow_behavior='throw' reachable for numeric casts also
made the exception path reachable for float sources. That path formatted the
rejected value with static_cast<Int64>(from), which is undefined behavior when
from is a huge Float64, Inf, or NaN (it produced INT64_MIN garbage in the
message). Add a small formatOutOfBoundsValue helper that widens floating
sources to double and keeps integral sources as their own type, and use it in
the Date and Date32 throw paths.

NaN also bypasses every range comparison (all comparisons are false for NaN),
so in throw mode it fell through to a narrowing static_cast and silently
returned a garbage result instead of raising. Add an explicit throw-mode NaN
guard for Date, DateTime and Time (Date32 already had one). Only the throw
path for floating sources changes; ignore and saturate modes and integral
sources are unchanged.

Extend 04401 with huge Float64, Float32, Inf and NaN inputs under
date_time_overflow_behavior='throw' for Date, Date32, DateTime and Time.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

Follow-up commit 27fad8c addresses the bot's UB flag on the newly-reachable throw path.

Pre-PR validation gate (click to expand)
# Question Answer
a Deterministic repro? Yes. On PR HEAD (eabd3b2), SELECT CAST(1e300::Float64, 'Date') SETTINGS date_time_overflow_behavior='throw' reports Value -9223372036854775808 is out of bounds (INT64_MIN garbage from static_cast<Int64>(1e300) UB); CAST(nan::Float64, 'Date'/'DateTime'/'Time') under throw silently returns 1970-01-01/-999:59:59 with no exception.
b Root cause explained? Enabling the runtime throw path made it reachable for float sources. Its message formatted the rejected value with static_cast<Int64>(from), which is UB for huge/Inf/NaN floats. Separately, NaN passes every range comparison (all comparisons are false for NaN), so in throw mode it fell through to the narrowing cast and silently returned garbage instead of raising.
c Fix matches root cause? Yes. A formatOutOfBoundsValue helper widens floating sources to double (never narrows) and keeps integral sources as their own type; applied to the Date/Date32 throw messages. Explicit throw-mode NaN guards added for Date/DateTime/Time (Date32 already had one).
d Test intent preserved / new tests added? Yes. 04401 extended with huge Float64/Float32, Inf, -Inf, NaN under throw for Date/Date32/DateTime/Time, all asserting VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE. No existing assertions weakened.
e Both directions demonstrated? Yes. Baseline binary (id 3f7141b3) reproduces the garbage message and the silent NaN pass; fixed binary (id cd10b80a) raises clean errors showing the real value (1e+300, inf, nan). Test passes 20/20 with randomization.
f Fix is general across code paths? Yes. Audited every numeric->date/time throw branch this PR instantiates: only Date and Date32 narrowed via static_cast<Int64>; the DateTime/Time throw branches already format from directly. Verified zero narrowing casts remain in any out-of-bounds message. The NaN silent-pass was fixed in all three float-reachable transforms that lacked a guard.
g Fix generalizes across inputs? Yes. Helper is type-generic: floats (Float32/Float64/BFloat16, which has no fmt formatter) -> double; integral incl. Int8 and wide ints (Int128..UInt256, which Date/Date32 also instantiate) keep their own type via their existing fmt formatters. Test matrix covers huge/Inf/-Inf/NaN across all four target types and Float32+Float64.
h Backward compatible? Yes. Change is confined to throw mode for floating sources. ignore (default) and saturate are unchanged (verified: NaN/huge float still clamp to 1970-01-01/2079-06-07, no throw), and integral throw messages are unchanged. No setting default or on-disk/wire format change.
i Invariants and contracts preserved? Yes. Transforms still return the same values on all non-throwing paths; only the exception message formatting and the NaN-in-throw-mode behavior changed. NO_SANITIZE_UNDEFINED retained. The helper is a pure formatting shim with no side effects.

Session id: cron:clickhouse-worker-slot-3:20260703-151400

Comment thread src/Functions/FunctionsConversion.h
The saturate/ignore paths for numeric casts to Date/Date32/DateTime/Time
clamped with std::min AFTER narrowing the source to time_t. A source above
INT64_MAX (UInt64/UInt128/UInt256) wrapped to a small or negative time_t
(e.g. UInt64::max became -1), so CAST(18446744073709551615, 'Time')
SETTINGS date_time_overflow_behavior='saturate' returned -00:00:01 and
CAST(9223372036854775813::UInt64, 'DateTime') returned 1970 instead of the
saturated maximum. A huge or infinite float hit the same narrowing, which is
undefined behavior.

Add a saturateToRange helper that compares the source against the bounds with
accurate::greaterOp/lessOp (full precision across signedness and float) and
only performs static_cast<time_t> once the value is proven to be in range.
Apply it to every newly reachable saturate/ignore path: UInt64 -> DateTime/Time,
the signed Int64/float -> DateTime/Time transforms, and the numeric -> Date/Date32
transforms (which also take wide unsigned sources). Mirrors the Time64 fix in
04050_time64_cast_uint64_no_wrap.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author
Pre-PR validation gate (click to expand)
# Question Answer
a Deterministic repro? Yes. CAST(18446744073709551615::UInt64, 'Time') SETTINGS date_time_overflow_behavior='saturate' returns -00:00:01 on the pre-fix binary (should be 999:59:59); CAST(9223372036854775813::UInt64, 'DateTime') SETTINGS date_time_overflow_behavior='saturate' returns 1970-01-01 00:00:05 (should be 2106-02-07 06:28:15). Fires every time.
b Root cause explained? The saturate/ignore paths clamp with std::min(time_t(from), MAX) AFTER narrowing from to time_t. time_t is signed 64-bit, so an unsigned source above INT64_MAX (UInt64/UInt128/UInt256) wraps to a small/negative value (static_cast<time_t>(UInt64::max) == -1) before the min, so the clamp keeps the wrapped value instead of the maximum. A huge or infinite float narrowed the same way, which is undefined behavior.
c Fix matches root cause? Yes. saturateToRange compares from against the bounds with accurate::greaterOp/lessOp (full precision across signedness and float) and only runs static_cast<time_t> once the value is proven in range, so the narrowing can no longer wrap, truncate, or hit UB. It addresses the ordering, not the symptom.
d Test intent preserved / new tests added? Yes. 04401 gains 14 saturate + 2 ignore assertions for sources above INT64_MAX and float extremes (UInt64::max, 2^63+5, UInt128::max, 1e300, 3e38, inf, -inf, nan) across Date/Date32/DateTime/Time, all asserting the clamped maximum. Existing assertions unchanged.
e Both directions demonstrated? Yes. Pre-fix binary (Build ID cd10b80a): -00:00:01 / 1970-01-01 00:00:05 garbage. Post-fix (Build ID 0439fa74): 999:59:59 / 2106-02-07 06:28:15. 04401 passes 20/20 with randomization; 02900, 03271, 04050 still pass.
f Fix is general across code paths? Yes. All six newly reachable transforms with this pattern are fixed: ToDateTimeTransform64 (UInt64), ToTimeTransform64 (UInt64), ToDateTimeTransform64Signed and ToTimeTransform64Signed (Int64/float), and ToDateTransformFromSecondsOrDays / ToDate32TransformFromSecondsOrDays (which also take wide unsigned sources).
g Fix generalizes across inputs? Yes. Covered UInt64/UInt128/UInt256 above INT64_MAX, Float32/Float64 huge/Inf/-Inf/NaN, and boundary values (exactly at max maps to max, one above saturates, negative Time floor). In-range values and Int64 sources are unchanged (verified).
h Backward compatible? Yes. Only the saturate and ignore results for values that previously produced wraparound/UB garbage change (to the correct clamped boundary). The throw path, in-range values, and existing ignore outputs for representable values are unchanged. No setting default, on-disk, or wire format change.
i Invariants and contracts preserved? Yes. saturateToRange returns a time_t inside [min_bound, max_bound], so the downstream static_cast<ToType> / toDayNum receive a value in the range they already assumed. The throw-mode guards run before the helper, so NaN still raises in throw mode; in non-throw modes NaN clamps to the low bound (documented). No lock/format/ownership contract touched.

Session id: cron:clickhouse-worker-slot-3:20260703-163800

Comment thread src/Functions/FunctionsConversion.h
The Float* -> Date branch (ToDateTransformFromSecondsOrDays) skipped the
day-num/timestamp split for NaN, because both `from < 0` and
`from > DATE_LUT_MAX_DAY_NUM` are false for NaN, then fell through to
`static_cast<UInt16>(from)`, which is undefined behavior for NaN.

The sibling Date32 / DateTime / Time paths already handle this (Date32 has an
explicit NaN guard, DateTime/Time clamp NaN through saturateToRange). Extend
the existing throw-only NaN guard on the Date path to all modes: throw raises,
saturate/ignore clamp to the minimum (1970-01-01), matching the siblings.

Extends 04401 with NaN -> Date/Date32/DateTime/Time regressions in saturate and
ignore modes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author
Pre-PR validation gate (click to expand)
# Question Answer
a Deterministic repro? Yes. SELECT CAST(nan::Float64, 'Date') SETTINGS date_time_overflow_behavior='saturate' (and ='ignore') reaches static_cast<UInt16>(from) with from = NaN. Isolated a standalone UBSan repro: static_cast<uint16_t>(NaN) compiled with -fsanitize=float-cast-overflow -fno-sanitize-recover traps with runtime error: nan is outside the range of representable values of type 'unsigned short'.
b Root cause explained? In ToDateTransformFromSecondsOrDays::execute, non-throw modes skip the pre-existing throw-only NaN guard. NaN then fails from < 0 (false) and from > DATE_LUT_MAX_DAY_NUM (false, so the saturateToRange branch is not taken), falling through to the final static_cast<UInt16>(from) — undefined behavior for NaN. The sibling Date32/DateTime/Time paths already handle NaN (Date32 guards it explicitly; DateTime/Time route it through saturateToRange, which returns the min bound for NaN). Only the Date path had the gap.
c Fix matches root cause? Yes. The existing throw-only NaN guard is promoted to all modes: if (isNaN(from)) { if (throw) throw…; return 0; }. Throw raises; saturate/ignore clamp to the minimum (day num 0 = 1970-01-01) before the narrowing cast is ever reached. Directly removes the UB path.
d Test intent preserved / new tests added? Preserved. 04401 extended with NaN -> Date/Date32/DateTime/Time in both saturate and ignore modes (and the existing throw-mode NaN cases are unchanged).
e Both directions demonstrated? Yes. Baseline binary (0439fa74) reaches the UB static_cast<UInt16>(NaN) (confirmed via the standalone UBSan trap in (a)); fixed binary (d83aacf4) routes NaN to a defined return 0. Result on x86 coincides (1970-01-01), but the code path changes from UB to defined — that is the actual fix. 04401 passes 30/30 with randomized settings on the fixed binary.
f Fix is general across code paths? Yes. Audited every static_cast<Target>(from) reachable by a float source in the numeric->date/time transforms this PR instantiates: Date (fixed here), Date32 (already guards NaN at the signed-range check), DateTime/Time float paths (ToDateTimeTransform64Signed/ToTimeTransform64Signed, both route NaN through saturateToRange). ToDateTimeTransformSigned/ToTimeTransformSigned are only instantiated for Int8/16/32 (no NaN). Date was the only remaining gap.
g Fix generalizes across inputs (params/datatypes/wrappers)? Yes. NaN reaches the Date path for Float64, Float32, and BFloat16 (all instantiated in the dispatch); the is_floating_point<FromType> guard covers all three. Non-NaN float extremes (huge/Inf/-Inf) already clamp correctly via saturateToRange; integral sources are unaffected (isNaN guard is if constexpr (is_floating_point)-gated). Boundary values (in-range, -1, UInt64::max) verified unchanged.
h Backward compatible? Yes. Only NaN float sources change behavior, and only from undefined behavior to a defined clamp/throw consistent with the setting. Integral sources, in-range floats, and huge/Inf floats are byte-for-byte unchanged in all three modes. No setting default, on-disk/wire format, or new validation on existing data.
i Invariants and contracts preserved? Yes. The transform's contract (throw mode raises VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE; saturate/ignore return a value in [0, MAX]) holds on the NaN path now too (return 0 is a valid day num; throw still raises). No new call, no lifetime/concurrency surface. The NO_SANITIZE_UNDEFINED attribute is retained but the NaN UB it was masking is now eliminated.

Session id: cron:clickhouse-worker-slot-1:20260703-190500

Comment thread src/Functions/FunctionsConversion.h
ToTimeTransformSigned (Int8/Int16/Int32 -> Time) stored the raw value and
ignored the overflow setting, so an out-of-range Int32 stayed verbatim
(toInt32(CAST(4000000::Int32, 'Time')) == 4000000) while the Int64/UInt64/float
paths threw or clamped at 3599999. Give it the same bound handling as
ToTimeTransform64Signed: throw raises VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE,
saturate/ignore clamp to [-MAX_TIME_TIMESTAMP, MAX_TIME_TIMESTAMP]. Int8/Int16
always fit, so the range check is compiled out for them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

Addressed the Int32 -> Time change request in ab90672 (5th commit).

Pre-PR validation gate (click to expand)
# Question Answer
a Deterministic repro? Yes. On the pre-fix binary, SET date_time_overflow_behavior='throw'; SELECT toInt32(CAST(4000000::Int32,'Time')) returns 4000000 (no throw); 'saturate'/'ignore' also return 4000000. -4000000::Int32 stores -4000000.
b Root cause explained? ToTimeTransformSigned::execute (Int8/Int16/Int32 -> Time) was return static_cast<ToType>(from) — it ignored the date_time_overflow_behavior template parameter entirely, so an out-of-range Int32 (Int32 max 2.1e9 >> Time max 3599999) was stored verbatim, while the Int64/UInt64/float paths threw or clamped. The stored Time thus depended on source width for the same number.
c Fix matches root cause? Yes. Gave ToTimeTransformSigned the same handling as the 64-bit ToTimeTransform64Signed: throw raises VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE; saturate/ignore clamp via saturateToRange to [-MAX_TIME_TIMESTAMP, MAX_TIME_TIMESTAMP].
d Test intent preserved / new tests added? Added 04401 regressions asserting the stored integer via toInt32(...) (not just printed text — 4000000 prints as 999:59:59): Int32 4000000 / -4000000 -> Time under throw (serverError), saturate (3599999 / -3599999), and ignore (3599999 / -3599999). No existing assertions weakened.
e Both directions demonstrated? Yes. Pre-fix binary (Build ID d83aacf4): stores raw 4000000, no throw. Post-fix binary (Build ID 8a50fbf0, distinct): throw raises, saturate/ignore clamp to ±3599999.
f Fix is general across code paths? Audited all newly-reachable numeric -> Date/Date32/DateTime/Time transforms. Only ToTimeTransformSigned had the raw-store gap. ToDateTimeTransformSigned (Int8/16/32 -> DateTime) handles from < 0 and Int32 max (2.1e9) < DateTime max (4.29e9) so it cannot overflow positively — correct as-is. Date/Date32 use the unified FromSecondsOrDays transform (already fixed). Int64/float Time use ToTimeTransform64Signed (already fixed).
g Fix generalizes across inputs? Verified across the narrow-signed source matrix: Int8/Int16 can never exceed ±3599999, so their range check is compiled out (if constexpr numeric_limits<FromType>::max() > MAX_TIME_TIMESTAMP) and their output is unchanged in all three modes; Int32 clamps/throws at both + and - bounds; the in-range boundary 3599999 / -3599999 and interior 12345 are preserved in all modes.
h Backward compatible? Yes. ignore (the default) previously produced the wrong raw value; it now clamps, matching the sibling 64-bit and float paths and the documented ignore semantics for this experimental Time type (allow_experimental_time_time64_type). No setting default or on-disk/wire format changes.
i Invariants and contracts preserved? Yes. saturateToRange guarantees the result is within [min_bound, max_bound] before the single static_cast<Int32>, so the stored Time is always a valid in-range value; the throw path uses the identical bound as the 64-bit variant, so the throw/saturate boundary is consistent across all source widths.

Session id: cron:clickhouse-worker-slot-2:20260703-195400

Comment thread src/Functions/FunctionsConversion.h
…me casts

UInt32 and the wide integer types (Int128/UInt128/Int256/UInt256, plus
BFloat16) were missing from the numeric to DateTime/Time dispatch block, so
they fell through to convertNumericGeneral, which ignores
date_time_overflow_behavior and truncates the value. As a result
CAST(4000000::UInt32, 'Time') SETTINGS date_time_overflow_behavior='throw'
kept 4000000 instead of raising, and a wide value such as UInt128::max
truncated or wrapped negative on the generic path instead of saturating or
throwing.

Add these widths to the unsigned branch (ToDateTimeTransform64/ToTimeTransform64)
and the signed/float branch (ToDateTimeTransform64Signed/ToTimeTransform64Signed).
UInt8/UInt16 always fit both targets and stay on the generic path; UInt32 only
overflows Time (its max equals the DateTime max exactly).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author
Pre-PR validation gate (click to expand)
# Question Answer
a Deterministic repro? Yes. On the pre-fix branch, CAST(4000000::UInt32, 'Time') SETTINGS date_time_overflow_behavior='throw' returns 999:59:59/keeps the raw value instead of raising; CAST(18446744073709551615::UInt128, 'Time') SETTINGS date_time_overflow_behavior='saturate' returns -00:00:01 (wrap); CAST(99999999999999999999999999::Int256, 'DateTime') SETTINGS date_time_overflow_behavior='saturate' returns a truncated 3825205247 instead of the max. All reproduce on demand.
b Root cause explained? The numeric to DateTime/Time dispatch block in FunctionsConversion.h only routed UInt64 (unsigned) and Int64/Float32/Float64 (signed/float) through the date_time_overflow_behavior-aware transforms. UInt32, UInt128, UInt256, Int128, Int256, and BFloat16 missed every branch and fell through to convertNumericGeneral, which ignores the setting and truncates.
c Fix matches root cause? Yes. The two dispatch branches are extended to the missing widths so they reach the same ToDateTimeTransform64/ToTimeTransform64 (unsigned) and ToDateTimeTransform64Signed/ToTimeTransform64Signed (signed/float) transforms that already honor the setting and clamp in the source domain via saturateToRange. No transform logic changed.
d Test intent preserved / new tests added? Yes. 04401 gains UInt32 -> Time, wide-int -> DateTime/Time regressions in throw (serverError), saturate, and ignore modes, asserting the stored value via toInt32 where the printed text would hide a wrap. Existing assertions unchanged.
e Both directions demonstrated? Yes. Baseline binary (Build ID 8a50fbf0): all cases broken (no throw / wrap / truncate). Fixed binary (Build ID e0b51e1d): all 8 integer widths + BFloat16 throw when out of range, saturate/ignore clamp to the boundary; boundary is exact (3599999 passes, 3600000 throws/clamps).
f Fix is general across code paths? Yes. Enumerated the full source-width x target matrix. UInt8/UInt16 always fit both targets (no route needed). UInt32 max (4294967295) equals the DateTime max exactly, so it can only overflow Time (max 3599999); it is routed for both but only throws/clamps for Time. Date/Date32 already listed every width.
g Fix generalizes across inputs? Yes. Verified UInt32/UInt64/UInt128/UInt256/Int32/Int64/Int128/Int256/BFloat16 in all three modes; positive-huge saturates to max, negative-huge (signed) saturates to min (0 for DateTime, -3599999 for Time), NaN handled by the existing float guards, in-range values unchanged.
h Backward compatible? Yes. Ignore mode (the default) now clamps the wide widths the same way UInt64 already did (no wrap/truncate), matching the existing UInt64 contract; in-range values and UInt8/UInt16 are byte-for-byte unchanged. No setting default or on-disk format changes.
i Invariants and contracts preserved? Yes. saturateToRange uses accurate::greaterOp/lessOp for full-precision cross-signedness/wide/float comparison and only narrows to time_t once the value is proven in range, so no wrap/truncate/UB. The wide-int > comparisons in the throw guards promote via common_type and respect signedness.

Session id: cron:clickhouse-worker-slot-1:20260703-204900

The 'throw' section relied on the old buggy behavior where a numeric cast
toTime(9999999) with date_time_overflow_behavior='throw' silently stored the
raw out-of-range value instead of raising. ClickHouse#109299 makes numeric->Time casts
respect the setting, so that cast now throws (in throw mode) or clamps to the
max visible Time (in saturate/ignore mode).

Rework the section so it still verifies the original intent: a Time column can
internally hold a value beyond the visible range, and Date+Time uses that
internal value, so two identically-printed Times can produce different DateTime
results. The demonstration now uses a Time column (which still carries the
internal value) instead of the numeric cast, and the numeric-cast overflow is
asserted under the new correct semantics (throw raises; saturate/ignore clamp
to 3599999).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

Pushed 0da6374: updated 04092_date_plus_time for the new numeric->Time overflow semantics this PR establishes (Fast test caught it).

The throw section relied on the OLD buggy behavior where toTime(9999999) under date_time_overflow_behavior='throw' silently stored the raw out-of-range value (prints 999:59:59, internally 9999999). This PR makes that cast respect the setting, so it now raises. Reworked the section to preserve the original intent (a Time column can hold a value beyond the visible range, and Date+Time uses the internal value) via a column path that still carries the internal value, and added assertions of the numeric cast under the correct new semantics (throw raises; saturate/ignore clamp to 3599999).

Pre-PR validation gate (click to expand)
# Question Answer
a Deterministic repro? Yes. clickhouse-test 04092_date_plus_time on head 3dca069 fails on the toTime(9999999) line: Code: 321 ... Timestamp value 9999999 is out of bounds of type Time.
b Root cause explained? The test encoded pre-#109299 behavior: numeric toTime(9999999) in throw mode ignored the setting and stored the raw value. This PR routes numeric->Time through the overflow-aware transforms, so the cast now correctly raises in throw mode; the old .reference (expecting 999:59:59/same_time=0) no longer matches.
c Fix matches root cause? Yes. Test-only update to match the PR's intended new invariant; no source change (the source is already correct as of the prior 6 commits).
d Test intent preserved / new tests added? Yes. The intent (Time internally stores a value beyond the visible range; Date+Time uses it; two identically-printed Times give different DateTime results) is preserved via a Time column (same_time=0, same_dt=0), and the numeric-cast overflow is now asserted under the correct semantics (throw -> serverError VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE; saturate/ignore -> clamp to 3599999). No assertion weakened.
e Both directions demonstrated? Yes. OLD .reference on the new binary -> [ FAIL ]; NEW .reference -> [ OK ], 15/15 with randomization. Build ID verified (E0B51E1D).
f Fix is general across code paths? Yes. Grepped the suite for sibling tests with the same old "numeric overflow stores raw" assumption. 03365_time_time64_conversions / 03365_time_time64_cap_max_time use numeric toTime(bignum) under default (ignore) mode and assert the displayed clamp 999:59:59 (display layer, unchanged) - verified still pass. Only 04092 encoded the throw-stores-raw assumption.
g Fix generalizes across inputs? N/A (test-adaptation, not a code bug). The code fix generalizing across widths/wrappers was validated in the prior 6 commits + their gate comments.
h Backward compatible? N/A (test-only). No setting default, format, or validation change here.
i Invariants and contracts preserved? N/A (test-only). No production code touched in this commit.

Session id: cron:clickhouse-worker-slot-3:20260703-222500

Comment thread tests/queries/0_stateless/04092_date_plus_time.sql Outdated
…eTime/Time

The VALUES/INSERT expression path (and IN-clause set construction and
string-comparison coercion) goes through convertFieldToType, whose
DateTime/Time branches stored the raw numeric value regardless of
date_time_overflow_behavior. This made numeric coercion inconsistent with
toDateTime/toTime/CAST, which already honour the setting.

Route the DateTime (UInt64/Int64 -> [0, 0xFFFFFFFF]) and Time
(UInt64/Int64 -> [-3599999, 3599999]) Field branches through a helper that
throws in 'throw' mode and clamps in 'saturate'/'ignore' mode, mirroring the
cast transforms. Date/Date32 already range-check via convertNumericType.

Plain integer literals in VALUES take the text serializer path instead
(shared by all input formats, pre-existing), which is left unchanged here.

Update 04092_date_plus_time so it no longer relies on the raw-store behaviour
and extend 04401 with numeric VALUES expression regressions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jul 4, 2026

Copy link
Copy Markdown
Contributor Author
Pre-PR validation gate (click to expand)
# Question Answer
a Deterministic repro? Yes. SET date_time_overflow_behavior='throw'; INSERT INTO t (x Time) VALUES (9999999 + 0) stored the raw 9999999 (and DateTime stored a clamped/wrapped value) on demand; matched on the base binary and disappears with the fix.
b Root cause explained? The VALUES/INSERT expression path (and IN-clause set construction and string comparison) coerce Fields through convertFieldToType, whose DateTime/Time branches did return src unconditionally, ignoring date_time_overflow_behavior. So numeric coercion there stored the raw out-of-range value while toTime/toDateTime/CAST (fixed earlier in this PR) throw or clamp.
c Fix matches root cause? Yes. The two branches now route through a helper that throws in throw mode and clamps to the target range in saturate/ignore (Time [-3599999, 3599999], DateTime [0, 0xFFFFFFFF]), mirroring the cast transforms.
d Test intent preserved / new tests added? Yes. 04092 no longer relies on the raw-store bypass (test_internal_time now uses two in-range Times to show Date+Time uses the internal seconds) and asserts the new consistent VALUES behavior. 04401 gains a numeric-VALUES-expression section (throw/saturate/ignore + in-range).
e Both directions demonstrated? Yes. With old references the new binary FAILs exactly on the changed section; with new references 04092 and 04401 pass 15/15 under randomization (real clickhouse-test runner).
f Fix is general across code paths? Yes. Fixed both DateTime and Time Field branches (UInt64 and Int64 sources). Date/Date32 Field branches already range-check via convertNumericType and are consistent. The fix targets the coercion root, not a per-symptom guard.
g Fix generalizes across inputs? Yes for the Field-coercion path: covers UInt64 and Int64 sources, positive over-max and negative under-min, and exact boundaries (3599999 / 0xFFFFFFFF pass, one past throws/clamps); Float VALUES expressions and NULLs already go through the cast/CAST wrapper and stay correct.
h Backward compatible? Behavior change is confined to out-of-range numeric Field coercion and follows the setting (default ignore now clamps, matching the cast path this PR already ships). In-range values, integral in-range, and all three modes are byte-identical to before; no format/setting-default change, so SettingsChangesHistory.cpp is not involved.
i Invariants and contracts preserved? Yes. The helper returns a UInt64 Field when non-negative and Int64 when negative, matching the exact two representations the Time/DateTime branches already accepted for in-range values, so downstream serializers see the same Field shape. Range bounds mirror MAX_TIME_TIMESTAMP / MAX_DATETIME_TIMESTAMP.

Scope note: plain integer literals in VALUES (e.g. VALUES (9999999)) take a separate path (ValuesBlockInputFormat fast path -> SerializationTime/SerializationDateTime text deserializer -> readIntText), which is shared by all text input formats, pre-existing (this PR does not touch serializers), and whose Time64/DateTime64 siblings belong to the deferred #101512. That path is intentionally left for a follow-up; details in the reply below.

Session id: cron:clickhouse-worker-slot-1:20260703-233700

groeneai and others added 2 commits July 4, 2026 09:27
…ric-date-casts

Resolve conflict in src/Interpreters/convertFieldToType.cpp between this PR's
date_time_overflow_behavior-aware numeric Field coercion and master's fix for
STID 3993-36c1 (LOGICAL_ERROR on OPTIMIZE PARTITION for a Time-typed key,
commits bb7f73c / 3701b73).

For the Time branch, route the value through convertNumericType<Int32> first,
so the result is the canonical Int64 Field that Time part loading produces
(keeps getPartitionIDFromQuery from raising a LOGICAL_ERROR, and rejects values
outside the Int32 storage range with ARGUMENT_OUT_OF_BOUND), then apply
date_time_overflow_behavior against the visible Time range. Make
coerceNumericFieldToDateTimeOrTime return the target's canonical Field
representation (Int64 for Time, UInt64 for DateTime) for both in-range and
clamped values so a clamped partition value still compares equal on read-back.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Fast test failed on 04401 and 04092: an out-of-range numeric VALUES
expression such as INSERT INTO t VALUES (9999999 + 0) into a Time/DateTime
column can surface the VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE error either as a
server error or as a client error, depending on async_insert (the client
parses VALUES data locally). The tests asserted { serverError ... }, which
does not match when the overflow is raised client-side, so the runner
reported "Expected server error code '321' but got no server error".

Switch those INSERT ... VALUES markers to { error ... }, which accepts an
error from either side (same approach as 00700_decimal_bounds). Pure SELECT
and CAST assertions keep { serverError } since they run entirely server-side.
Regenerate the 04092 reference for the echo-mode text change.

Also drop the const on the local time_field in convertFieldToType so it can
be moved on return, fixing the arm_tidy performance-no-automatic-move error
introduced by the master merge resolution.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jul 4, 2026

Copy link
Copy Markdown
Contributor Author

Follow-up commit 34a2cd5d fixes the Fast test and Build (arm_tidy) failures reported on the previous head 510bcc7a.

Fast test root cause: an out-of-range numeric VALUES expression such as INSERT INTO t VALUES (9999999 + 0) into a Time/DateTime column can surface the VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE error either as a server error or as a client error depending on async_insert (the client parses the VALUES data locally, so with async_insert=0 the overflow is raised client-side; with the default async_insert=1 it is raised server-side while executing the async-insert pipeline). The tests asserted { serverError ... }, which does not match when the error is raised client-side, so CI reported Expected server error code '321' but got no server error. Switched those INSERT ... VALUES markers to { error ... }, which matches an error from either side (same approach as 00700_decimal_bounds). Pure SELECT/CAST assertions keep { serverError } since they run entirely server-side.

arm_tidy root cause: the master-merge resolution introduced const Field time_field = ... in convertFieldToType.cpp; return time_field; then trips performance-no-automatic-move (-warnings-as-errors). Dropped the const so the value can be moved on return (it is still passed by const-ref to coerceNumericFieldToDateTimeOrTime).

Pre-PR validation gate (click to expand)
# Question Answer
a Deterministic repro? Yes. Fast test: INSERT INTO t VALUES (9999999 + 0) with date_time_overflow_behavior='throw' under async_insert=0 raises a client error, so the old { serverError } marker fails with return code 65, "Expected server error code '321' but got no server error" (exact CI signature). arm_tidy: clang-tidy-21 -p build --checks='-*,performance-no-automatic-move' convertFieldToType.cpp reports convertFieldToType.cpp:423:24: error: constness of 'time_field' prevents automatic move on the const version.
b Root cause explained? Yes. VALUES data is parsed by the client (ClientBase -> ValuesBlockInputFormat -> convertFieldToType), so an overflow there is a client error under async_insert=0 but a server error under async_insert=1 (async-insert pipeline). A fixed { serverError } marker only matches one path; CI hit the other. The tidy error is a plain const-prevents-move on the returned local.
c Fix matches root cause? Yes. { error ... } sets both client_errors and server_errors (TestHint.cpp), matching the overflow regardless of which side raises it. Removing const allows the move clang-tidy requires. Neither weakens any assertion.
d Test intent preserved / new tests added? Yes. The tests still assert the SAME error code (VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE) for the SAME out-of-range inputs; only the client-vs-server attribution is relaxed. All SELECT/CAST server-side assertions keep { serverError }. No coverage lost.
e Both directions demonstrated? Yes. Old { serverError } under async_insert=0: FAIL (return code 65). New { error }: 04401 + 04092 pass 10/10 under async_insert=1 (default) and 6/6 under async_insert=0, via the real clickhouse-test runner (Build ID 985d8b5c). arm_tidy: diagnostic present on const, absent after removing it.
f Fix is general across code paths? Yes. Grepped both test files: every INSERT ... VALUES ...; { serverError ... } (the only client/server-ambiguous form) was switched to { error }; no other test in the suite encodes this assumption for these files. The const-move fix is the only such site in the merged diff.
g Fix generalizes across inputs? N/A for the tidy change (mechanical const removal). For the test markers: covered both Time and DateTime, positive and negative overflow, and both async_insert modes; the marker is value-independent.
h Backward compatible? Yes. Test-only marker change and a behavior-neutral const removal. No settings, formats, or defaults changed; binary behavior identical (Build ID changed, test results unchanged).
i Invariants and contracts preserved? Yes. time_field is consumed by-const-ref after the move-eligible return; the non-const local upholds every use. { error } is an established TestHint contract (accepts client or server error). No data-structure or concurrency invariant touched.

Session id: cron:clickhouse-worker-slot-0:20260704-101700

Comment on lines +405 to +427

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper only fixes the Int64/UInt64 slice of convertFieldToType, so the shared VALUES/IN coercion path is still inconsistent with the overflow-aware CAST path. evaluateConstantExpression can hand this function Float64, UInt128/UInt256, or Int128/Int256 fields, and those still miss both branches here and end in TYPE_MISMATCH instead of clamping or raising VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE. The sibling Date/Date32 branches below are unchanged as well, so INSERT INTO d VALUES (99999999999 + 0) still rejects and INSERT INTO d32 VALUES (999999999999 + 0) can still narrow through ColumnVector<Int32>::insert instead of respecting date_time_overflow_behavior.

Because ValuesBlockInputFormat, IN-set construction, and string comparison all share convertFieldToType, this PR still leaves a real split between numeric CAST and numeric constant coercion. These branches need the same overflow-aware conversion for all four targets, plus 04401 coverage for Date/Date32 and non-Int64/UInt64 DateTime/Time expressions.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I checked each case on the current head (34a2cd5); it splits three ways.

VALUES expressions (your ... + 0 reproducers) are already correct in all modes because they go through the analyzer's _CAST, which this PR made overflow-aware:

  • INSERT INTO d VALUES (99999999999 + 0) -> 2106-02-07 (ignore/saturate), raises VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE (throw)
  • INSERT INTO d32 VALUES (999999999999 + 0) -> 2299-12-31 (ignore/saturate), raises (throw)
  • Float64/UInt128 expr -> DateTime/Time likewise raise (throw) / clamp (saturate/ignore)

So the Date/Date32 cases here do not reject or narrow; they clamp/throw via the cast path.

The real gap is the IN-set / constant-coercion path (SetUtils, strict=true) for the non-Int64/UInt64 fields you list (Float64, UInt128/UInt256, Int128/Int256) which end in TYPE_MISMATCH. That is real, but it is pre-existing and orthogonal to this PR's contract: these branches only ever accepted UInt64/Int64, so even an in-range SELECT toDateTime(5) IN (5.0::Float64) throws TYPE_MISMATCH on master too. This PR neither introduced nor widened it.

Extending it is not a mechanical mirror of the Int64 path, because IN-set construction runs with strict=true, whose purpose is to exclude lossy/out-of-range members (see the Decimal handling). A clamp-based extension produces false positives: SELECT toDateTime(4294967295) IN (99999999999) already returns 1 on this branch, because the out-of-range literal clamps onto the boundary and compares equal (before this PR the branch returned the raw value, so that comparison was false). Doing float/wide-int "right" means choosing clamp-vs-exclude under strict, and ideally revisiting the same for the Int64 path, which shifts IN-membership semantics more broadly.

I would prefer not to change strict IN-membership semantics inside this overflow-behavior PR without a maintainer's call, since that is a separate, backward-compat-sensitive behavior. Happy to implement it here or as a follow-up per the preferred scope. This PR's stated scope is the numeric CAST/toX path (the DateTime64/Time64 half is #101512); the convertFieldToType change already added here closes the VALUES-expression side that the test would otherwise codify.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants