iframe-proxy

linrrzqqq · 2026-04-23T13:26:26Z

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #58002

Problem Summary:

arrow::Decimal256Array::Value() returns raw bytes (const uint8_t*), we previously reinterpret_cast it to Decimal256* which is a misaligned access .

../src/core/data_type_serde/data_type_decimal_serde.cpp:376:21: runtime error: reference binding to misaligned address 0x1338f3345b12 for type 'const FieldType' (aka 'const Decimal<integer<256, int>>'), which requires 8 byte alignment
0x1338f3345b12: note: pointer points here
 00 c2  3e 20 00 00 10 63 2d 5e  c7 6b 05 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
              ^ 
    #0 0x56203d1eb751 in doris::DataTypeDecimalSerDe<(doris::PrimitiveType)35>::read_column_from_arrow(doris::IColumn&, arrow::Array const*, long, long, cctz::time_zone const&) const be/build_ASAN/../src/core/data_type_serde/data_type_decimal_serde.cpp:375:25
    #1 0x56203d9c6d6b in doris::DataTypeNullableSerDe::read_column_from_arrow(doris::IColumn&, arrow::Array const*, long, long, cctz::time_zone const&) const be/build_ASAN/../src/core/data_type_serde/data_type_nullable_serde.cpp:350:26
    #2 0x562055dd83b5 in doris::FromRecordBatchToBlockConverter::convert(doris::Block*) be/build_ASAN/../src/format/arrow/arrow_block_convertor.cpp:131:9
    #3 0x562055dd9229 in doris::convert_from_arrow_batch(std::shared_ptr<arrow::RecordBatch> const&, std::vector<std::shared_ptr<doris::IDataType const>,

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

hello-stephen · 2026-04-23T13:26:32Z

linrrzqqq · 2026-04-23T13:32:30Z

run buildall

linrrzqqq · 2026-04-23T13:32:34Z

/review

github-actions

Found 1 blocking issue.

[High] be/src/core/data_type_serde/data_type_decimal_serde.cpp:344 still performs an unaligned load in the TYPE_DECIMALV2 branch. This PR correctly replaces the TYPE_DECIMAL256 raw-pointer dereference with memcpy, but *reinterpret_cast<const Decimal128V2*>(concrete_array->Value(value_i)) remains in the same Arrow import path. arrow::DecimalArray::Value() also returns raw bytes, so a record batch backed by an unaligned buffer can still trigger UBSan/undefined behavior when importing DECIMALV2 columns through read_column_from_arrow (for example via FromRecordBatchToBlockConverter).

Critical checkpoint conclusions:

Goal of current task: Partially achieved. The Decimal256 path is fixed and a Decimal256 unit test was added, but the broader unaligned Arrow decimal-read problem is not fully eliminated because the DecimalV2 branch still uses the unsafe cast.
Small/clear/focused: Yes. The patch is minimal and localized.
Concurrency: Not involved. No thread-safety concerns in this change.
Lifecycle/static initialization: No new lifecycle or static-init issues introduced.
Configuration: None.
Compatibility/storage format: None. No FE/BE protocol or persistence changes.
Parallel code paths: Not fully covered. The sibling TYPE_DECIMALV2 branch in the same function still uses the old reinterpret-cast pattern.
Special conditional checks: None added.
Test coverage: Improved but incomplete. The new unit test covers the Decimal256 case only; a DecimalV2 misalignment case is still missing.
Test result changes: New test logic for Decimal256 looks reasonable, but it does not prove the remaining DecimalV2 path is safe.
Observability: No additional observability needed for this local fix.
Transaction/persistence/data-write concerns: Not applicable.
Performance: The memcpy approach is appropriate here and matches the existing Decimal32/64/128I handling; no material performance concern.
Other issues: No additional blockers found beyond the remaining DecimalV2 unaligned read.

User focus points: there were no additional user-provided focus points in /tmp/review_focus.txt, so no extra focus-specific findings to report.

github-actions · 2026-04-23T13:46:25Z

-            column_data.emplace_back(
-                    *reinterpret_cast<const FieldType*>(concrete_array->Value(value_i)));
+            FieldType decimal_value {};
+            memcpy(&decimal_value, concrete_array->Value(value_i), sizeof(FieldType));


This fixes the TYPE_DECIMAL256 unaligned read, but the sibling TYPE_DECIMALV2 branch a few lines above still does *reinterpret_cast<const Decimal128V2*>(concrete_array->Value(value_i)). arrow::DecimalArray::Value() has the same raw-byte contract, so an unaligned Arrow buffer can still trigger UBSan/undefined behavior for DECIMALV2 imports through the same read_column_from_arrow path. Please switch that branch to the same memcpy pattern and add a DECIMALV2 misalignment test; otherwise this PR only fixes one instance of the bug.

hello-stephen · 2026-04-23T15:16:44Z

BE UT Coverage Report

Increment line coverage 100.00% (3/3) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	53.33% (20370/38194)
Line Coverage	36.88% (192050/520731)
Region Coverage	33.20% (149410/449976)
Branch Coverage	34.31% (65339/190454)

hello-stephen · 2026-04-23T15:43:17Z

BE Regression && UT Coverage Report

Increment line coverage 100.00% (3/3) 🎉

Increment coverage report
Complete coverage report

[Fix](decimal) Fix unaligned memory access in read_column_from_arrow

1e45a83

github-actions Bot requested changes Apr 23, 2026

View reviewed changes

Sunbelt Computer Software

PL/B Language Development and Support

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Fix](decimal) Fix unaligned memory access in read_column_from_arrow#62759

[Fix](decimal) Fix unaligned memory access in read_column_from_arrow#62759
linrrzqqq wants to merge 1 commit intoapache:masterfrom
linrrzqqq:arrow-mem-align

linrrzqqq commented Apr 23, 2026

Uh oh!

hello-stephen commented Apr 23, 2026

Uh oh!

linrrzqqq commented Apr 23, 2026

Uh oh!

linrrzqqq commented Apr 23, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot Apr 23, 2026

Uh oh!

hello-stephen commented Apr 23, 2026

Uh oh!

hello-stephen commented Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Sunbelt Computer Software

PL/B Language Development and Support

Conversation

linrrzqqq commented Apr 23, 2026

What problem does this PR solve?

Release note

Check List (For Author)

Check List (For Reviewer who merge this PR)

Uh oh!

hello-stephen commented Apr 23, 2026

Uh oh!

linrrzqqq commented Apr 23, 2026

Uh oh!

linrrzqqq commented Apr 23, 2026

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

hello-stephen commented Apr 23, 2026

BE UT Coverage Report

Uh oh!

hello-stephen commented Apr 23, 2026

BE Regression && UT Coverage Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants