[Fix](decimal) Fix unaligned memory access in read_column_from_arrow by linrrzqqq · Pull Request #62759 · apache/doris · GitHub
Skip to content

[Fix](decimal) Fix unaligned memory access in read_column_from_arrow#62759

Open
linrrzqqq wants to merge 1 commit intoapache:masterfrom
linrrzqqq:arrow-mem-align
Open

[Fix](decimal) Fix unaligned memory access in read_column_from_arrow#62759
linrrzqqq wants to merge 1 commit intoapache:masterfrom
linrrzqqq:arrow-mem-align

Conversation

@linrrzqqq
Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #58002

Problem Summary:

arrow::Decimal256Array::Value() returns raw bytes (const uint8_t*), we previously reinterpret_cast it to Decimal256* which is a misaligned access .

../src/core/data_type_serde/data_type_decimal_serde.cpp:376:21: runtime error: reference binding to misaligned address 0x1338f3345b12 for type 'const FieldType' (aka 'const Decimal<integer<256, int>>'), which requires 8 byte alignment
0x1338f3345b12: note: pointer points here
 00 c2  3e 20 00 00 10 63 2d 5e  c7 6b 05 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
              ^ 
    #0 0x56203d1eb751 in doris::DataTypeDecimalSerDe<(doris::PrimitiveType)35>::read_column_from_arrow(doris::IColumn&, arrow::Array const*, long, long, cctz::time_zone const&) const be/build_ASAN/../src/core/data_type_serde/data_type_decimal_serde.cpp:375:25
    #1 0x56203d9c6d6b in doris::DataTypeNullableSerDe::read_column_from_arrow(doris::IColumn&, arrow::Array const*, long, long, cctz::time_zone const&) const be/build_ASAN/../src/core/data_type_serde/data_type_nullable_serde.cpp:350:26
    #2 0x562055dd83b5 in doris::FromRecordBatchToBlockConverter::convert(doris::Block*) be/build_ASAN/../src/format/arrow/arrow_block_convertor.cpp:131:9
    #3 0x562055dd9229 in doris::convert_from_arrow_batch(std::shared_ptr<arrow::RecordBatch> const&, std::vector<std::shared_ptr<doris::IDataType const>, 

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Copy link
Copy Markdown
Contributor

@linrrzqqq
Copy link
Copy Markdown
Contributor Author

run buildall

@linrrzqqq
Copy link
Copy Markdown
Contributor Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 1 blocking issue.

  1. [High] be/src/core/data_type_serde/data_type_decimal_serde.cpp:344 still performs an unaligned load in the TYPE_DECIMALV2 branch. This PR correctly replaces the TYPE_DECIMAL256 raw-pointer dereference with memcpy, but *reinterpret_cast<const Decimal128V2*>(concrete_array->Value(value_i)) remains in the same Arrow import path. arrow::DecimalArray::Value() also returns raw bytes, so a record batch backed by an unaligned buffer can still trigger UBSan/undefined behavior when importing DECIMALV2 columns through read_column_from_arrow (for example via FromRecordBatchToBlockConverter).

Critical checkpoint conclusions:

  • Goal of current task: Partially achieved. The Decimal256 path is fixed and a Decimal256 unit test was added, but the broader unaligned Arrow decimal-read problem is not fully eliminated because the DecimalV2 branch still uses the unsafe cast.
  • Small/clear/focused: Yes. The patch is minimal and localized.
  • Concurrency: Not involved. No thread-safety concerns in this change.
  • Lifecycle/static initialization: No new lifecycle or static-init issues introduced.
  • Configuration: None.
  • Compatibility/storage format: None. No FE/BE protocol or persistence changes.
  • Parallel code paths: Not fully covered. The sibling TYPE_DECIMALV2 branch in the same function still uses the old reinterpret-cast pattern.
  • Special conditional checks: None added.
  • Test coverage: Improved but incomplete. The new unit test covers the Decimal256 case only; a DecimalV2 misalignment case is still missing.
  • Test result changes: New test logic for Decimal256 looks reasonable, but it does not prove the remaining DecimalV2 path is safe.
  • Observability: No additional observability needed for this local fix.
  • Transaction/persistence/data-write concerns: Not applicable.
  • Performance: The memcpy approach is appropriate here and matches the existing Decimal32/64/128I handling; no material performance concern.
  • Other issues: No additional blockers found beyond the remaining DecimalV2 unaligned read.

User focus points: there were no additional user-provided focus points in /tmp/review_focus.txt, so no extra focus-specific findings to report.

column_data.emplace_back(
*reinterpret_cast<const FieldType*>(concrete_array->Value(value_i)));
FieldType decimal_value {};
memcpy(&decimal_value, concrete_array->Value(value_i), sizeof(FieldType));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the TYPE_DECIMAL256 unaligned read, but the sibling TYPE_DECIMALV2 branch a few lines above still does *reinterpret_cast<const Decimal128V2*>(concrete_array->Value(value_i)). arrow::DecimalArray::Value() has the same raw-byte contract, so an unaligned Arrow buffer can still trigger UBSan/undefined behavior for DECIMALV2 imports through the same read_column_from_arrow path. Please switch that branch to the same memcpy pattern and add a DECIMALV2 misalignment test; otherwise this PR only fixes one instance of the bug.

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 100.00% (3/3) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.33% (20370/38194)
Line Coverage 36.88% (192050/520731)
Region Coverage 33.20% (149410/449976)
Branch Coverage 34.31% (65339/190454)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (3/3) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.35% (26708/37433)
Line Coverage 53.75% (279142/519342)
Region Coverage 47.02% (214172/455456)
Branch Coverage 50.35% (97012/192667)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants