{{ message }}
Unpoison SVE scalar reductions for MemorySanitizer#19
Merged
Conversation
Fix: Jensen Shannon square roots (ashvardanian#233)
Infer boolean types in Python
Previous AVX-512 implementation of complex products used an extra ZMM register for `swap_adjacent_vec`. Moreover, they used the `vpshufb` instruction available only with the Ice Lake capability and newer. The replacement uses the `_mm512_permute_ps` and its double-precision variant.
Previous bit-cast would round towards zeros. When the true value is close to an integer boundary (e.g., 8.0): - SciPy (float64): might get 8.0000000001 → truncates to 8 - SimSIMD (float32): might get 7.9999998 → truncates to 7
### Patch - Make: Same upload/download CI versions (ae9e567)
…n#296) GCC and Clang do not recognize the target attribute. The correct flag for AVX-VNNI (AVX2-VNNI) instructions is . This fix enables SIMSIMD_TARGET_SIERRA to compile successfully with both GCC and Clang, allowing runtime dispatch to select optimal SIERRA kernels on compatible CPUs. Changes: - include/simsimd/dot.h: avx2vnni -> avxvnni in pragma directives - include/simsimd/spatial.h: avx2vnni -> avxvnni in pragma directives Tested with: - GCC 15.2.0 - Clang 20.1.8 Co-authored-by: Seungwon Yang <seungwon.yang@navercorp.com>
### Patch - Fix: Replace `avx2vnni` with `avxvnni` for Sierra Forest (ashvardanian#296) (a8bb232) - Make: Remove `NPM_TOKEN` for OIDC publishing (13cd5bc) - Make: Sign rebase with GitHub Actions bot (e7b89b5) - Fix: Revert to `atol=1` for test integer outputs vs SciPy (b75bdbd)
Co-authored-by: Markus Graf <24669860+markusalbertgraf@users.noreply.github.com>
### Patch - Fix: Wrong predicate width in BF16 SVE L2 kernel (ashvardanian#301) (87ae846) - Improve: FreeBSD comp-time target selection (ashvardanian#300) (cb11f8b)
…n#302) `simsimd_capabilities` probes SIMD instructions with an uninitialized `dummy_input` buffer and `n=0`. SVE implementations use `do { ... } while` loops that always execute the body once. MemorySanitizer doesn't understand SVE predicated loads and reports use-of-uninitialized-value. Initialize the buffer to zero to silence the false positive. Co-authored-by: Alexey Milovidov <18581488+alexey-milovidov@users.noreply.github.com>
### Patch - Fix: Initialize `dummy_input` to fix MSan false positive (ashvardanian#302) (c2ad842)
MSan cannot track initialization through SIMD intrinsics (SVE, NEON, SSE, AVX), causing false-positive "use-of-uninitialized-value" reports. Add `__msan_unpoison` calls after every dispatch macro to mark results as initialized. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MemorySanitizer cannot track data flow through SIMD intrinsics
(SVE, NEON, SSE, AVX), causing false-positive "use-of-uninitialized-value"
reports. For example, SVE predicated loads only access memory for active
lanes, but MSan sees the full vector width as a memory read, flagging
tail elements as uninitialized.
Add `__attribute__((no_sanitize("memory")))` to `SIMSIMD_PUBLIC` and
`SIMSIMD_DYNAMIC` macros when MSan is detected via `__has_feature`.
This disables MSan instrumentation for all SimSIMD functions, which is
appropriate since they are entirely SIMD code that MSan cannot analyze.
The previous approach of unpoisoning results after dispatch (in lib.c)
was insufficient because MSan aborts inside the function body before
the dispatch wrapper can unpoison the output.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`simsimd_capabilities` probes SIMD functions with n=0 to pre-initialize
dispatch function pointers. SVE implementations use `do { } while (i < n)`
loops that always execute the body once, even with n=0. MemorySanitizer
instruments SVE predicated loads (`svld1_f32` etc.) as full-width vector
reads regardless of the predicate mask, so it reports use-of-uninitialized
memory when the buffer is smaller than the SIMD register width.
Increase the dummy buffer from 8 bytes (`double[1]`) to 256 bytes
(`double[32]`) to cover the widest possible SVE vector (2048 bits).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix MSan false positive: enlarge dummy buffer for SVE predicated loads
Include `<unistd.h>` for ARM POSIX capability detection
LLVM's MSan does not instrument ARM SVE intrinsics — there is zero handling of `aarch64_sve_*` in MemorySanitizer.cpp (as of LLVM 20), only a file-level "FIXME: This sanitizer does not yet handle scalable vectors". This causes false positives when `svaddv` produces scalar results that MSan considers uninitialized. Add `SIMSIMD_UNPOISON` calls after every `svaddv` reduction in SVE function bodies across spatial.h, dot.h, binary.h, and sparse.h. Define the macro in types.h with `__msan_unpoison` when MSan is active. See also: llvm/llvm-project#165028
alexey-milovidov
added a commit
to ClickHouse/ClickHouse
that referenced
this pull request
Apr 6, 2026
LLVM's MSan does not instrument ARM SVE intrinsics (no handling of `aarch64_sve_*` in MemorySanitizer.cpp as of LLVM 20). This causes false positives when `svaddv` produces scalar results that MSan considers uninitialized. Add `SIMSIMD_UNPOISON` calls after every `svaddv` reduction in SVE function bodies. https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101239&sha=8abafa3f0d5fe90ff3bae820f1a3da291249d26e&name_0=PR&name_1=Stress%20test%20%28arm_msan%29 Contrib PR: ClickHouse/SimSIMD#19 Upstream PR: ashvardanian/NumKong#342
1 task
alexey-milovidov
added a commit
to ClickHouse/ClickHouse
that referenced
this pull request
Apr 6, 2026
LLVM's MSan does not instrument ARM SVE intrinsics (no handling of `aarch64_sve_*` in MemorySanitizer.cpp as of LLVM 20). This causes false positives when `svaddv` produces scalar results that MSan considers uninitialized. Add `SIMSIMD_UNPOISON` calls after every `svaddv` reduction in SVE function bodies. https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101239&sha=8abafa3f0d5fe90ff3bae820f1a3da291249d26e&name_0=PR&name_1=Stress%20test%20%28arm_msan%29 Contrib PR: ClickHouse/SimSIMD#19 Upstream PR: ashvardanian/NumKong#342
1 task
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

LLVM's MSan does not instrument ARM SVE intrinsics — there is zero handling of
aarch64_sve_*inMemorySanitizer.cpp(as of LLVM 20), only a file-levelFIXME: This sanitizer does not yet handle scalable vectors. This causes false positives whensvaddvproduces scalar results that MSan considers uninitialized.Add
SIMSIMD_UNPOISONcalls after everysvaddvreduction in SVE function bodies acrossspatial.h,dot.h,binary.h, andsparse.h. Define the macro intypes.hwith__msan_unpoisonwhen MSan is active.See also: llvm/llvm-project#165028