Summary
(Disclaimer: the bug itself is real. I reproduced it manually in production in my project, then under a debugger, and it happens exactly where reported below. The deep details and the standalone reproducer were collected and prepared by Claude Opus 4.8; I hope it got everything right.)
On ARM64 / ARM64EC, basic_string_view::find_first_of / basic_string::find_first_of for 2-byte element types (char16_t, wchar_t; e.g. std::u16string) enters an infinite loop (100% CPU, never returns) for a broad class of ordinary inputs. The Neon "bitmap" worker _Find_meow_of::_Bitmap_impl::_Impl_first_neon (in stl/src/vector_algorithms.cpp) advances its scalar-tail index with a standalone ++_Ix; at the bottom of a do { … } while (_Ix != _Haystack_length); loop, but a continue skips that increment whenever a tail element is >= 256. The index therefore never advances and the loop never terminates.
This is not the same bug as #5757 / PR #5758 (that was an x64/SSE out‑of‑bounds crash caused by a bad bitmap‑activation check for needles containing elements ≥ 256). This one is ARM64‑only, is a hang rather than a crash, is triggered by needles whose elements are all < 256, and lives in different code introduced later.
Root cause
In stl/src/vector_algorithms.cpp, _Impl_first_neon (current main, ~line 6342), the 16-wide Neon loop covers the first _Haystack_length & ~size_t{15} elements, then a scalar tail handles the remaining 1–15 elements:
```cpp
do {
    const _Ty _Val = _Hay_ptr[_Ix];
    if constexpr (sizeof(_Val) > 1) {
        if (_Val >= 256) {
            if constexpr (_Pred == _Predicate::_Any_of) {
                continue; // (A) jumps straight to the while-condition
            } else {
                return _Ix;
            }
        }
    }
    /* ... bitmap test, may return _Ix ... */
    ++_Ix; // (B) the ONLY place _Ix advances; skipped by (A)
} while (_Ix != _Haystack_length); // (C) reached directly from (A) with _Ix unchanged
```
In a do/while, continue (A) transfers control to the condition (C), skipping the ++_Ix; at (B). So as soon as a tail element satisfies _Val >= 256 under _Predicate::_Any_of (the predicate that find_first_of uses), _Ix is frozen and _Ix != _Haystack_length stays true forever.
For 1‑byte _Ty the if constexpr (sizeof(_Val) > 1) branch is discarded, so __std_find_first_of_trivial_pos_1 (std::string) is unaffected.
Why it strikes ordinary inputs
For sizeof(_Ty) == 2, _Use_bitmap_neon selects this worker once the haystack is >= 96 code units (for a 4‑element needle). Any UTF‑16 search where:
the needle elements are all < 256 (e.g. Qt’s u"<>&\""),
there is no match (search runs to the end),
the haystack length is not a multiple of 16, and
at least one of the final 1–15 code units is >= 256 (any non‑Latin‑1 character: Cyrillic, CJK, an emoji surrogate half, etc.)
…hangs. That is an extremely common shape — e.g. QString::toHtmlEscaped() on any longer message whose tail contains a non‑Latin character.
Sibling functions are correct (suggested shape for the fix)
_Impl_last_neon (the find_last_of direction, ~line 6402) is safe: it does --_Ix; as the first statement inside its do body, so its continue still advances.
_Impl_first_scalar (the x64/SSE scalar tail) is safe: it uses for (size_t _Ix = 0; _Ix != _Haystack_length; ++_Ix), where continue runs the loop’s step expression.
Fix: make the increment unconditional in _Impl_first_neon, e.g. rewrite the tail as a for (size_t _Ix = _Vec_end; _Ix != _Haystack_length; ++_Ix) loop (matching _Impl_first_scalar) and delete the bottom ++_Ix;, or add a ++_Ix; before the continue in the _Val >= 256 branch.
Affected configurations
Architectures: ARM64 and ARM64EC (the #if defined(_M_ARM64) || defined(_M_ARM64EC) Neon path). x64/x86 unaffected.
Element sizes: 2‑byte and larger (wchar_t, char16_t, char32_t); 1‑byte (char) is fine.
Predicates: find_first_of (_Any_of). find_first_not_of (_None_of) returns instead of continuing, so it does not get stuck (but _Any_of is the common case).
Version: introduced by PR #6115 “Add Neon bitmap implementation of find_first_of” (merged 2026‑02‑28, commit 9980b75); per the STL changelog this shipped in the MSVC 14.51 toolset (VS 2022 17.14.x). Still present on main at the time of this report.
Real‑world impact
Surfaced as a 100%-reproducible UI-thread hang in Telegram Desktop (official ARM64 build) when copying selected messages with Ctrl+C: QString::toHtmlEscaped() → std::u16string_view::find_first_of(u"<>&\"") over message text whose tail contains a non-Latin character. Any code using QString::toHtmlEscaped() or find_first_of on UTF-16 text is exposed on ARM64.
Reproducer (attached: find_first_of_arm64_hang.cpp)
Build: cl /EHsc /std:c++20 find_first_of_arm64_hang.cpp
Without the bug, the program prints result = 18446744073709551615 (npos) and exits.
Reproduces in both /O2 and /Od builds (the routine is in the prebuilt STL, not in headers).
Call path
std::u16string_view::find_first_of
→ _Traits_find_first_of (<__msvc_string_view.hpp>, guarded by _VECTORIZED_FIND_FIRST_OF)
→ __std_find_first_of_trivial_pos_2
→ _Find_meow_of::_First_of::_Dispatch_pos<uint16_t, _Any_of>
→ _Dispatch_pos_neon<uint16_t, _Any_of>
→ _Find_meow_of::_Bitmap_impl::_Impl_first_neon<uint16_t, _Any_of> ← infinite loop here
Related issue: telegramdesktop/tdesktop#30867