`<regex>`: Implement small vector optimization for state frames by muellerj2 · Pull Request #6147 · microsoft/STL · GitHub
Skip to content

<regex>: Implement small vector optimization for state frames#6147

Merged
StephanTLavavej merged 11 commits into
microsoft:mainfrom
muellerj2:regex-small-vector-optimization-for-state-frames
Mar 10, 2026
Merged

<regex>: Implement small vector optimization for state frames#6147
StephanTLavavej merged 11 commits into
microsoft:mainfrom
muellerj2:regex-small-vector-optimization-for-state-frames

Conversation

@muellerj2

Copy link
Copy Markdown

Resolves #5969.

This adds a minimal implementation of a vector variant that supports small vector optimization. As its initial storage area, it uses what is left of the stack-allocated buffer after all the other buffers used by the matcher have been initialized. If more memory is necessary, it reallocates like std::vector at the same rate of growth (factor 1.5).

Drive-by change:

  • Handle fancy pointers more cleanly in the buffers (although there can't be any fancy pointers yet because we haven't wired up the allocators).
  • Unify the allocation guards for vector, _Rx_fixed_size_buffer and _Rx_small_vector.

Benchmarks

regex_search
benchmark before [ns] after [ns] speedup
bm_lorem_search/"^bibe"/2 54.4085 56.25 0.97
bm_lorem_search/"^bibe"/3 58.5938 54.4085 1.08
bm_lorem_search/"^bibe"/4 55.8036 53.0134 1.05
bm_lorem_search/"bibe"/2 3138.95 2982.01 1.05
bm_lorem_search/"bibe"/3 6277.9 5580.36 1.12
bm_lorem_search/"bibe"/4 11300.2 11718.8 0.96
bm_lorem_search/"bibe".collate/2 2999.44 2982.01 1.01
bm_lorem_search/"bibe".collate/3 6277.9 5625 1.12
bm_lorem_search/"bibe".collate/4 11997.8 10986.3 1.09
bm_lorem_search/"(bibe)"/2 5440.85 3431.91 1.59
bm_lorem_search/"(bibe)"/3 10602.7 7149.83 1.48
bm_lorem_search/"(bibe)"/4 21484.4 13811.3 1.56
bm_lorem_search/"(bibe)+"/2 9416.81 4464.29 2.11
bm_lorem_search/"(bibe)+"/3 18415.3 9416.85 1.96
bm_lorem_search/"(bibe)+"/4 36900.6 16741.2 2.20
bm_lorem_search/"(?:bibe)+"/2 5998.88 3955.08 1.52
bm_lorem_search/"(?:bibe)+"/3 11718.8 8021.76 1.46
bm_lorem_search/"(?:bibe)+"/4 22495.6 15729.7 1.43
bm_lorem_search/R"(\bbibe)"/2 83701.6 83705.4 1.00
bm_lorem_search/R"(\bbibe)"/3 168795 163923 1.03
bm_lorem_search/R"(\bbibe)"/4 337672 327846 1.03
bm_lorem_search/R"(\Bibe)"/2 188354 187976 1.00
bm_lorem_search/R"(\Bibe)"/3 395570 368968 1.07
bm_lorem_search/R"(\Bibe)"/4 749860 739397 1.01
bm_lorem_search/R"((?=....)bibe)"/2 5156.25 4171.31 1.24
bm_lorem_search/R"((?=....)bibe)"/3 9765.62 8196.15 1.19
bm_lorem_search/R"((?=....)bibe)"/4 19043 14997.2 1.27
bm_lorem_search/R"((?=bibe)....)"/2 4708.44 3766.73 1.25
bm_lorem_search/R"((?=bibe)....)"/3 9207.55 7114.96 1.29
bm_lorem_search/R"((?=bibe)....)"/4 17996.8 14299.7 1.26
bm_lorem_search/R"((?!lorem)bibe)"/2 4289.91 3609.79 1.19
bm_lorem_search/R"((?!lorem)bibe)"/3 8579.76 7533.48 1.14
bm_lorem_search/R"((?!lorem)bibe)"/4 17578.3 13497.4 1.30
regex_match
benchmark before [ns] after [ns] speedup
bm_match_sequence_of_as/"a*"/100 1349.75 1143.97 1.18
bm_match_sequence_of_as/"a*"/200 2563.47 2539.06 1.01
bm_match_sequence_of_as/"a*"/400 4541.02 4080.63 1.11
bm_match_sequence_of_as/"a*?"/100 2490.24 2511.16 0.99
bm_match_sequence_of_as/"a*?"/200 4882.81 4757.26 1.03
bm_match_sequence_of_as/"a*?"/400 9626.07 9207.55 1.05
bm_match_sequence_of_as/"(?:a)*"/100 2008.93 1649.69 1.22
bm_match_sequence_of_as/"(?:a)*"/200 3606.31 3002.93 1.20
bm_match_sequence_of_as/"(?:a)*"/400 6975.45 6138.39 1.14
bm_match_sequence_of_as/"(a)*"/100 2786.7 2301.89 1.21
bm_match_sequence_of_as/"(a)*"/200 5000 4394.53 1.14
bm_match_sequence_of_as/"(a)*"/400 9765.62 8719.31 1.12
bm_match_sequence_of_as/"(a)*?"/100 3690 3138.95 1.18
bm_match_sequence_of_as/"(a)*?"/200 6975.45 6277.9 1.11
bm_match_sequence_of_as/"(a)*?"/400 13113.8 11962.9 1.10
bm_match_sequence_of_as/"(?:b|a)*"/100 5081.62 3923.69 1.30
bm_match_sequence_of_as/"(?:b|a)*"/200 9068.08 8370.54 1.08
bm_match_sequence_of_as/"(?:b|a)*"/400 16043.5 15694.8 1.02
bm_match_sequence_of_as/"(b|a)*"/100 6556.92 7847.38 0.84
bm_match_sequence_of_as/"(b|a)*"/200 18833.9 17159.8 1.10
bm_match_sequence_of_as/"(b|a)*"/400 51618.3 35296.2 1.46
bm_match_sequence_of_as/"(a)(?:b|a)*"/100 4813.07 4098.07 1.17
bm_match_sequence_of_as/"(a)(?:b|a)*"/200 8719.31 8370.54 1.04
bm_match_sequence_of_as/"(a)(?:b|a)*"/400 16741.1 16043.5 1.04
bm_match_sequence_of_as/"(a)(b|a)*"/100 6452.29 8196.15 0.79
bm_match_sequence_of_as/"(a)(b|a)*"/200 15066.9 17264.3 0.87
bm_match_sequence_of_as/"(a)(b|a)*"/400 53125 35992.7 1.48
bm_match_sequence_of_as/"(a)(?:b|a)*c"/100 5580.36 5000 1.12
bm_match_sequence_of_as/"(a)(?:b|a)*c"/200 10498 9416.81 1.11
bm_match_sequence_of_as/"(a)(?:b|a)*c"/400 19949.5 18415.3 1.08
bm_match_sequence_of_9a1b/"(?:ab)"/100 16741.1 14648.4 1.14
bm_match_sequence_of_9a1b/"(?:ab)"/200 33691.9 28599.3 1.18
bm_match_sequence_of_9a1b/"(?:ab)"/400 75334.8 64062.5 1.18

@muellerj2 muellerj2 requested a review from a team as a code owner March 6, 2026 21:45
@github-project-automation github-project-automation Bot moved this to Initial Review in STL Code Reviews Mar 6, 2026
@StephanTLavavej StephanTLavavej added performance Must go faster regex meow is a substring of homeowner labels Mar 6, 2026
@StephanTLavavej StephanTLavavej self-assigned this Mar 6, 2026
Comment thread stl/inc/regex
Comment thread stl/inc/regex
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
@StephanTLavavej StephanTLavavej removed their assignment Mar 9, 2026
@StephanTLavavej StephanTLavavej moved this from Initial Review to Merging in STL Code Reviews Mar 9, 2026
@StephanTLavavej

Copy link
Copy Markdown
Member

I'm mirroring this to the MSVC-internal repo. Please notify me if any further changes are pushed, otherwise no action is required.

@StephanTLavavej StephanTLavavej merged commit 26b7ca5 into microsoft:main Mar 10, 2026
49 checks passed
@github-project-automation github-project-automation Bot moved this from Merging to Done in STL Code Reviews Mar 10, 2026
@StephanTLavavej

Copy link
Copy Markdown
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Must go faster regex meow is a substring of homeowner

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

<regex>: Avoid heap allocations for few capturing groups and stack frames

4 participants