Fix re.fullmatch POSSESSIVE_REPEAT by wvmscs · Pull Request #7187 · RustPython/RustPython · GitHub
Skip to content

Fix re.fullmatch POSSESSIVE_REPEAT#7187

Merged
youknowone merged 8 commits intoRustPython:mainfrom
wvmscs:closes-#7183
Feb 20, 2026
Merged

Fix re.fullmatch POSSESSIVE_REPEAT#7187
youknowone merged 8 commits intoRustPython:mainfrom
wvmscs:closes-#7183

Conversation

@wvmscs
Copy link
Copy Markdown
Contributor

@wvmscs wvmscs commented Feb 17, 2026

This PR implements the corrections for the POSSESSIVE_REPEAT i.e. "(x)++)" given in #7183.

3 tests in the file test_re.py (in Lib/test) failed for unexpected success

UNEXPECTED SUCCESS: test_bug_gh91616 (test.test_re.ReTests.test_bug_gh91616)
UNEXPECTED SUCCESS: test_fullmatch_atomic_grouping (test.test_re.ReTests.test_fullmatch_atomic_grouping)
UNEXPECTED SUCCESS: test_fullmatch_possessive_quantifiers (test.test_re.ReTests.test_fullmatch_possessive_quantifiers)

I believe this was the old behavior before applying the correction, as the result is now the same as CPython 3.14.3 .
So, I also corrected the test file and add one new test for POSSESSIVE_REPEAT.

please review.
best regards

Summary by CodeRabbit

  • Bug Fixes

    • Improved handling of possessive repeat patterns in regex matching.
    • Fixed context management for atomic group and possessive repeat operations to ensure correct behavior.
  • Tests

    • Added comprehensive tests for possessive repeat functionality with full-match scenarios.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 17, 2026

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 17, 2026

Code has been automatically formatted

The code in this PR has been formatted using:

  • ruff format
    Please pull the latest changes before pushing again:
git pull origin closes-#7183

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/sre_engine/Cargo.toml (1)

17-19: Remove the redundant explicit [[test]] entry or document the reason for making it explicit.

In Rust 2024 edition with autotests = true (the default), Cargo automatically discovers tests/tests.rs as an integration test target. The explicit [[test]] name = "tests" without a path field resolves to the same file that would already be discovered, making this entry redundant. If the intent is to make it explicit for documentation or consistency with the [[bench]] declaration above, that's acceptable—but if it serves no purpose, removing it would clean up unnecessary boilerplate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/sre_engine/Cargo.toml` around lines 17 - 19, Remove the redundant
explicit test table: delete the `[[test]]` block that only contains `name =
"tests"` (the `[[test]] name = "tests"` entry) from Cargo.toml, or if you intend
it for documentation/consistency, add a short comment above that block
explaining why it's explicit (e.g., clarifying it's intentional to mirror the
`[[bench]]` declaration); do not add a `path` field since Cargo already
autodiscovers tests in Rust 2024 with `autotests = true`.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/sre_engine/src/engine.rs`:
- Around line 473-476: Typo fix: change the comment text "herited" to
"inherited" in all occurrences; locate the comments near the
ctx.next_offset(...) usage and the 'context label (e.g., the comment above the
let mut next = ctx.next_offset(4, Jump::PossessiveRepeat2); block and the two
other comments that reference "herited" around similar code paths) and update
the three comment instances to read "inherited".

---

Nitpick comments:
In `@crates/sre_engine/Cargo.toml`:
- Around line 17-19: Remove the redundant explicit test table: delete the
`[[test]]` block that only contains `name = "tests"` (the `[[test]] name =
"tests"` entry) from Cargo.toml, or if you intend it for
documentation/consistency, add a short comment above that block explaining why
it's explicit (e.g., clarifying it's intentional to mirror the `[[bench]]`
declaration); do not add a `path` field since Cargo already autodiscovers tests
in Rust 2024 with `autotests = true`.

Comment thread crates/sre_engine/src/engine.rs Outdated
Copy link
Copy Markdown
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general. Thank you for contributing! Please also check coderabbitai's review comments.

Comment thread Lib/test/test_re.py Outdated
Comment thread Lib/test/test_re.py Outdated
Comment thread Lib/test/test_re.py Outdated
Comment thread Lib/test/test_re.py Outdated
@youknowone youknowone changed the title Closes #7183 Fix re.fullmatch POSSESSIVE_REPEAT Feb 17, 2026
Copy link
Copy Markdown
Contributor Author

@wvmscs wvmscs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended changes done

@wvmscs wvmscs requested a review from youknowone February 18, 2026 10:06
@youknowone
Copy link
Copy Markdown
Member

@wvmscs probably you forgot to push the changes.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
extra_tests/snippets/stdlib_re.py (1)

80-80: Unescaped . in pattern matches any character, not only literal dot.

(?:.[0-9]+) will match 1X25Y38 just as readily as 1.25.38. The test still exercises the correct code path for the issue, but using \. better expresses intent.

✏️ Proposed fix
-assert re.fullmatch(r"([0-9]++(?:.[0-9]+)*+)", "1.25.38")
-assert re.fullmatch(r"([0-9]++(?:.[0-9]+)*+)", "1.25.38").group(0) == "1.25.38"
+assert re.fullmatch(r"([0-9]++(?:\.[0-9]+)*+)", "1.25.38")
+assert re.fullmatch(r"([0-9]++(?:\.[0-9]+)*+)", "1.25.38").group(0) == "1.25.38"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extra_tests/snippets/stdlib_re.py` at line 80, The regex in the re.fullmatch
call uses an unescaped dot in the non-capturing group (pattern
"([0-9]++(?:.[0-9]+)*+)"), so it will match any character instead of a literal
period; update the pattern passed to re.fullmatch to escape the dot in the group
(i.e., change the "(?:.[0-9]+)" portion to "(?:\.[0-9]+)") so the test enforces
literal dots (reference the re.fullmatch invocation and the pattern string).
crates/sre_engine/src/engine.rs (1)

473-473: Comments read as change descriptions rather than code explanations.

// modified next.toplevel from inherited to false describes what was changed (commit-message style) rather than why the invariant exists. A future reader maintaining this code won't immediately understand the significance.

✏️ Suggested wording (apply to all three sites)
-            // modified next.toplevel from inherited to false
-            let mut next = ctx.next_offset(4, Jump::PossessiveRepeat2);
-            next.toplevel = false;
+            let mut next = ctx.next_offset(4, Jump::PossessiveRepeat2);
+            next.toplevel = false; // sub-pattern must not apply match_all/must_advance constraints
-            let mut next = ctx.next_offset(4, Jump::PossessiveRepeat4);
-            next.toplevel = false; // modified next.toplevel from inherited to false
+            let mut next = ctx.next_offset(4, Jump::PossessiveRepeat4);
+            next.toplevel = false; // sub-pattern must not apply match_all/must_advance constraints
-            let mut next_ctx = ctx.next_offset(2, Jump::AtomicGroup1);
-            next_ctx.toplevel = false; // modified next.toplevel from inherited to false
+            let mut next_ctx = ctx.next_offset(2, Jump::AtomicGroup1);
+            next_ctx.toplevel = false; // sub-pattern must not apply match_all/must_advance constraints

Also applies to: 501-501, 841-841

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/sre_engine/src/engine.rs` at line 473, Replace the commit-style
comment "// modified next.toplevel from inherited to false" with a concise
explanation of the invariant and rationale: state why next.toplevel must be set
to false (what property/state would break if it remained inherited), what
callers or code paths rely on this being false, and any assumptions about
downstream behavior or performance; apply the same explanatory replacement at
the other two occurrences that reference next.toplevel so future maintainers
understand the intent and not just the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@extra_tests/snippets/stdlib_re.py`:
- Line 81: The assertion uses re.fullmatch(...).groups(0) which returns a tuple,
causing the comparison to always fail; change the call to use the singular
.group(0) (or .group(1) if you prefer the first capture) on the Match object
returned by re.fullmatch so the expression compares a string ("1.25.38") to a
string rather than a tuple; locate the call to re.fullmatch in the snippet and
replace .groups(0) with .group(0).

---

Nitpick comments:
In `@crates/sre_engine/src/engine.rs`:
- Line 473: Replace the commit-style comment "// modified next.toplevel from
inherited to false" with a concise explanation of the invariant and rationale:
state why next.toplevel must be set to false (what property/state would break if
it remained inherited), what callers or code paths rely on this being false, and
any assumptions about downstream behavior or performance; apply the same
explanatory replacement at the other two occurrences that reference
next.toplevel so future maintainers understand the intent and not just the
change.

In `@extra_tests/snippets/stdlib_re.py`:
- Line 80: The regex in the re.fullmatch call uses an unescaped dot in the
non-capturing group (pattern "([0-9]++(?:.[0-9]+)*+)"), so it will match any
character instead of a literal period; update the pattern passed to re.fullmatch
to escape the dot in the group (i.e., change the "(?:.[0-9]+)" portion to
"(?:\.[0-9]+)") so the test enforces literal dots (reference the re.fullmatch
invocation and the pattern string).

Comment thread extra_tests/snippets/stdlib_re.py Outdated
Copy link
Copy Markdown
Contributor Author

@wvmscs wvmscs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request review

Copy link
Copy Markdown
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much! Welcome to RustPython project

Comment thread crates/sre_engine/Cargo.toml Outdated
@github-actions
Copy link
Copy Markdown
Contributor

@youknowone youknowone merged commit dfc5de7 into RustPython:main Feb 20, 2026
14 checks passed
youknowone pushed a commit to youknowone/RustPython that referenced this pull request Mar 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants