Replace unsafe pointer cast with PyMutex in PyCode by youknowone · Pull Request #7055 · RustPython/RustPython · GitHub
Skip to content

Replace unsafe pointer cast with PyMutex in PyCode#7055

Merged
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:pycode-path
Feb 9, 2026
Merged

Replace unsafe pointer cast with PyMutex in PyCode#7055
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:pycode-path

Conversation

@youknowone
Copy link
Copy Markdown
Member

@youknowone youknowone commented Feb 8, 2026

Add source_path: PyMutex<&'static PyStrInterned> field to PyCode for interior mutability, replacing the UB-inducing #[allow(invalid_reference_casting)] + write_volatile pattern in update_code_filenames. Update all 12 read sites across the codebase to use the new source_path() method.

Summary by CodeRabbit

  • Refactor
    • Improved thread-safety and safer handling of code object source paths across the runtime by replacing direct internal field access and in-place mutations with controlled accessors and setters, reducing unsafe mutations and standardizing path retrieval in traceback, import, and warning/reporting flows.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 8, 2026

@youknowone youknowone marked this pull request as ready for review February 9, 2026 00:26
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/vm/src/builtins/code.rs`:
- Around line 337-351: Replace direct reads of the CodeObject field
"frame.code.source_path" with the PyCode accessor "frame.code.source_path()" so
the mutex-protected value from PyCode::source_path is used; search for
occurrences of "frame.code.source_path" in faulthandler.rs and update them to
call the accessor (leveraging the existing PyCode::source_path and
set_source_path methods).

Comment thread crates/vm/src/builtins/code.rs
Add `source_path: PyMutex<&'static PyStrInterned>` field to
`PyCode` for interior mutability, replacing the UB-inducing
`#[allow(invalid_reference_casting)]` + `write_volatile` pattern
in `update_code_filenames`. Update all read sites across the
codebase to use the new `source_path()` accessor method.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/stdlib/src/faulthandler.rs`:
- Around line 254-256: dump_frame_from_raw is called from the signal handler
faulthandler_fatal_error but calls frame.code.source_path() which acquires a
PyMutex and can deadlock; make this signal-safe by avoiding any mutex
acquisition: either add a signal-safe cached string pointer on PyCode (e.g., an
atomic/AtomicPtr holding a nul-terminated C string updated outside signal
handlers and read here) and read that in dump_frame_from_raw, or replace the
call to source_path() with a non-blocking attempt (call try_lock() on the
PyCode/source_path mutex and if it fails print a fallback like "<unknown>" or
the raw pointer/address), updating references to Frame/ PyCode/ source_path
accordingly so dump_frame_from_raw never blocks in a signal context.

Comment on lines 254 to 256
fn dump_frame_from_raw(fd: i32, frame: &Frame) {
let filename = frame.code.source_path.as_str();
let filename = frame.code.source_path().as_str();
let funcname = frame.code.obj_name.as_str();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find PyMutex definition
echo "=== PyMutex Definition ==="
rg -n 'type PyMutex|struct PyMutex' --type rust -A3 -B1

echo ""
echo "=== source_path method signature and impl ==="
# Search for source_path method definition
rg -n 'fn source_path' --type rust -A8

echo ""
echo "=== Check signal handler context ==="
# Find faulthandler_fatal_error to confirm it's a signal handler
rg -n 'faulthandler_fatal_error|extern "C"' crates/stdlib/src/faulthandler.rs -B2 -A5

Repository: RustPython/RustPython

Length of output: 3633


🏁 Script executed:

#!/bin/bash
# Let's check the specific file to understand the context better
echo "=== Checking faulthandler.rs structure ==="
cat -n crates/stdlib/src/faulthandler.rs | head -300 | tail -100

Repository: RustPython/RustPython

Length of output: 4032


🏁 Script executed:

#!/bin/bash
# Check what the Frame structure looks like and where source_path comes from
echo "=== Frame structure and code field ==="
rg -n 'struct Frame|pub struct Frame' --type rust -A10

echo ""
echo "=== PyCode structure related to source_path ==="
rg -n 'pub struct PyCode|source_path' --type rust -B2 -A8

Repository: RustPython/RustPython

Length of output: 48965


source_path() acquires a mutex inside a signal handler — potential deadlock.

dump_frame_from_raw (line 254) is reachable from faulthandler_fatal_error, which is a signal handler. Calling source_path() at line 255 locks a PyMutex. If the signal fires while the same PyCode's source_path mutex is held elsewhere, this causes a deadlock.

The comment on line 252 explicitly documents this function as "(signal-safe)", which is now violated. Consider either:

  • Storing a separate signal-safe copy (e.g., AtomicPtr or atomic reference) for signal-handler reads, or
  • Using try_lock() with a fallback (e.g., print "<unknown>" if the lock is contended).
🤖 Prompt for AI Agents
In `@crates/stdlib/src/faulthandler.rs` around lines 254 - 256,
dump_frame_from_raw is called from the signal handler faulthandler_fatal_error
but calls frame.code.source_path() which acquires a PyMutex and can deadlock;
make this signal-safe by avoiding any mutex acquisition: either add a
signal-safe cached string pointer on PyCode (e.g., an atomic/AtomicPtr holding a
nul-terminated C string updated outside signal handlers and read here) and read
that in dump_frame_from_raw, or replace the call to source_path() with a
non-blocking attempt (call try_lock() on the PyCode/source_path mutex and if it
fails print a fallback like "<unknown>" or the raw pointer/address), updating
references to Frame/ PyCode/ source_path accordingly so dump_frame_from_raw
never blocks in a signal context.

@youknowone youknowone merged commit dc2d235 into RustPython:main Feb 9, 2026
13 checks passed
@youknowone youknowone deleted the pycode-path branch February 9, 2026 05:01
@jaens
Copy link
Copy Markdown

jaens commented Feb 9, 2026

@youknowone doesn't PyMutex add quite a bit of memory overhead? The mutex functionality is not really even used here, as there are no non-atomic operations done inside the lock...

Since the lifetime is static anyway, I think something like atomic_ref should work (or AtomicPtr from std, though that would still require a bit of unsafe).

What the update code does is actually an atomic compare and exchange, I think.

@youknowone
Copy link
Copy Markdown
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants