Add sys._stdlib_dir by youknowone · Pull Request #6798 · RustPython/RustPython · GitHub
Skip to content

Add sys._stdlib_dir#6798

Merged
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:stdlib_dir
Jan 19, 2026
Merged

Add sys._stdlib_dir#6798
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:stdlib_dir

Conversation

@youknowone
Copy link
Copy Markdown
Member

@youknowone youknowone commented Jan 19, 2026

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved standard library path resolution, particularly on Windows with enhanced lib directory detection for both text-file and symlink-based paths
  • Refactor

    • Centralized standard library directory configuration through interpreter state management

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 19, 2026

@github-actions
Copy link
Copy Markdown
Contributor

Code has been automatically formatted

The code in this PR has been formatted using:

  • cargo fmt --all
    Please pull the latest changes before pushing again:
git pull origin stdlib_dir

@youknowone youknowone marked this pull request as ready for review January 19, 2026 14:44
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/pylib/build.rs`:
- Around line 14-31: The build script currently calls
canonicalize(...).to_str().unwrap(), which can panic on non‑UTF‑8 Windows paths;
change the emission to use a lossy conversion from the canonicalized Path (e.g.,
Path::to_string_lossy on the value returned by std::fs::canonicalize(&lib_path))
and print that into the cargo:rustc-env=win_lib_path value, and if
std::fs::canonicalize(&lib_path) returns an Err emit a visible build warning via
println!("cargo:warning=...") instead of silently skipping so missing
win_lib_path is obvious; update the code paths referencing canonicalized_path
and to_str().unwrap() accordingly.

Comment thread crates/pylib/build.rs
Comment on lines +14 to +31
if cfg!(windows) {
// On Windows, the Lib entry can be either:
// 1. A text file containing the relative path (git without symlink support)
// 2. A proper symlink (git with symlink support)
// We handle both cases to resolve to the actual Lib directory.
let lib_path = if let Ok(real_path) = std::fs::read_to_string("Lib") {
// Case 1: Text file containing relative path
std::path::PathBuf::from(real_path.trim())
} else {
// Case 2: Symlink or directory - canonicalize directly
std::path::PathBuf::from("Lib")
};

if let Ok(canonicalized_path) = std::fs::canonicalize(&lib_path) {
// Strip the extended path prefix (\\?\) that canonicalize adds on Windows
let path_str = canonicalized_path.to_str().unwrap();
let path_str = path_str.strip_prefix(r"\\?\").unwrap_or(path_str);
println!("cargo:rustc-env=win_lib_path={path_str}");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid panic on non‑UTF‑8 Windows paths.
to_str().unwrap() can panic on valid Windows paths containing non‑UTF‑8 sequences, breaking the build. Prefer a lossy conversion and consider warning when canonicalization fails so missing win_lib_path isn’t silent.

🔧 Proposed safer emission
-        if let Ok(canonicalized_path) = std::fs::canonicalize(&lib_path) {
-            // Strip the extended path prefix (\\?\) that canonicalize adds on Windows
-            let path_str = canonicalized_path.to_str().unwrap();
-            let path_str = path_str.strip_prefix(r"\\?\").unwrap_or(path_str);
-            println!("cargo:rustc-env=win_lib_path={path_str}");
-        }
+        if let Ok(canonicalized_path) = std::fs::canonicalize(&lib_path) {
+            // Strip the extended path prefix (\\?\) that canonicalize adds on Windows
+            let path_str = canonicalized_path.to_string_lossy();
+            let path_str = path_str.strip_prefix(r"\\?\").unwrap_or(&path_str);
+            println!("cargo:rustc-env=win_lib_path={path_str}");
+        } else {
+            println!("cargo:warning=failed to canonicalize Lib path: {:?}", lib_path);
+        }
🤖 Prompt for AI Agents
In `@crates/pylib/build.rs` around lines 14 - 31, The build script currently calls
canonicalize(...).to_str().unwrap(), which can panic on non‑UTF‑8 Windows paths;
change the emission to use a lossy conversion from the canonicalized Path (e.g.,
Path::to_string_lossy on the value returned by std::fs::canonicalize(&lib_path))
and print that into the cargo:rustc-env=win_lib_path value, and if
std::fs::canonicalize(&lib_path) returns an Err emit a visible build warning via
println!("cargo:warning=...") instead of silently skipping so missing
win_lib_path is obvious; update the code paths referencing canonicalized_path
and to_str().unwrap() accordingly.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 6983-6986: The DictMerge path currently uses the first mapping as
the accumulator which can mutate a caller-provided **kwargs dict; change
compile.rs so that when merging multiple mappings (the loop around emit!(self,
Instruction::DictMerge { index: 1 })), you first allocate a fresh empty
accumulator and merge each mapping into that fresh map instead of using the
first input as the target; specifically, before the for-loop that emits
Instruction::DictMerge, emit an instruction to create a new empty dict (so the
accumulator is new) and then merge all incoming `**` groups into that fresh
accumulator (update the code that emits Instruction::DictMerge and the
surrounding logic in the same block to use the new accumulator).
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)

962-962: Inconsistent downcast pattern with DictUpdate.

DictUpdate (line 934) uses a safe downcast with .expect():

let dict = dict.downcast_ref::<PyDict>().expect("exact dict expected");

But DictMerge uses an unsafe downcast without a SAFETY comment explaining the invariant. Consider using the same safe pattern as DictUpdate for consistency and defensive coding, or at minimum add a SAFETY comment like the other unsafe downcasts in this file (e.g., ListAppend, ListExtend).

♻️ Suggested fix for consistency
-                let dict: &Py<PyDict> = unsafe { dict_ref.downcast_unchecked_ref() };
+                let dict: &Py<PyDict> = unsafe {
+                    // SAFETY: compiler guarantees correct type
+                    dict_ref.downcast_unchecked_ref()
+                };

Or use the safe pattern:

-                let dict: &Py<PyDict> = unsafe { dict_ref.downcast_unchecked_ref() };
+                let dict = dict_ref.downcast_ref::<PyDict>().expect("exact dict expected");

Comment on lines +6983 to +6986
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid mutating user-provided **kwargs when merging.
DictMerge uses the first mapping as the accumulator. If the first group is **kwargs, this mutates the caller’s dict when multiple ** groups exist (observable vs CPython, which builds a fresh dict). Please ensure the accumulator is always a new map before merging any ** inputs.

🔧 Possible fix (merge into a fresh accumulator)
-        let mut size = 0;
-        let groupby = keywords.iter().chunk_by(|e| e.arg.is_none());
-        for (is_unpacking, sub_keywords) in &groupby {
-            if is_unpacking {
-                for keyword in sub_keywords {
-                    self.compile_expression(&keyword.value)?;
-                    size += 1;
-                }
-            } else {
-                let mut sub_size = 0;
-                for keyword in sub_keywords {
-                    if let Some(name) = &keyword.arg {
-                        self.emit_load_const(ConstantData::Str {
-                            value: name.as_str().into(),
-                        });
-                        self.compile_expression(&keyword.value)?;
-                        sub_size += 1;
-                    }
-                }
-                emit!(self, Instruction::BuildMap { size: sub_size });
-                size += 1;
-            }
-        }
-        if size > 1 {
-            // Merge all dicts: first dict is accumulator, merge rest into it
-            for _ in 1..size {
-                emit!(self, Instruction::DictMerge { index: 1 });
-            }
-        }
+        let mut have_accumulator = false;
+        let groupby = keywords.iter().chunk_by(|e| e.arg.is_none());
+        for (is_unpacking, sub_keywords) in &groupby {
+            if !have_accumulator {
+                emit!(self, Instruction::BuildMap { size: 0 }); // fresh accumulator
+                have_accumulator = true;
+            }
+            if is_unpacking {
+                for keyword in sub_keywords {
+                    self.compile_expression(&keyword.value)?;
+                    emit!(self, Instruction::DictMerge { index: 1 });
+                }
+            } else {
+                let mut sub_size = 0;
+                for keyword in sub_keywords {
+                    if let Some(name) = &keyword.arg {
+                        self.emit_load_const(ConstantData::Str {
+                            value: name.as_str().into(),
+                        });
+                        self.compile_expression(&keyword.value)?;
+                        sub_size += 1;
+                    }
+                }
+                emit!(self, Instruction::BuildMap { size: sub_size });
+                emit!(self, Instruction::DictMerge { index: 1 });
+            }
+        }
🤖 Prompt for AI Agents
In `@crates/codegen/src/compile.rs` around lines 6983 - 6986, The DictMerge path
currently uses the first mapping as the accumulator which can mutate a
caller-provided **kwargs dict; change compile.rs so that when merging multiple
mappings (the loop around emit!(self, Instruction::DictMerge { index: 1 })), you
first allocate a fresh empty accumulator and merge each mapping into that fresh
map instead of using the first input as the target; specifically, before the
for-loop that emits Instruction::DictMerge, emit an instruction to create a new
empty dict (so the accumulator is new) and then merge all incoming `**` groups
into that fresh accumulator (update the code that emits Instruction::DictMerge
and the surrounding logic in the same block to use the new accumulator).

@youknowone youknowone merged commit b76906a into RustPython:main Jan 19, 2026
13 checks passed
@youknowone youknowone deleted the stdlib_dir branch January 19, 2026 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant