install: fix broken dep symlinks in workspace members under a symlinked directory#29601
install: fix broken dep symlinks in workspace members under a symlinked directory#29601
Conversation
…ed directory When a workspace member's declared path traverses a symlink (e.g. a `repos/ext` subdir that symlinks to an out-of-tree directory), the isolated installer wrote each dep symlink target as a relative path computed from the logical workspace string — but the symlink itself landed at the symlink's real target. The `..`-prefix was anchored to the deeper logical path, so from the physical location it walked past the project root and pointed at nothing. `bun install` reported success, `bun <app>` errored with ENOENT on the dep. Resolve the workspace dir's real path once per workspace entry and anchor each dep link target at that canonical directory. The symlink's dest stays logical so the kernel follows the same chain it always did.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughInstaller resolves workspace member directories to their canonical realpaths when members live under symlinked directories and anchors package-local dependency symlink targets to that realpath. A regression test verifies runnable dependency links for a symlinked workspace member and install-time syscall failures now surface as install failures. Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
The canonical-path fix hard-coded the from-base at `<real_ws>/node_modules`, which is one level too shallow for a scoped dep: the symlink actually lives at `<real_ws>/node_modules/@scope/<name>`, so the computed `..`-prefix was missing one hop and scoped imports from the workspace dangled. Mirror the fallback's `undo(1)` on the canonical path (append the full dep_name including the scope segment, then trim one component) so the from-base matches the symlink's real parent directory for both scoped and unscoped names. Test extended to exercise both.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/install/isolated_install/Installer.zig`:
- Around line 905-915: The current code in the block using bun.sys.open and
bun.sys.getFdPath silently falls back to the label :resolve_canonical on error,
which causes dest.relative(&dep_store_path) to recreate the broken symlink
target; instead, handle failures by surfacing .symlink_dependencies (e.g.,
return or set the dependency representation to .symlink_dependencies) rather
than breaking to :resolve_canonical. Modify the error arms of the two switch
statements on bun.sys.open(...) and bun.sys.getFdPath(...) (the dir_fd and
real_buf handling) to propagate or assign the symlink dependency case
(.symlink_dependencies) and exit the function/flow appropriately, ensuring the
code that calls or inspects symlink handling sees .symlink_dependencies instead
of performing the old dest.relative logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 77504515-28ef-4388-a8eb-0e3aca90a1d7
📒 Files selected for processing (2)
src/install/isolated_install/Installer.zigtest/cli/install/isolated-install.test.ts
…path branch Two follow-ups to the symlinked-workspace symlink fix: - If opening or `getFdPath`ing the workspace dir fails, return `.symlink_dependencies` with the sys error instead of silently falling back to the logical-path computation — that fallback is exactly what produced the broken links this branch was added to fix, so swallowing an error here would let the original bug resurface on unusual filesystems. - The canonical-path branch carried a copy of the nested-`node_modules` collision check from the fallback, but `entryStoreNodeModulesPackageName` always returns null for `.workspace` entries, so that inner block could never fire. Remove it and note why in a comment. Also tighten the explanatory comments and use `runBunInstall` in the test.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/install/isolated_install/Installer.zig`:
- Around line 908-912: The current code places a large bun.PathBuffer (real_buf)
on the installer task stack before calling bun.sys.getFdPath; instead borrow a
buffer from bun.path_buffer_pool (e.g., call bun.path_buffer_pool.acquire() or
the pool API used elsewhere) into a bun.PathBuffer variable, pass that pooled
buffer to bun.sys.getFdPath, and ensure you release/return the buffer to
bun.path_buffer_pool in both the success and error paths (including the .err
branch that returns .failure(.{ .symlink_dependencies = err })). Update the code
around real_buf, the switch on bun.sys.getFdPath(), and any early returns so the
pooled buffer is always released.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7671c880-3019-4b6c-bbf5-121b49165ec4
📒 Files selected for processing (1)
src/install/isolated_install/Installer.zig
| var real_buf: bun.PathBuffer = undefined; | ||
| const real = switch (bun.sys.getFdPath(dir_fd, &real_buf)) { | ||
| .result => |r| r, | ||
| .err => |err| return .failure(.{ .symlink_dependencies = err }), | ||
| }; |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Move real_buf off the task stack.
bun.PathBuffer is MAX_PATH_BYTES-sized, so putting real_buf on the installer task stack adds a large per-thread allocation in a hot path. Borrow it from bun.path_buffer_pool before calling bun.sys.getFdPath().
Proposed fix
- var real_buf: bun.PathBuffer = undefined;
- const real = switch (bun.sys.getFdPath(dir_fd, &real_buf)) {
+ const real_buf = bun.path_buffer_pool.get();
+ defer bun.path_buffer_pool.put(real_buf);
+ const real = switch (bun.sys.getFdPath(dir_fd, real_buf)) {
.result => |r| r,
.err => |err| return .failure(.{ .symlink_dependencies = err }),
};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| var real_buf: bun.PathBuffer = undefined; | |
| const real = switch (bun.sys.getFdPath(dir_fd, &real_buf)) { | |
| .result => |r| r, | |
| .err => |err| return .failure(.{ .symlink_dependencies = err }), | |
| }; | |
| const real_buf = bun.path_buffer_pool.get(); | |
| defer bun.path_buffer_pool.put(real_buf); | |
| const real = switch (bun.sys.getFdPath(dir_fd, real_buf)) { | |
| .result => |r| r, | |
| .err => |err| return .failure(.{ .symlink_dependencies = err }), | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/install/isolated_install/Installer.zig` around lines 908 - 912, The
current code places a large bun.PathBuffer (real_buf) on the installer task
stack before calling bun.sys.getFdPath; instead borrow a buffer from
bun.path_buffer_pool (e.g., call bun.path_buffer_pool.acquire() or the pool API
used elsewhere) into a bun.PathBuffer variable, pass that pooled buffer to
bun.sys.getFdPath, and ensure you release/return the buffer to
bun.path_buffer_pool in both the success and error paths (including the .err
branch that returns .failure(.{ .symlink_dependencies = err })). Update the code
around real_buf, the switch on bun.sys.getFdPath(), and any early returns so the
pooled buffer is always released.
|
CI failed only on Line 778 asserts |
… GC flake Build 47298 failed with: - windows-11-aarch64: socket.test.ts 'should not leak memory' (TCPSocket <= 3 but got 4 — known Windows GC-timing flake, tracked via prior 'widen threshold' commits bd347a4 and 1dcd25d) - darwin-13/14 x64/aarch64: agents never picked up, jobs expired None related to this PR. The symlinked-workspace test passed cleanly on every platform it ran on (44/44 on debian-13-x64).
| // If the workspace path traverses a symlink, dep symlinks | ||
| // live under the realpath, so their relative targets must | ||
| // be computed from there — not from the logical string, | ||
| // which would walk past the project root. Null when the | ||
| // path is already canonical so the hot path stays | ||
| // syscall-free. |
There was a problem hiding this comment.
🟡 nit: the comment's "Null when the path is already canonical so the hot path stays syscall-free" (and the PR description's "the extra open is skipped") read as if the open+getFdPath probe is gated on canonical-ness, but it runs unconditionally for every workspace entry — only the assignment at line 914 is gated. The cost is negligible (3 syscalls × workspace count, dwarfed by the per-dep symlink()s below), so this is just a wording fix: consider rephrasing to e.g. "Null when the realpath matches, so the per-dep loop takes the existing fallback unchanged."
Extended reasoning...
What's inaccurate
The comment at Installer.zig:889-894 ends with: "Null when the path is already canonical so the hot path stays syscall-free." The PR description similarly says: "When the workspace path is already canonical, the extra open is skipped — the existing hot path runs unchanged." Both sentences causally link "path is already canonical" → "no extra syscalls". But the implementation at lines 897-919 unconditionally executes bun.sys.open() (line 902), bun.sys.getFdPath() (line 909), and dir_fd.close() (line 906 deferred) for every entry where pkg_res.tag == .workspace. The eqlLong check at line 914 happens after those syscalls and only gates whether canonical_entry_node_modules gets populated — not whether the realpath probe runs.
Step-by-step proof
For a non-symlinked workspace member packages/pkg1:
- Line 897:
pkg_res.tag == .workspaceis true → enter the block. - Line 902:
bun.sys.open("<top>/packages/pkg1", ...)— syscall 1. - Line 909:
bun.sys.getFdPath(dir_fd, ...)— syscall 2 (e.g.readlink(/proc/self/fd/N)on Linux,fcntl(F_GETPATH)on macOS). - Line 914:
realequalsworkspace_abs.slice(), socanonical_entry_node_modulesstaysnull. - Line 906 (deferred):
dir_fd.close()— syscall 3. - The per-dep loop then takes the fallback
dest.relative(...)branch — pure string math, identical to pre-PR behavior.
So the canonical-path workspace pays 3 syscalls it didn't pay before, contradicting both the comment and the PR description. There is no code path where "the extra open is skipped" for a canonical workspace.
Addressing the alternative reading
A counter-argument is that "hot path" refers to non-workspace entries (npm/git/tarball — the vast majority of store entries), which skip the entire if (pkg_res.tag == .workspace) block and genuinely add zero syscalls. That reading makes the implementation defensible, but it doesn't fit the sentence: "Null when the path is already canonical so the hot path stays syscall-free" explicitly conditions on "path is already canonical", which is a property only meaningful inside the workspace branch (non-workspace entries have no canonical-vs-logical distinction here). If the intended meaning were "non-workspace entries are unaffected", the sentence wouldn't mention canonical-ness at all. The PR description's "the extra open is skipped" is even more explicit and is simply not what the code does.
It's also worth noting that both branches of the per-dep target: block (lines 968-986) are pure string arithmetic — neither does syscalls — so "syscall-free" doesn't actually distinguish the null vs. non-null cases inside the loop either.
Why it's only a nit
The perf impact is immaterial: 3 extra syscalls × workspace-member count, executed once per install task, in a step whose loop body immediately below issues a symlink() syscall per dependency. Even a 1000-workspace monorepo adds ~3000 syscalls spread across thread-pool workers — noise. And the suggested mitigation of "lstat each path component first" would typically cost more syscalls than the current open+getFdPath+close, so don't add a pre-check; the unconditional probe is the right implementation choice.
Suggested fix
Just reword the last sentence of the comment so a future reader doesn't assume the probe is conditional, e.g.:
// ... Null when the realpath matched the logical path, so the
// per-dep loop below takes the existing `dest.relative(...)`
// fallback unchanged.(The PR description is out-of-scope for merged code, but if you're updating it anyway, drop "the extra open is skipped".)

Fixes #29598.
Repro
bun installsucceeds.bun app/index.jsthen fails with:The written link:
Four
..s fromexternal/shared-lib/node_moduleswalks past the project root.Cause
src/install/isolated_install/Installer.zig, the.symlink_dependenciestask step:destis built by string-joiningtop_level_dir+ the lockfile'sworkspace_path+"node_modules"+dep_name. For a workspace member listed through a path that traverses a symlink (repos/ext/shared-libhere), that string path resolves physically to a different directory than the joined string suggests — in the repro, the stagingnode_modulesreally lives atexternal/shared-lib/node_modules(three levels under/tmp/bug-repro), notroot/repos/ext/shared-lib/node_modules(four levels).dest.relative(&dep_store_path)is pure string arithmetic, so it counted the deeper logical depth and emitted one extra...The link landed at its real location, but with a content string anchored to a path that doesn't exist from there.
Fix
When the entry is a workspace, resolve the workspace directory's real path once (
open+getFdPath) before the dep loop. Use that as thefromforPath.relativeso the stored target string is relative to where the symlink is physically created, not to the symlinked logical path. Thedestpassed tosys.symlinkstays logical so the kernel follows the same chain it always did. When the workspace path is already canonical, the extraopenis skipped — the existing hot path runs unchanged.Verification
is-number -> ../../../root/node_modules/.bun/...(3..s),bun app/index.jsprintstrue.readlink -fresolves the link to the.bun/store entry...link and runs — no regression.test/cli/install/isolated-install.test.ts.