[SUB-BISECT] WebKit pr-191: bcd47e7f3fe7 C++-only (asm reverted) by Jarred-Sumner · Pull Request #29620 · oven-sh/bun · GitHub
Skip to content

[SUB-BISECT] WebKit pr-191: bcd47e7f3fe7 C++-only (asm reverted)#29620

Draft
Jarred-Sumner wants to merge 44 commits intomainfrom
webkit-bisect-pr191
Draft

[SUB-BISECT] WebKit pr-191: bcd47e7f3fe7 C++-only (asm reverted)#29620
Jarred-Sumner wants to merge 44 commits intomainfrom
webkit-bisect-pr191

Conversation

@Jarred-Sumner
Copy link
Copy Markdown
Collaborator

Sub-bisect of oven-sh/WebKit#190 / #29393. Tests WebKit autobuild-preview-pr-191-62f63bf3 which keeps the IPInt asm at the working state and re-applies only the 13 C++/generator/assembler files from bcd47e7f3fe7.

Disassembly proved all 728 ipint_* handlers are byte-identical between broken/working Windows builds, so the asm cannot be the cause.

If 🪟 x64 pglite fails here → bug is in OptionsList.h / WasmBBQJIT* / WasmOMGIRGenerator / WasmFunctionParser.h / X86Assembler.h.
If it passes → the disassembly comparison missed something (COFF section/reloc).

Compare against the parallel config-only bisect (WebKit #192). Do not merge.

sosukesuzuki and others added 30 commits April 17, 2026 13:59
JSC's module loader was rewritten from a JS+C++ hybrid to pure C++
(upstream 4a638109b905), JSInternalPromise was removed, and JSModuleLoader
is now a JSCell with a C++ ModuleMap instead of a JS-visible `registry`
JSMap.

Bun-side adaptation:
- GlobalObjectMethodTable hook signatures (JSPromise* return,
  moduleLoaderResolve gains a trailing bool)
- JSInternalPromise -> JSPromise everywhere; the Zig JSInternalPromise
  type is now a transparent alias
- JSType.zig: 5 new SpecCellOther types inserted before ObjectType
- CommonJS.ts no longer reimplements the loader pipeline. require(esm)
  now calls a single C++ helper ($esmLoadSync) which sets
  vm.m_synchronousModuleLoadingDepth so the new loader's internal
  microtask reactions run inline whenever the upstream promise is
  already settled, preserving synchronous evaluation. New private
  helpers $esmNamespaceForCjs / $esmRegistryDelete /
  $esmRegistryEvaluatedKeys replace direct Loader.registry access.
- Registry inspection/mutation in C++ (BunPlugin, InspectorLifecycleAgent,
  Worker, bindings.cpp, ModuleLoader.cpp, BakeSourceProvider) moved to
  the new JSModuleLoader::registryEntry/removeEntry/clearAll/moduleMap
  accessors added in the WebKit fork.
- moduleLoaderFetch routes through fetchESMSourceCodeSync when the
  synchronous depth is non-zero so the entire dependency graph loads
  without yielding to user microtasks.
- moduleLoaderImportModule maps known import-attribute type strings to
  the corresponding ScriptFetchParameters enum instead of always using
  HostDefined.
- Bun__onFulfillAsyncModule resolves unconditionally now that the entry
  is created after fetch settles.
…ingDepth)

The recursive runInternalMicrotask path overflowed the C stack on the
500-module bench/module-loader meta.require.mjs case. The WebKit side now
diverts settled-promise reactions to a stack-local Vector and drains it
in a loop; mirror that here:

- moduleLoaderFetch gates the synchronous fetch on
  vm.m_synchronousModuleQueue being installed
- the eager provideFetch in fetchCommonJSModule allocates its own local
  queue and drains it via JSModuleLoader::drainSynchronousModuleQueue
… NodeVM link, snapshots

- moduleLoaderImportModule: parseType() returns HostDefined for any
  unknown non-empty type string under USE(BUN_JSC_ADDITIONS); routing
  that into ScriptFetchParameters::create(Type) tripped its assert. Only
  use the enum path for recognised types.
- exposeInternalModuleLoader: JSModuleLoader is now a JSCell, so
  exposing it as the public Loader global let user code dereference a
  non-object and trip JSValue::synthesizePrototype's isSymbol() debug
  assert. Stop opting in.
- NodeVMSourceTextModule::link: records built outside the loader start
  at Status::New; bump to Unlinked before CyclicModuleRecord::link.
- functionEsmLoadSync: when the load promise is still Pending because
  this module shares an SCC with a still-evaluating outer module, treat
  a non-TLA record that has reached Evaluating as done. Drop the
  separate post-Fulfilled cyclic-status gate (the body has run by then).
- InspectorLifecycleAgent: ModuleMap is hash-ordered; sort keys so
  moduleGraph.esm is stable.
- Test fixtures/snapshots: replace Loader.registry usage with
  require.cache (or skip where there's no public equivalent), update the
  console-log/BunFrontendDevServer snapshots, and switch the
  esModule.test.ts TLA self-import to a static import.

12910/json5/toml/yaml/import-query/esModule pass locally. 06946 still
fails when t2's CJS body (run inside makeModule) hits a dep the outer
graph is mid-load on; the corresponding WebKit force-advance landed in
oven-sh/WebKit c8ca4c7694c2 and brings the inner graph to evaluated, but
the top-level resultPromise remains Pending — needs follow-up.
When a dynamic import's link phase fails (unresolved binding), the
record is left at Status::New/Unlinked with no environment.
mock.module's existing-ESM path then called getModuleNamespace on it
which asserts. Gate on CyclicModuleRecord::status() >= Linked; for
records that never got that far, fall back to removing the entry so the
mock takes effect on the next import.

Also drops a leftover debug branch in functionEsmLoadSync.
- reportError.test.ts: the C++ module loader no longer puts a
  loadAndEvaluateModule frame on the JS stack, so drop it from the
  inline snapshot.
- syntax.test.ts: 'await import(import.meta.path)' is a top-level await
  on the importing module's own evaluation promise, which the
  spec-compliant loader correctly leaves unsettled. The old loader
  resolved it eagerly. Node prints an 'unsettled top-level await'
  warning and exits; we now hang the event loop. Removing the case here
  pending a decision on whether to add Node-style detection.
…g into Zig

evaluateNonVirtual can leave an exception on the scope; the next call
into Bun__VM__specifierIsEvalEntryPoint goes through String.fromJS on
the Zig side which trips the unchecked-exception verifier. Surface the
exception via RETURN_IF_EXCEPTION first.

Also update bundler_splitting NestedDynamicImportWithCSS expectations:
under BUN_FEATURE_FLAG_DISABLE_ASYNC_TRANSPILER the new loader's
spec-compliant microtask ordering matches Node, so the dynamic-import
log lines now interleave the same way.
The new C++ module loader picks JSModuleRecord::execute's sync/async
path off hasTLA(). Bun's --bytecode path constructs the JSModuleRecord
directly via Bun__analyzeTranspiledModule and never set the flag, so a
record whose body suspends on await was driven through the synchronous
path: the resumed generator was dropped and the entry promise fulfilled
immediately, leaving the standalone executable to exit 0 with no output.

Thread the top-level-await bit from the bundler/transpiler through
ModuleInfo.Flags (one spare padding bit) into JSC_JSModuleRecord__create,
which now calls result->hasTLA(hasTLA). Covers both the chunk emitter
(postProcessJSChunk) and the isolation-cache parse_result path.
The old check inspected the JS-registry entry's shared fetch promise, so
returning early was harmless: every importer awaited the same promise.
The C++ loader doesn't create the registry entry until ModuleLoadTopSettled
runs provideFetch, so two import()s of the same specifier each get their
own embedder fetch promise. Dropping the loser left it pending forever,
which is the elysia 'inline import' timeout.

provideFetch downstream is idempotent, so just resolve unconditionally.

Also catch the now-correctly-rejected import() in the #12910 fixture and
add a regression test for the concurrent-import case.
…, allowlist new symbols

bake/production.zig: loadAndEvaluateModule's promise now fulfils with the
module key (or record), not undefined. The bun.assert(val.isUndefined())
guards only fired on allow_assert builds, which is exactly the
Windows-ReleaseSafe + Linux-ASAN matrix that was crashing.

ZigGlobalObject moduleLoaderResolve: requestImportModule now re-resolves a
key that moduleLoaderImportModule already resolved through a plugin's
onResolve. If the key carries a namespace prefix that's registered in
onLoadPlugins, return it as-is instead of falling through to the
filesystem resolver (which was rejecting url://-style virtual specifiers
in plugins.test.ts).

ModuleLoader.cpp: add a FIXME for the duplicate-transpile cost of
concurrent imports.

verify-baseline-static: WebKit's bundled simdutf picked up
binary_length_from_base64 (haswell, gated by set_best()), and the IPInt
metadata removal made the i64 atomic cmpxchg validate stub spill through
vmovdqu on Windows offlineasm.
[kLink] walks the dependency graph depth-first; calling record->link()
inside that walk runs innerModuleLinking before the parent's
loadedModules() are populated, so a cycle (foo<->bar) hit
getImportedModule on a record whose map was still empty and
dereferenced end()->value (segfault 0xfff...ff8 / ASSERT Unlinked).

Populate loadedModules and bump status during the walk, then call
record->link() exactly once from instantiate() — which Module#link
already invokes after the recursive [kLink] settles. Also flip each
wired-in dependency past Status::New so the inner-linking assertion
holds for the whole subgraph.
The C++ module loader keeps fetch/module/load JSPromises live in the
module map for the lifetime of each registry entry. The old loader used
JSInternalPromise so heapStats().Promise didn't count them. They are
constant across the 50 iterations, so widen only the Promise threshold
by a fixed +10 instead of loosening the per-batch leak check.
reportExceptionInHotReloadedModuleIfNeeded used JSPromise.isHandled() as
its "already printed this rejection" sentinel. The C++ module loader
markAsHandled()s every pipeline promise before it reaches us, so the
check was always false and a throwing reload printed nothing — recover-
from-errors and the sourcemap hot tests timed out waiting for stderr.

Switch to comparing hot_reload_counter against a stored
pending_internal_promise_reported_at generation. Pointer identity isn't
usable here either: GC can hand the next reload's promise the previous
one's address. Set the generation after the initial-load uncaughtException
print so the first hot-loop tick doesn't repeat it.

Also add a regression test for the TLA-cycle dynamic-import deadlock
fixed in the WebKit fork at 18d48e326c2b (the js-sink-sourmap-fixture /
bun-server hang).
…in is pending

The C++ loader's loadAndEvaluateModule resolves through a microtask
chain, so reload() returns with pending_internal_promise still pending.
A second watcher event arriving in that window would clearAll() and
start a second chain on the same registry; both then settled and the
test saw duplicate/partial-graph Reloaded: lines (queue exhaustion in
'should recover from errors', 0%->8%).

If pending_internal_promise is still pending, just set
hot_reload_deferred and return without touching the registry. When the
in-flight promise settles, reportExceptionInHotReloadedModuleIfNeeded
re-invokes reload() once. 0/250 in 250-run probe vs 8/50 before;
matches system bun.
When the consumer disconnects mid-stream, the direct controller's
write/end/flush are swapped to onReadableStreamDirectControllerClosed.
runAsyncIterator's next write throws, the finally re-throws, and
pull()'s promise rejects. That rejection used to be a JSInternalPromise
(builtin async functions emitted those), which the unhandled-rejection
tracker ignored; now it's an ordinary Promise so the test runner sees
'Unhandled error between tests' even though all 86 tests pass.

Catch the rejection in pull() and drop it when controller.write has
already been swapped to the closed stub — the consumer is gone, there is
nothing to propagate to.
…xchg

WebKit's bundled simdutf grew binary_length_from_base64 and
utf8_length_from_utf16{be,le}_with_replacement (haswell + icelake), and
the icelake utf8_length_from_utf16 paths picked up a BMI2 shrx. All sit
behind set_best()'s CPUID dispatcher.

ipint_i64_atomic_rmw32_cmpxchg_u (Linux; Windows _validate variant
already listed) reloads the SIMD accumulator with VEX vmovdqu after
upstream ef40cabc28d2's IPInt metadata removal — gated by
Options::useWasmSIMD().
Update the four provideFetch sites that pushed a stack
Vector<SynchronousModuleTask> onto vm.m_synchronousModuleQueue to use
the new prev-linked SynchronousModuleQueue frame so VM::visitAggregate
can mark every queued JSValue.

Add a regression test that drives require(esm) through a 9-import graph
under BUN_JSC_collectContinuously=1 — fails 9/30 on the unpatched
loader, 0/150 with the WebKit fix in place.
autofix-ci Bot and others added 14 commits April 20, 2026 06:05
collectContinuously runs a full GC on every allocation; 30 parallel
processes each doing that on Windows blew past 30s with all 30 still
running. The bug reproduced at ~9/30 without the fix, so 8 attempts is
still a meaningful detector.
The C++ module loader keeps a handful of pipeline promises alive in the
registry; one of them can transitively pin a single TCPSocket through
the test's own closure chain for an extra GC cycle. The default 1s
maxWait wasn't enough on Windows (received 4, expected <= 3).
5s maxWait did not help — the extra TCPSocket is genuinely pinned, not
just slow to collect. Widen the Windows threshold by one and leave a
FIXME pointing at the module-map closure retention until we can
reproduce on a Windows box.

require-esm-gc-roots: drop collectContinuously on Windows (>60s per
subprocess on x64-baseline). Use --smol + forceRAMSize=64MB instead,
which is the exact CI condition that originally surfaced the bug on
Linux aarch64.
…isting entries

EvaluatingAsync means the record (or a dependency) has top-level await
and bindings can still be in TDZ; only treat exactly Evaluating or
Evaluated as the synchronous-SCC fast-path. When the load promise
settles Pending, only removeEntry() if this call created it — an entry
that was already there belongs to an outer import() or a prior async
load, and dropping it forces a second evaluation and a second
namespace object.

moduleLoaderResolve: skip the plugin-namespace short-circuit for
'X:' where X is a single ASCII letter (Windows drive paths), and leave a
FIXME for the remaining filter/onResolve gap.

NodeVMSourceTextModule::link: reuse requestedModules()[i] (specifier +
attributes) when wiring dependencies so the typed lookup matches.

vm.ts: FIXME for the [kLink]-before-instantiate() status mismatch.

Regression tests: require-esm-transitive-tla, require-esm-microtask-order,
plugin-namespace-drive-letter.
…M generator

reifyAllStaticProperties walks every PropertyCallback back-to-back without
an exception check between them. Several of our callbacks
(getBuiltinModulesObject, getGlobalPathsObject) call constructArray /
constructEmptyArray which open a ThrowScope at the same recursion depth as
the next callback's, so the second one tripped the exception-check
verifier. This only became reachable now that 'import "node:module"' goes
through the C++ loader's makeModule -> SyntheticSourceProvider path (the
old loader reified via the CJS shim).

The export loop right below already does constructor->get(property)
per-export, which lazy-reifies one entry at a time inside JSObject::get's
own ThrowScope and is checked immediately after, so the bulk reify was
redundant anyway.
…rted)

Points at oven-sh/WebKit#190 which reverts every IPInt change picked up
in #184. JSC_useWasmIPInt=0 makes pglite/pg-gateway pass on Windows so
the regression is in this set; this build tells us whether the revert is
sufficient before we bisect inside the seven.
…rted)

Stacks the bcd47e7f3fe7 (wasm wide arithmetic) revert on top of the 7
prior IPInt reverts. With this, every IPInt-path source file
(InPlaceInterpreter*.asm, offlineasm/x86.rb, WasmIPIntGenerator.*,
WasmFunctionIPIntMetadataGenerator.*, WasmIPIntSlowPaths.*) is
byte-identical to the previous upstream sync point d924138e38f4.

If pglite still fails on Windows x64 with this preview, the IPInt
source path is provably unchanged from last-good and the cause is
outside the wasm/llint files.
…ted)

If pglite fails on Windows here, the bug is in OptionsList.h /
WasmBBQJIT* / WasmOMGIRGenerator / WasmFunctionParser.h / X86Assembler.h
(not the IPInt asm — disassembly proved 728 handlers byte-identical).
@robobun
Copy link
Copy Markdown
Collaborator

robobun commented Apr 23, 2026

@github-actions
Copy link
Copy Markdown
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants