WIP: Renderer Async Compilation by RenaudRohlinger · Pull Request #33766 · mrdoob/three.js · GitHub
WIP: Renderer Async Compilation #33766

Draft
RenaudRohlinger wants to merge 3 commits into mrdoob:dev from RenaudRohlinger:async-compilation-v2

RenaudRohlinger (Collaborator) commented Jun 10, 2026

WIP -- Working but needs reviews and more tests.

I’m targeting roughly r187, in time for the Three.js Conference in Paris. I’m thinking of giving my talk on Three.js performance and best practices, and this work would be part of it.


This is something I've been thinking about for a long time — a truly non-blocking three.js renderer, enabled with a single boolean, and it relates to a point @sunag raised a while back as well in #33042 (comment).

The observation behind it: a material depends on two heavyweight GPU resources, and today they follow opposite policies. Its textures are already non-blocking — a mesh draws right away and the image data pops in when ready. Its shader/pipeline, however, is still built synchronously inside the frame, and the frame waits. This PR gives the pipeline the same policy the material's textures already have:

const renderer = new THREE.WebGPURenderer( { asyncCompilation: true } );

With the flag set, the renderer never builds TSL/WGSL and never creates GPU pipelines on the render path:

  • New materials compile in the background and pop in when ready — exactly like textures.
  • Changed materials keep drawing their last compiled state until the replacement swaps in atomically at the start of a frame (keep-last). No flicker, no skipped frames, no stall.

The default path (asyncCompilation: false) is byte-for-byte untouched.
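
To make the two policies above concrete, here is a minimal, self-contained sketch of "pop in when ready" and "keep-last" with a swap at frame start. It is not the code in this PR: `KeepLastSlot`, `queuePromotion()`, `beginFrame()` and `draw()` are hypothetical names, and compiled states are modeled as plain strings.

```javascript
// Hypothetical model of the keep-last policy; none of these names are three.js APIs.
class KeepLastSlot {

	constructor() {

		this.current = null; // last compiled state, or null while a new material is still compiling
		this.pending = null; // finished replacement waiting for the next frame start

	}

	// Called by background work once a replacement has finished compiling.
	queuePromotion( compiledState ) {

		this.pending = compiledState;

	}

	// Called once at the start of a frame: the only place where a swap may happen.
	beginFrame() {

		if ( this.pending !== null ) {

			this.current = this.pending;
			this.pending = null;

		}

	}

	// Called on the render path: never compiles, only reads the current state.
	// A null result means "skip this object for now" (it pops in later).
	draw() {

		return this.current;

	}

}

const slot = new KeepLastSlot();
const drawn = [];

slot.beginFrame(); drawn.push( slot.draw() ); // new material: nothing to draw yet
slot.queuePromotion( 'pipeline-A' );
drawn.push( slot.draw() ); // same frame: still nothing, no mid-frame swap
slot.beginFrame(); drawn.push( slot.draw() ); // A pops in
slot.queuePromotion( 'pipeline-B' ); // material was edited, replacement is ready
drawn.push( slot.draw() ); // still A until the next frame start
slot.beginFrame(); drawn.push( slot.draw() ); // B swapped in

console.log( drawn ); // [ null, null, 'pipeline-A', 'pipeline-A', 'pipeline-B' ]
```

The point of the model is that `draw()` only ever reads `current`, so the cost of compilation never lands on the render path in this sketch.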

Before / after

Today — compilation happens inside the frame

flowchart LR
    A["render()"] --> B{"material new<br>or changed?"}
    B -- no --> F["draw"]
    B -- yes --> C["dispose + recreate<br>render object"]
    C --> D["TSL build + WGSL codegen<br>main thread, unbounded"]
    D --> E["createRenderPipeline()<br>synchronous driver compile"]
    E --> F
    classDef blocking fill:#7f1d1d,stroke:#ef4444,color:#fff
    class C,D,E blocking

The whole frame waits on the red boxes: worst frames of 60–130 ms with warm driver caches, and multiple seconds with cold ones.

With asyncCompilation: true — the frame never waits

flowchart LR
    subgraph RP["render path — never compiles"]
        A["render()"] --> B{"structural<br>change?"}
        B -- no --> F["draw"]
        B -- yes --> R["capture draw-state snapshot<br>+ request replacement"]
        R --> K["keep drawing<br>last compiled state"]
        K --> F
    end
    subgraph BG["background driver — idle time"]
        Q["priority queue"] --> S["NodeBuilder.buildStep()<br>sliced, ~2 ms budget"]
        S --> P["createRenderPipelineAsync()<br>browser-internal threads"]
        P --> M["promotion queue"]
    end
    R -. enqueue .-> Q
    M -. "next frame, at the safe point:<br>atomic swap" .-> A
    classDef safe fill:#14532d,stroke:#22c55e,color:#fff
    class K,F safe

Lifecycle of a material edit

sequenceDiagram
    participant App
    participant Main as Main thread (render path)
    participant Driver as Background driver (idle slices)
    participant GPU as GPU process

    App->>Main: material.needsUpdate = true
    Main->>Main: frame N — draws OLD pipeline
    Main-)Driver: request replacement (draw-state snapshot + request-time lights)
    Driver->>Driver: buildStep() slices between frames (~2 ms each)
    Main->>Main: frame N+1 … — still drawing OLD pipeline
    Driver-)GPU: createRenderPipelineAsync()
    GPU--)Driver: pipeline ready
    Driver-)Main: queue promotion
    Main->>Main: frame N+k — atomic swap at frame start, draws NEW pipeline

How it works

The design decision that keeps this PR small: the background candidate is just a second RenderObject. It already owns every field a compiled state needs (_nodeBuilderState, _bindings, pipeline, attributes), it rides the classic cache/refcount lifecycle end to end (nodeBuilderCache, pipeline/program usedTimes, Bindings), and promotion is a chain-map pointer swap — the replaced render object releases its resources through the regular dispose() path. There is no parallel resource-ownership system anywhere.

  • RenderObjects.get() detects the structural change, captures an immutable draw-state snapshot (pass-correct side, blending, stencil, wireframe, … — taken during traversal) and keeps the old render object drawing.
  • The new AsyncCompilation driver — the only new file — builds one node graph at a time, sliced with the new NodeBuilder.buildStep( deadline ) over requestIdleCallback (with a setTimeout backstop; hidden tabs keep compiling).
  • Pipeline creation is fire-and-collect through the existing createRenderPipelineAsync path; completions queue a promotion. The WebGL fallback uses the same flow via KHR_parallel_shader_compile.
  • Promotions apply only at the entry of a top-level render() — nested renders (shadow maps, transmission, bundles) never observe a mid-frame swap. Render bundles containing a swapped object re-record automatically.
  • Structural mutations that don't bump material.version (e.g. blend-mode values, transparent) are caught by a change detector primed from the compiled snapshot.
  • Render lists classify changed materials from their compiled snapshot, so list bucket, pass membership and pipeline always agree during a transition.
  • A broken build logs once (TSL: …), enters a bounded failure cache and keeps the last compiled state until the structural key changes — broken shaders are never recompiled every frame.
  • Rapid edits A→B→C drop superseded candidates; only the latest promotes.
  • Geometry vertex-layout changes fall back to the classic dispose+recreate (keep-last across a layout change would bind invalid buffers).
  • compileAsync() routes through the same driver in both modes; prewarmed objects draw on their first frame. On a cold start objects appear as they become ready — compileAsync() is the remedy when pop-in is unacceptable.
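
The failure handling described in the list above (log once, bounded cache, no rebuild until the structural key changes) could look roughly like the following sketch. `FailureCache`, `tryBuild()`, the FIFO eviction and the default limit of 64 are assumptions made for illustration; only the `TSL: …` log prefix and the overall behavior come from the description.

```javascript
// Hypothetical bounded failure cache keyed by a structural key; not the PR's implementation.
class FailureCache {

	constructor( limit = 64 ) {

		this.limit = limit; // assumed bound, chosen arbitrarily for this sketch
		this.keys = new Set(); // a Set iterates in insertion order, which gives simple FIFO eviction

	}

	has( key ) {

		return this.keys.has( key );

	}

	add( key ) {

		if ( this.keys.has( key ) ) return;

		if ( this.keys.size >= this.limit ) {

			// evict the oldest entry so the cache stays bounded
			this.keys.delete( this.keys.values().next().value );

		}

		this.keys.add( key );

	}

}

// Attempts a build unless this structural key is already known to be broken.
// Returns the build result, or null to signal "keep the last compiled state".
function tryBuild( key, buildFn, failures, log = console.error ) {

	if ( failures.has( key ) ) return null; // known-bad key: no rebuild, no second log line

	try {

		return buildFn();

	} catch ( error ) {

		failures.add( key );
		log( `TSL: ${ error.message }` ); // reached once per failing structural key
		return null;

	}

}

const demoFailures = new FailureCache();
const demoLog = [];
const brokenBuild = () => { throw new Error( 'unknown node type' ); };

// The same broken key is requested on three consecutive frames.
for ( let frame = 0; frame < 3; frame ++ ) tryBuild( 'material#1:v1', brokenBuild, demoFailures, ( m ) => demoLog.push( m ) );

console.log( demoLog ); // [ 'TSL: unknown node type' ]
```

A changed structural key (for example `'material#1:v2'` in this sketch) misses the cache and is built again, which matches the "until the structural key changes" wording.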

API surface

  • asyncCompilation constructor option (default false).
  • object.compilePriority / material.compilePriority — optional hint: > 0 compiles ahead of all automatic replacement work, < 0 after it.
  • renderer.onBackgroundWorkReady — notification for on-demand renderers, fired when finished background work is ready to display.
  • renderer.info.asyncCompilation { queued, pipelines, promotions, failed, mainThreadTime }.
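
One way to picture the `compilePriority` hint is as a sort key on the compile queue. The sketch below assumes that automatic replacement work sits at priority 0 and that larger positive values compile earlier; both are assumptions, as are `orderCompileQueue()` and the shape of the request objects. How an object-level and a material-level priority combine is not modeled here.

```javascript
// Hypothetical queue entries; only the `compilePriority` name is taken from the PR description.
function orderCompileQueue( requests ) {

	// Array.prototype.sort is stable, so requests with equal priority keep their arrival order.
	return [ ...requests ].sort( ( a, b ) => ( b.compilePriority ?? 0 ) - ( a.compilePriority ?? 0 ) );

}

const ordered = orderCompileQueue( [
	{ name: 'background-prop', compilePriority: - 1 }, // < 0: after automatic work
	{ name: 'edited-material' }, // automatic replacement work, assumed to sit at priority 0
	{ name: 'hero-character', compilePriority: 10 }, // > 0: ahead of automatic work
	{ name: 'second-edit' } // automatic work, arrived later
] );

console.log( ordered.map( ( r ) => r.name ) );
// [ 'hero-character', 'edited-material', 'second-edit', 'background-prop' ]
```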

Scope

  • With the flag off, behavior is unchanged. In async mode, an unchanged scene adds zero per-frame allocation or key computation — the detector compare is the same one needsRenderUpdate() already performs.
  • No worker compilation in this PR — buildStep( deadline ) is exactly the seam a worker compiler plugs into later, as its own PR.
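
As a rough illustration of the `buildStep( deadline )` seam, the sketch below slices a build into small units and stops each slice once a deadline has passed. `SlicedBuild`, `buildInBackground()` and the injectable clock are hypothetical; the 2 ms budget, the idle-callback scheduling with a `setTimeout` fallback, and `build()` running `buildStep( Infinity )` follow the description in this PR, but the code is not taken from it.

```javascript
// Hypothetical stand-in for a node build that is split into small units of work.
class SlicedBuild {

	constructor( units, now = () => performance.now() ) {

		this.units = units; // array of functions, each one small unit of work
		this.index = 0;
		this.now = now; // injectable clock, so the sketch can be exercised deterministically

	}

	// Runs units until the work is finished or the deadline has passed.
	// At least one unit runs per call, so every slice makes progress.
	// Returns true once the build is complete.
	buildStep( deadline ) {

		while ( this.index < this.units.length ) {

			this.units[ this.index ++ ]();

			if ( this.now() >= deadline ) break;

		}

		return this.index === this.units.length;

	}

	// Synchronous build: the same sequence, without a deadline.
	build() {

		return this.buildStep( Infinity );

	}

}

// Drives a build in slices of `budget` ms during idle time.
// Falls back to setTimeout where requestIdleCallback is not available.
function buildInBackground( build, budget = 2 ) {

	const schedule = typeof requestIdleCallback === 'function' ? requestIdleCallback : ( fn ) => setTimeout( fn, 0 );

	return new Promise( ( resolve ) => {

		const tick = () => {

			if ( build.buildStep( build.now() + budget ) ) resolve();
			else schedule( tick );

		};

		schedule( tick );

	} );

}

// Deterministic demo: a fake clock where every unit "costs" 1 ms.
let fakeTime = 0;
const finished = [];
const demoUnits = Array.from( { length: 5 }, ( _, i ) => () => { fakeTime += 1; finished.push( i ); } );
const sliced = new SlicedBuild( demoUnits, () => fakeTime );

let slices = 1;
while ( sliced.buildStep( fakeTime + 2 ) === false ) slices ++;

console.log( slices, finished ); // 3 [ 0, 1, 2, 3, 4 ]
```

Because the slicing lives entirely behind `buildStep()`, a different driver (a worker, for instance) could call the same method with its own deadline, which is the sense in which it is a seam.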

TODO

  • Compute pipelines are out of scope here and still compile synchronously (device.createComputePipeline on the dispatch path); async compilation covers render pipelines only. Follow-up: revive #32551 (WebGPURenderer: Introduce compileComputeAsync()) on top of this PR's NodeBuilder.buildStep() slicing and async pipeline creation path.

This contribution is funded by Renaud Rohlinger

github-actions Bot commented Jun 10, 2026

RenaudRohlinger force-pushed the async-compilation-v2 branch 3 times, most recently from f76ae49 to ab6b8c5 on June 11, 2026 11:59
RenaudRohlinger and others added 3 commits June 16, 2026 21:16
build() runs buildStep( Infinity ) and buildAsync() slices buildStep()
between flow-node units, so all three entry points share one build
sequence. No behavior change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…PipelineReady()

Deduplicates the program get-or-create blocks, makes the render pipeline
release path reusable and splits pipeline readiness from render object
readiness. Pure refactor, no behavior change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…d keep-last

With `new WebGPURenderer( { asyncCompilation: true } )` the renderer never
builds TSL/WGSL or creates GPU pipelines on the render path. New materials
compile in the background and pop in like textures; changed materials keep
drawing their last compiled state until the replacement swaps in atomically
at the start of a frame.

The background candidate is a second RenderObject: it rides the classic
cache/refcount lifecycle end to end and promotion is a chain-map pointer
swap, so no resource-ownership transfer exists anywhere. The new
AsyncCompilation driver slices node builds over idle time with
NodeBuilder.buildStep(), creates pipelines with the asynchronous backend
path and applies promotions only at top-level render safe points.

- structural changes that do not bump the material version (e.g. blend
  mode values) are caught by a backend change detector primed from the
  compiled snapshot
- render lists classify changed materials from their compiled snapshot so
  list bucket, pass membership and pipeline always agree
- a broken build logs once, enters a bounded failure cache and keeps the
  last compiled state until the structural key changes
- compileAsync() routes through the driver in both modes; objects drawn
  for the first time after it resolves render on their first frame
- compilePriority on a 3D object or material biases the compilation order
- renderer.onBackgroundWorkReady notifies on-demand applications

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>