ghost · 2026-06-18T21:13:29Z

Manual co-merge of the stacked ghstack PRs #20339, #20357, and #20358 — all reviewed and accepted, but not picked up by auto-merge. Combining them into one PR so they can be merged directly.

This contains three commits:

- Op-test codegen framework ([ExecuTorch][WebGPU] Op-test codegen framework (cases.py -> generated .pte+golden -> gtest driver) #20339): a declarative cases.py → generated .pte + golden → gtest driver pipeline for WebGPU op tests, mirroring the Vulkan op_tests setup. Each op declares its shapes/configs in cases.py; the generator exports a .pte per case and computes an fp64 torch golden, and the gtest driver compares the on-GPU result against that golden on Dawn.
- Consolidate landed-op tests into the framework ([ExecuTorch][WebGPU] Consolidate landed-op tests into the cases.py op-test framework #20357): migrates the existing add and rms_norm tests off the C++ monolith into the cases.py framework. update_cache/sdpa are kept standalone (they are stateful, so they don't fit the single-forward model).
- Add mul op with full broadcast ([ExecuTorch][WebGPU] Add mul op with full broadcast (aten.mul.Tensor) #20358): aten.mul.Tensor for the WebGPU delegate with full broadcasting, plus the shared TensorMeta broadcast-uniform infrastructure. On the Llama critical path (SwiGLU).
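
For reviewers unfamiliar with golden-based op tests, the comparison step described for the first commit can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: the real check lives in C++ (`driver_util`), and the `atol + rtol * |golden|` formula, the tolerance values, and the helper names `to_fp32` and `allclose_abs_rel` are assumptions made for this example.

```python
import struct
from typing import Sequence


def to_fp32(x: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def allclose_abs_rel(
    actual: Sequence[float],
    golden: Sequence[float],
    atol: float = 1e-5,  # hypothetical tolerance, not taken from the PR
    rtol: float = 1e-4,  # hypothetical tolerance, not taken from the PR
) -> bool:
    """Shape check plus a combined absolute/relative tolerance check."""
    if len(actual) != len(golden):
        return False  # element-count mismatch fails before any value check
    return all(abs(a - g) <= atol + rtol * abs(g) for a, g in zip(actual, golden))


# Golden computed in fp64; the "device" result is simulated with fp32 rounding.
xs = [0.1, 1.5, -2.25]
ys = [3.0, 0.2, 4.0]
golden = [x * y for x, y in zip(xs, ys)]
device = [to_fp32(to_fp32(x) * to_fp32(y)) for x, y in zip(xs, ys)]
print(allclose_abs_rel(device, golden))
```

Computing the reference in fp64 keeps the golden's own rounding error well below the fp32 error being measured, which is presumably why the framework uses an fp64 torch golden.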

… .pte+golden -> gtest driver)

Pull Request resolved: #20339

A manifest-driven op-test framework for the WebGPU backend, mirroring Vulkan's `op_tests/cases.py` (declarative per-op suites) but with a torch-computed golden loaded in C++, since the native test binary has no ATen. An op fits when it is stateless and expressible as one `module(inputs) -> golden` forward; stateful KV-cache ops (`sdpa`, `update_cache`) stay hand-written. Lands first as the shared foundation for the following op test-diffs: adding an op's test becomes one `cases.py` entry.

Composition:
- `test_suite.py`: schema (`WebGPUTestSuite`/`Case`/`InputSpec`) + a `register_op_test` decorator; per-case `required`/`heavy`/`golden_fn`, per-suite `golden_dtype`.
- `cases.py`: the declarative suites; registers the landed `add` + `rms_norm`; later ops append one entry each.
- `generate_op_tests.py`: per case, export via `VulkanPartitioner` to `.pte`, compute the fp64 torch golden (dual-oracle gate), serialize inputs+golden as fp32, emit `manifest.json`.
- `op_test_driver.cpp` + `driver_util.{h,cpp}`: generic gtest driver with one test per manifest entry; runs forward on-device, then abs/rel + shape + reconciliation checks.
- `CMakeLists`/`ci.sh`: `webgpu_op_test` + device-free `webgpu_op_test_util_test`, wired into the Dawn(Tint)+SwiftShader CI.

ghstack-source-id: 394712836
@exported-using-ghexport

Differential Revision: [D108816389](https://our.internmc.facebook.com/intern/diff/D108816389/)
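
To make the "one `cases.py` entry per op" idea concrete, here is a rough, self-contained sketch of how a decorator-based registry and manifest could fit together. Only the names `WebGPUTestSuite`, `Case`, `InputSpec`, `register_op_test`, `required`, `heavy`, `golden_fn`, and `golden_dtype` come from the description above; every other field, the registry dict, `build_manifest`, and the file-naming scheme are hypothetical and may differ from the real `test_suite.py`.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class InputSpec:
    shape: Tuple[int, ...]
    dtype: str = "float32"  # assumed default


@dataclass
class Case:
    name: str
    inputs: List[InputSpec]
    required: bool = True                 # per-case flag named in the PR
    heavy: bool = False                   # per-case flag named in the PR
    golden_fn: Optional[Callable] = None  # per-case golden override


@dataclass
class WebGPUTestSuite:
    op_name: str
    cases: List[Case]
    golden_dtype: str = "float64"  # per-suite golden precision


# Hypothetical registry: op name -> factory returning that op's suite.
_REGISTRY: Dict[str, Callable[[], WebGPUTestSuite]] = {}


def register_op_test(op_name: str):
    """Decorator that records a suite factory under the given op name."""
    def decorator(fn: Callable[[], WebGPUTestSuite]):
        if op_name in _REGISTRY:
            raise ValueError(f"duplicate op test: {op_name}")
        _REGISTRY[op_name] = fn
        return fn
    return decorator


@register_op_test("add")
def add_suite() -> WebGPUTestSuite:
    return WebGPUTestSuite(
        op_name="add",
        cases=[
            Case("same_shape", [InputSpec((2, 3)), InputSpec((2, 3))]),
            Case("large", [InputSpec((1024, 1024))] * 2, required=False, heavy=True),
        ],
    )


def build_manifest() -> List[dict]:
    """Flatten every registered suite into one manifest entry per case."""
    entries = []
    for op_name, factory in sorted(_REGISTRY.items()):
        suite = factory()
        for case in suite.cases:
            stem = f"{op_name}_{case.name}"  # hypothetical naming scheme
            entries.append({
                "op": op_name,
                "case": case.name,
                "pte": f"{stem}.pte",
                "golden": f"{stem}.golden.bin",
                "input_shapes": [list(spec.shape) for spec in case.inputs],
                "required": case.required,
                "heavy": case.heavy,
            })
    return entries


manifest = build_manifest()
print(json.dumps(manifest, indent=2))
```

In this shape, a generic driver only needs to iterate over the manifest entries, so adding an op does not require touching the driver.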

…-test framework

Pull Request resolved: #20357

`add` and `rms_norm` already have declarative suites in the `cases.py` framework, so their standalone tests are redundant. Remove them, leaving the framework as the single home for these stateless single-forward ops (mirroring Vulkan). `update_cache` and `sdpa` stay hand-written: stateful KV-cache replay can't be a single `module -> golden` forward.

Removed:
- `test_single_add`/`test_chained_add` + their `main()` plumbing from `test_webgpu_native.cpp`.
- the standalone `test/native/test_rms_norm.cpp` binary + its CMake target.
- the add/rms_norm export+run wiring in `test_webgpu_native_ci.sh` + `test_build_webgpu.sh`.
- `webgpu_native_test` re-gated on the executorch wheel being importable (was the add `.pte`); it still hosts the quantized_linear/SDPA/update_cache/symint sweeps.

ghstack-source-id: 394625048
@exported-using-ghexport

Differential Revision: [D108821384](https://our.internmc.facebook.com/intern/diff/D108821384/)

Pull Request resolved: #20358

Adds `aten.mul.Tensor` to the WebGPU delegate with full PyTorch broadcast, plus the shared `runtime/ops/TensorMeta.h` per-tensor uniform that broadcast ops reuse. Mul is on the Llama critical path: `F.silu` decomposes to `sigmoid` + `mul`, and SwiGLU multiplies two same-shape activations (the fast path).

Composition (single dispatch):
- `TensorMeta.h` (NEW): 48-byte std140 `{ndim, numel, sizes[4], strides[4]}` UBO mirroring Vulkan's per-tensor `BufferMetadata`; `fill_tensor_meta_broadcast` right-aligns operand dims (rank>4 throws); `static_assert(sizeof==48)`.
- `mul/BinaryOp.cpp`: builds 3 `TensorMeta` UBOs (out/in1/in2 at bindings 3/4/5), guards fp32 + rank≤4, 1D-dispatches over `compute_1d_workgroup_count(numel)`, releases all uniforms after the bind group.
- `mul/binary_mul.wgsl`: same-shape fast path + a broadcast path (delinearize output index, clamp each input coord per-dim to size-1, relinearize on input strides).
- `WebGPUUtils.h`: adds the shared `utils::make_uniform` helper (first use).

ghstack-source-id: 394848336
@exported-using-ghexport

Differential Revision: [D108793167](https://our.internmc.facebook.com/intern/diff/D108793167/)
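
The broadcast path (delinearize, clamp, relinearize) is easier to follow with a CPU reference. The sketch below is a Python model of that index math, not the WGSL shader or the C++ fill routine. Several details are assumptions: strides are taken to be contiguous strides of the right-aligned shape, `ndim` is taken to be the original rank, and the 48-byte layout assumes `sizes` and `strides` are 16-byte-aligned 4-component vectors with 8 bytes of padding after `numel`. All function names here are hypothetical.

```python
import math
import struct
from typing import List, Sequence, Tuple

MAX_RANK = 4  # the PR guards rank <= 4


def right_align(sizes: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Pad a shape with leading 1s to MAX_RANK; return (sizes, contiguous strides)."""
    if len(sizes) > MAX_RANK:
        raise ValueError(f"rank {len(sizes)} > {MAX_RANK} is not supported")
    padded = [1] * (MAX_RANK - len(sizes)) + list(sizes)
    strides = [1] * MAX_RANK
    for d in range(MAX_RANK - 2, -1, -1):
        strides[d] = strides[d + 1] * padded[d + 1]
    return padded, strides


def broadcast_shape(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """PyTorch-style broadcast of two shapes, returned right-aligned to MAX_RANK."""
    out = []
    for x, y in zip(right_align(a)[0], right_align(b)[0]):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"shapes {tuple(a)} and {tuple(b)} do not broadcast")
        out.append(y if x == 1 else x)
    return out


def input_offset(out_index: int, out_sizes, in_sizes, in_strides) -> int:
    """Map a flat output index to a flat input offset."""
    offset = 0
    for d in range(MAX_RANK - 1, -1, -1):
        coord = out_index % out_sizes[d]      # delinearize the output index
        out_index //= out_sizes[d]
        coord = min(coord, in_sizes[d] - 1)   # clamp: size-1 dims always read coord 0
        offset += coord * in_strides[d]       # relinearize on the input strides
    return offset


def broadcast_mul(a, a_shape, b, b_shape):
    """Reference elementwise multiply over flat lists with broadcasting."""
    out_sizes = broadcast_shape(a_shape, b_shape)
    a_sizes, a_strides = right_align(a_shape)
    b_sizes, b_strides = right_align(b_shape)
    return [
        a[input_offset(i, out_sizes, a_sizes, a_strides)]
        * b[input_offset(i, out_sizes, b_sizes, b_strides)]
        for i in range(math.prod(out_sizes))
    ]


def pack_tensor_meta(sizes: Sequence[int]) -> bytes:
    """Pack {ndim, numel, sizes[4], strides[4]} into an assumed 48-byte layout."""
    padded, strides = right_align(sizes)
    # Two u32 fields, 8 pad bytes, then two 16-byte-aligned groups of four u32.
    return struct.pack("<II8x4I4I", len(sizes), math.prod(padded), *padded, *strides)


a = [1, 2, 3, 4, 5, 6]  # shape (2, 3)
print(broadcast_mul(a, (2, 3), [10, 20, 30], (3,)))
print(len(pack_tensor_meta((2, 3))))
```

When both inputs already have the output shape, the clamp never changes a coordinate and every offset equals the output index, which is what makes a separate same-shape fast path worthwhile.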

pytorch-bot · 2026-06-18T21:13:33Z

github-actions · 2026-06-18T21:14:31Z

JCNTH added 3 commits June 18, 2026 09:07

ghost requested review from kirklandsign and larryliu0820 as code owners June 18, 2026 21:13

meta-cla Bot added the CLA Signed label Jun 18, 2026

Merge branch 'main' into webgpu-misc-ops-manual-merge

18f639d

ghost requested a review from SS-JIA June 18, 2026 21:13

ghost temporarily deployed to cadence June 18, 2026 21:13 — with GitHub Actions Inactive

SS-JIA approved these changes Jun 18, 2026

SS-JIA merged commit 0e65ba6 into main Jun 18, 2026
170 of 176 checks passed

SS-JIA deleted the webgpu-misc-ops-manual-merge branch June 18, 2026 21:15

WebGPU op-test framework + mul op manual merge #20389

SS-JIA merged 4 commits into main from webgpu-misc-ops-manual-merge

pytorch-bot Bot commented Jun 18, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20389

github-actions Bot commented Jun 18, 2026

This PR needs a `release notes:` label
