[ExecuTorch][WebGPU] Dynamic-shape integration test (allocate-at-max + per-op resize) #20582
Open
JulianCloudNTH wants to merge 9 commits into
@claude review and check for any areas or opportunities for modularization
SS-JIA requested changes on Jul 2, 2026:
Review automatically exported from Phabricator review in Meta.
psiddh approved these changes on Jul 4, 2026.

Stack from ghstack (oldest at bottom):
End-to-end validation that one graph built at the upper-bound seq-len serves every smaller live shape, matching the torch golden.
Problem: the dynamic-resize engine (allocate-at-max buffers + per-op resize hooks + output resize) had unit-level reasoning but no single oracle proving a graph built at S=MAX runs correctly at S<MAX without reallocating buffers (which would invalidate bind groups).
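The invariant described in the problem statement can be illustrated with a minimal sketch. This is not the ExecuTorch WebGPU implementation: `MaxAllocatedTensor`, its fields, and the 4-byte element size are hypothetical names and assumptions chosen only to show the idea that a resize changes the logical shape while the backing storage allocated at the maximum shape is never replaced.

```python
from math import prod


class MaxAllocatedTensor:
    """Hypothetical stand-in for a GPU tensor allocated once at its upper-bound shape."""

    def __init__(self, max_shape):
        self.max_shape = tuple(max_shape)
        # Backing storage is sized for the largest shape (4 bytes per element
        # is an assumption) and is never replaced afterwards.
        self.storage = bytearray(4 * prod(self.max_shape))
        self.live_shape = self.max_shape

    def resize(self, new_shape):
        """Change only the logical shape; reject shapes that would need a bigger buffer."""
        new_shape = tuple(new_shape)
        if len(new_shape) != len(self.max_shape) or any(
            n > m for n, m in zip(new_shape, self.max_shape)
        ):
            raise ValueError(f"{new_shape} does not fit within max {self.max_shape}")
        self.live_shape = new_shape

    def live_numel(self):
        """Element count a shader dispatch would need to cover at the live shape."""
        return prod(self.live_shape)


t = MaxAllocatedTensor((1, 128, 64))
buffer_id = id(t.storage)
for s in (1, 7, 128, 32):  # grow and shrink in arbitrary order
    t.resize((1, s, 64))
    assert id(t.storage) == buffer_id  # same buffer, so a bind group would stay valid
    assert t.live_numel() == s * 64
```

Because the storage object never changes identity, anything that captured a reference to it (a bind group, in the WebGPU case) remains valid across every resize; only the element count used to size the dispatch varies.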
Solution: a native test that builds each toy model at S=MAX and runs it at several live S, asserting the output matches a torch-computed golden and that the output EValue is resized to the live shape.
- `rms_norm` (resize shrinks the dispatch; one reused graph across S proves buffers never move; static path unchanged).
- `rms(rms(x))` cascade, `rms(x)+x` (rms->add cascade), `rms(x)*x` (mul).
- `linear_q4gsw` (GEMM at several M), `sdpa_with_kv_cache` (GQA prefill at several S), `embedding_q4gsw` (int64 ids), `apply_rotary_emb` (two outputs).
- `sigmoid` (elementwise) and `select_copy(0, -1)` (negative index resolved against the live leading dim each call).
- A subset of these cases (`rms_norm` incl. a grow-first smallest→largest order, the `rms(rms(x))` cascade, `linear_q4gsw`, `embedding_q4gsw`, `apply_rotary_emb`, `sigmoid`, `select_copy`) also runs ONE loaded graph across multiple live shapes, proving buffers never move so bind groups stay valid across every resize.

Implementation:
- `test/ops/dynamic_shape/test_dynamic_shape_export.py` exports each toy model through `VulkanPartitioner` with a dynamic dim and writes per-S torch goldens; reuses the existing op-test helpers for quant/sdpa/embedding/rope.
- `test/native/test_dynamic_shape.cpp` loads each `.pte`, runs each live S, and compares at the per-op tolerance (rms 1e-3, quant 5e-3, sdpa 2e-3). Reuse tests split each per-op helper into load-once + run-at-shape so a single `Module` serves the whole shape sweep.

Constraints: numerics computed with torch (no hand-rolled reference); toy models stay within the 65535 1D-dispatch cap; SDPA case is skipped gracefully if `sym_size.int`/`copy_` op coverage is incomplete (does not fail the suite).

Co-authored-with: Claude Code.
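For readers unfamiliar with the test shape, the sweep described above can be sketched roughly as follows. Everything here is illustrative: `sweep`, `select_row`, `resolve_index`, `workgroups_1d`, and the workgroup size of 64 are hypothetical, and the goldens are computed in closed form only to keep the sketch dependency-free (the actual tests use torch-computed goldens and a loaded `.pte`). The 65535 value is the cap named in the constraints, which presumably corresponds to WebGPU's default `maxComputeWorkgroupsPerDimension` limit.

```python
MAX_WORKGROUPS_1D = 65535  # cap named in the PR description
WORKGROUP_SIZE = 64        # assumed workgroup size, for illustration only


def workgroups_1d(numel, workgroup_size=WORKGROUP_SIZE):
    """Workgroups needed for a 1D dispatch over numel elements (ceiling division)."""
    return (numel + workgroup_size - 1) // workgroup_size


def resolve_index(index, live_dim):
    """Resolve a possibly negative index against the live (not max) dimension."""
    resolved = index + live_dim if index < 0 else index
    if not 0 <= resolved < live_dim:
        raise IndexError(f"index {index} out of range for dim of size {live_dim}")
    return resolved


def select_row(rows, index):
    """Toy select_copy(0, index): pick one row using the live leading dim."""
    return list(rows[resolve_index(index, len(rows))])


def sweep(run_at_shape, goldens, tol):
    """Run one loaded callable at each live S and compare against its golden."""
    for s, golden in goldens.items():
        out = run_at_shape(s)
        assert len(out) == len(golden), f"S={s}: output not resized to live shape"
        worst = max(abs(a - b) for a, b in zip(out, golden))
        assert worst <= tol, f"S={s}: max abs diff {worst} > {tol}"


S_MAX, WIDTH = 16, 4
# Data laid out once at the upper bound; rows beyond the live S are stale padding.
data = [[float(r * WIDTH + c) for c in range(WIDTH)] for r in range(S_MAX)]
assert workgroups_1d(S_MAX * WIDTH) <= MAX_WORKGROUPS_1D

# Goldens computed independently of select_row: the last live row is row S-1.
goldens = {s: [float((s - 1) * WIDTH + c) for c in range(WIDTH)] for s in (1, 5, 16)}
sweep(lambda s: select_row(data[:s], -1), goldens, tol=1e-3)
```

The `-1` index is resolved inside each call, so the same callable returns row 0 at S=1 and row 15 at S=16; resolving it once against `S_MAX` would read stale padding at every smaller S.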
@exported-using-ghexport
Differential Revision: D109906090