Pull requests: flashinfer-ai/flashinfer
tests: split test_trtllm_gen_attention.py into prefill / decode / decode-xqa shards
op: attention
#3162
opened Apr 23, 2026 by
bkryu
Collaborator
5 tasks
CICD bug fix: ensure data/ symlinks exist before jit-cache AOT compilation
v0.6.10
release blocker label for 0.6.10
#3158
opened Apr 23, 2026 by
kahyunnam
Member
5 tasks done
feat: DiT layer norm fusions for WAN: flashinfer.diffusion_ops
model: wan
op: misc
op: norm
run-ci
#3157
opened Apr 23, 2026 by
kahyunnam
Member
5 tasks done
[fix] fix blackwell gdn accuracy issue
model: qwen3.5
run-ci
#3156
opened Apr 23, 2026 by
Observer007
Contributor
4 of 5 tasks
fix(gdn): use physical SM count for SM100 persistent prefill kernel
run-ci
#3155
opened Apr 23, 2026 by
arpera
5 tasks done
Integrate CUTLASS Small Tile N Blockscaled GEMMs/Grouped GEMMs for SM120 and SM121
op: gemm
#3152
opened Apr 23, 2026 by
depaulmillz
Contributor
5 tasks done
fix: pre-convert scale to CUDA tensor in fused_add_rmsnorm_quant benchmark
#3150
opened Apr 22, 2026 by
knagaitsev
5 tasks done
feat: RMSNorm + RoPE fusion for WAN: flashinfer.diffusion_ops.fused_qk_rmsnorm_rope
model: wan
op: misc
op: norm
run-ci
#3148
opened Apr 22, 2026 by
kahyunnam
Member
5 tasks done
Fix OOB crash in intermediate_states indexing for GDN decode MTP kernel
#3145
opened Apr 22, 2026 by
wenscarl
Collaborator
improve gdn mtp bf16 state perf for BS<=8 with LDG.128
#3143
opened Apr 22, 2026 by
ameynaik-hub
Contributor
Draft
5 tasks
Add TGV NVFP4 GEMM tactic to mm_fp4 cute-dsl backend (SM100/SM103)
op: gemm
#3141
opened Apr 21, 2026 by
Sinestro38
3 tasks
fix(gemm): skip FP4 cuDNN override-shape path on SM120/SM121 (NaN regression from #2910)
op: gemm
#3140
opened Apr 21, 2026 by
Kh4L
feat(moe-a2a): Update nvlink onesided all-to-all
op: comm
#3139
opened Apr 21, 2026 by
trevor-m
Contributor
5 tasks
fix(mla): widen page index to int64_t to avoid 32-bit overflow
op: attention
#3136
opened Apr 21, 2026 by
Tracin
1 of 5 tasks
fix(jit): decouple FLASHINFER_JIT_VERBOSE from debug compilation flags
#3135
opened Apr 21, 2026 by
leonardozcm
feat: Add row_starts and dsa_graph_safe to topk
op: misc
run-ci
#3133
opened Apr 21, 2026 by
zianglih
Contributor
5 tasks done
feat: Enable FP8 (E4M3/E5M2) in concat_mla_k to optimize long-context prefill performance and refactor type dispatch for BF16/FP16
#3129
opened Apr 21, 2026 by
qiching
Collaborator
4 tasks done
optimize gdn decode bf16 state kernel for mtp with caching.
model: qwen3.5
#3127
opened Apr 20, 2026 by
ameynaik-hub
Contributor
5 tasks
docs(pod): flesh out POD-attention wrapper docstrings
#3124
opened Apr 20, 2026 by
Zlatanwic
5 tasks done
