Pull requests: vllm-project/vllm
Pull requests list
[ROCm] [Feat] TokenSpeed MHA integration for GPTOSS
documentation
Improvements or additions to documentation
gpt-oss
Related to GPT-OSS models
rocm
Related to AMD ROCm
v1
[ROCm][Bugfix] Fix HIP fork re-init in multimodal offline examples
bug
Something isn't working
documentation
Improvements or additions to documentation
nvidia
rocm
Related to AMD ROCm
#46741
opened Jun 25, 2026 by
peizhang56
Contributor
3 of 4 tasks
[LoRA] Add language-backbone LoRA support for MiniCPM-V 4.6
#46740
opened Jun 25, 2026 by
linitra24
Contributor
4 tasks
[CI] Depend GPQA Eval DGX Spark job on arm64 image build
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#46736
opened Jun 25, 2026 by
mgoin
Member
[CI] Fix failing CUDA graph capture in Triton MOE
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
#46735
opened Jun 25, 2026 by
fxmarty-amd
Contributor
fix: patch getpass.getuser() for arbitrary-UID containers (OpenShift)
#46734
opened Jun 25, 2026 by
barryguda-1
4 tasks
[Bugfix][Rust Frontend] Reject min_tokens above max_tokens
bug
Something isn't working
rust
#46733
opened Jun 25, 2026 by
reidliu41
Contributor
4 tasks
[Kernel][MoE] Integrate TokenSpeed Mxfp4 MOE Kernel
#46732
opened Jun 25, 2026 by
BadrBasowid
Contributor
•
Draft
4 tasks
[ROCm][Perf][Bugfix] DSv4 indexer: use platform FP8 dtype (fnuz) for Q-quant on gfx942
bug
Something isn't working
rocm
Related to AMD ROCm
#46730
opened Jun 25, 2026 by
akii96
Contributor
[Feat] Support thinking_token_budget in Model Runner V2
v1
#46727
opened Jun 25, 2026 by
chaunceyjiang
Collaborator
4 tasks
[RFC] Runtime Draft Weight Update for Speculative Decoding
documentation
Improvements or additions to documentation
#46725
opened Jun 25, 2026 by
vx120
[Attention] Occupancy-gated 3D segmented decode for multi-query (diffusion-LM canvas) over long KV
v1
#46724
opened Jun 25, 2026 by
moonlghtriver5-svg
[Rust Frontend] Avoid redundant decode per token in incremental detokenizer
rust
#46723
opened Jun 25, 2026 by
blasrodri
[Rust Frontend] Extract renderer fixture test utilities
ready
ONLY add when PR is ready to merge/full CI is needed
rust
#46719
opened Jun 25, 2026 by
BugenZhao
Member
[Feat] Add runtime monitor for post-warmup TileLang compilation
#46718
opened Jun 25, 2026 by
LopezCastroRoberto
Contributor
•
Draft
tests(v1): add enforce_eager=True to test_spec_decode_logprobs
v1
#46717
opened Jun 25, 2026 by
Sriniketh24
[CPU] Fix shared-memory all-reduce deadlock across nodes
cpu
Related to CPU backends
#46716
opened Jun 25, 2026 by
maci0
[torch.compile] fix misleading docstrings in non-impacting config compute_hash (#39479 item 7)
#46714
opened Jun 25, 2026 by
renjingxiao
[FS-Offloading] Batch Lookup in C
ci/build
v1
#46713
opened Jun 25, 2026 by
varun-sundar-rabindranath
Contributor
[CI] Add registry layer cache to x86 CUDA release image builds
ci/build
nvidia
#46711
opened Jun 25, 2026 by
khluu
Member
[Rust Frontend] Add longcat tool parser support
rust
#46709
opened Jun 25, 2026 by
yangyang-cs95
3 tasks done
[Tool Parser] Fix make_valid_python backslash-escape edge case
tool-calling
#46708
opened Jun 25, 2026 by
sungbin1015
