Pull requests · vllm-project/vllm · GitHub

Pull requests: vllm-project/vllm

Pull requests list

[ROCm] [Feat] TokenSpeed MHA integration for GPTOSS
Labels: documentation, gpt-oss, rocm, v1
#46742 opened Jun 25, 2026 by tjtanaa (Member) · Draft
4 tasks

[ROCm][Bugfix] Fix HIP fork re-init in multimodal offline examples
Labels: bug, documentation, nvidia, rocm
#46741 opened Jun 25, 2026 by peizhang56 (Contributor)
3 of 4 tasks

[LoRA] Add language-backbone LoRA support for MiniCPM-V 4.6
#46740 opened Jun 25, 2026 by linitra24 (Contributor)
4 tasks

[CPU][BugFix] Multiple fixes to w4a8_int8 CPU MoE path
Labels: bug, ci/build, cpu
#46739 opened Jun 25, 2026 by fadara01 (Contributor)
1 task

[CI] Depend GPQA Eval DGX Spark job on arm64 image build
Labels: ci/build, ready
#46736 opened Jun 25, 2026 by mgoin (Member)

[CI] Fix failing CUDA graph capture in Triton MOE
Labels: nvidia, ready
#46735 opened Jun 25, 2026 by fxmarty-amd (Contributor)

[Bugfix][Rust Frontend] Reject min_tokens above max_tokens
Labels: bug, rust
#46733 opened Jun 25, 2026 by reidliu41 (Contributor)
4 tasks

[Kernel][MoE] Integrate TokenSpeed Mxfp4 MOE Kernel
#46732 opened Jun 25, 2026 by BadrBasowid (Contributor) · Draft
4 tasks

[ROCm][Perf][Bugfix] DSv4 indexer: use platform FP8 dtype (fnuz) for Q-quant on gfx942
Labels: bug, rocm
#46730 opened Jun 25, 2026 by akii96 (Contributor)

[Bugfix] Avoid MLA decode shared memory OOR
Labels: bug, v1
#46728 opened Jun 25, 2026 by pxljs

[Feat] Support thinking_token_budget in Model Runner V2
Labels: v1
#46727 opened Jun 25, 2026 by chaunceyjiang (Collaborator)
4 tasks

[RFC] Runtime Draft Weight Update for Speculative Decoding
Labels: documentation
#46725 opened Jun 25, 2026 by vx120

[ROCm][DSV4] B-preshuffle the attention fp8 projections
Labels: rocm
#46720 opened Jun 25, 2026 by cagrikymk · Draft

[Rust Frontend] Extract renderer fixture test utilities
Labels: ready, rust
#46719 opened Jun 25, 2026 by BugenZhao (Member)

[CPU] Fix shared-memory all-reduce deadlock across nodes
Labels: cpu
#46716 opened Jun 25, 2026 by maci0

[FS-Offloading] Batch Lookup in C
Labels: ci/build, v1
#46713 opened Jun 25, 2026 by varun-sundar-rabindranath (Contributor)

[Rust Frontend] Add longcat tool parser support
Labels: rust
#46709 opened Jun 25, 2026 by yangyang-cs95
3 tasks done