Pull requests: vllm-project/vllm

Pull requests list

[CI] Run pre-commit job on self-hosted runners
Labels: ci/build
#45865 opened Jun 16, 2026 by khluu Member
2 tasks
Add extensible engine core outputs for OOT backends
Labels: v1
#45864 opened Jun 16, 2026 by maxdebayser Contributor
[DSv4 Perf] DSv4 flashinfer sparse index cache for metadata, 2%~4% TTFT improvement
Labels: nvidia, ready, v1
#45863 opened Jun 16, 2026 by yewentao256 Member
[Bugfix] DeepSeek V4: recover tool calls within reasoning content
Labels: bug, deepseek, tool-calling
#45862 opened Jun 16, 2026 by procr1337
3 of 4 tasks
[Bugfix] Fix hetero TP handshake assertion for GQA-replicated KV heads
Labels: bug, kv-connector, v1
#45860 opened Jun 16, 2026 by kannakAWS
3 of 4 tasks
[Log] Update deepgemm log
Labels: ready
#45857 opened Jun 16, 2026 by yewentao256 Member
[Kernel] Manual TP fusion via ResidualStream (AR+RMSNorm+quant)
Labels: llama, mistral, qwen, speculative-decoding
#45855 opened Jun 16, 2026 by mgoin Member Draft
[ROCm][Quant] Minimax-M3: Enable fp8_per_channel for bf16 weights on mi300x
Labels: rocm
#45854 opened Jun 16, 2026 by hongxiayang Collaborator
4 tasks
[KV Offload] Use background thread for mmap / cpu_tensors pinning
Labels: v1
#45850 opened Jun 16, 2026 by Acaciasama
3 of 4 tasks
[BUG] fix hidden states nan for hybrid attention models
Labels: bug, kv-connector, ready, v1
#45849 opened Jun 16, 2026 by shanjiaz Contributor
3 of 4 tasks
Skip stop string matching inside <think> blocks
Labels: frontend, v1
#45846 opened Jun 16, 2026 by elvircrn Contributor Draft
4 tasks
[v1][kvcache] Honor prefix-cache retention interval for Mamba/linear attention
Labels: ready, v1
#45845 opened Jun 16, 2026 by Dao007forever Contributor
[Bugfix] Fix CPU split-KV scratchpad sizing
Labels: bug, cpu
#45844 opened Jun 16, 2026 by gausah01 Contributor
[CI][NIXL] Pin NIXL to 1.2.0
Labels: ci/build, kv-connector, ready
#45843 opened Jun 16, 2026 by itayalroy Contributor
Bump the minor-update group across 1 directory with 150 updates
Labels: ci/build, dependencies, nvidia, rocm
#45842 opened Jun 16, 2026 by dependabot Bot
add attention sinks to flex attention
Labels: documentation, v1
#45841 opened Jun 16, 2026 by liangel-02 Contributor
[Frontend] Support additional sampling parameters for translation API
Labels: frontend
#45839 opened Jun 16, 2026 by guan404ming Contributor
3 of 4 tasks
Minimax m3 gfx950 mxfp4
#45838 opened Jun 16, 2026 by dllehr-amd Collaborator Draft
4 tasks
[Bugfix] Restore unloaded FP8 scale params during layerwise reload
Labels: bug
#45835 opened Jun 16, 2026 by aoshen02 Collaborator Draft
3 tasks
[RFC][Core][Model] Voxtral realtime: unbounded-duration streaming via RoPE re-anchoring (experimental, default-off)
Labels: documentation, mistral, multi-modality, performance, v1
#45833 opened Jun 16, 2026 by damienlaine Draft
[Bugfix][Gemma4] Fix parsing when thinking is disabled
Labels: bug, ready, tool-calling
#45832 opened Jun 16, 2026 by m4r1k Contributor