Pull requests: vllm-project/vllm
Pull requests list
[CI] Run pre-commit job on self-hosted runners
ci/build
#45865
opened Jun 16, 2026 by
khluu
Member
2 tasks
Add extensible engine core outputs for OOT backends
v1
#45864
opened Jun 16, 2026 by
maxdebayser
Contributor
[DSv4 Perf] DSv4 flashinfer sparse index cache for metadata, 2%~4% TTFT improvement
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#45863
opened Jun 16, 2026 by
yewentao256
Member
[Bugfix] DeepSeek V4: recover tool calls within reasoning content
bug
Something isn't working
deepseek
Related to DeepSeek models
tool-calling
#45862
opened Jun 16, 2026 by
procr1337
3 of 4 tasks
[Bugfix] Fix hetero TP handshake assertion for GQA-replicated KV heads
bug
Something isn't working
kv-connector
v1
#45860
opened Jun 16, 2026 by
kannakAWS
3 of 4 tasks
[Log] Update deepgemm log
ready
ONLY add when PR is ready to merge/full CI is needed
#45857
opened Jun 16, 2026 by
yewentao256
Member
[Kernel] Manual TP fusion via ResidualStream (AR+RMSNorm+quant)
llama
Related to Llama models
mistral
Related to Mistral models
qwen
Related to Qwen models
speculative-decoding
[ROCm][Quant] Minimax-M3: Enable fp8_per_channel for bf16 weights on mi300x
rocm
Related to AMD ROCm
#45854
opened Jun 16, 2026 by
hongxiayang
Collaborator
4 tasks
[Bugfix][Gemma4] Pre-initialise streaming reasoning state when prompt ends inside an open <|channel> (fixes #45834)
bug
Something isn't working
tool-calling
#45852
opened Jun 16, 2026 by
nikhilesh-csa
3 of 4 tasks
[KV Offload] Use background thread for mmap / cpu_tensors pinning
v1
#45850
opened Jun 16, 2026 by
Acaciasama
3 of 4 tasks
[BUG] fix hidden states nan for hybrid attention models
bug
Something isn't working
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#45849
opened Jun 16, 2026 by
shanjiaz
Contributor
3 of 4 tasks
[rust] EngineCoreSamplingParams: add serde defaults for omit_defaults fields
rust
#45848
opened Jun 16, 2026 by
wseaton
Contributor
[v1][kvcache] Honor prefix-cache retention interval for Mamba/linear attention
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#45845
opened Jun 16, 2026 by
Dao007forever
Contributor
[Bugfix] Fix CPU split-KV scratchpad sizing
bug
Something isn't working
cpu
Related to CPU backends
#45844
opened Jun 16, 2026 by
gausah01
Contributor
[CI][NIXL] Pin NIXL to 1.2.0
ci/build
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
#45843
opened Jun 16, 2026 by
itayalroy
Contributor
Bump the minor-update group across 1 directory with 150 updates
ci/build
dependencies
Pull requests that update a dependency file
nvidia
rocm
Related to AMD ROCm
#45842
opened Jun 16, 2026 by
dependabot
Bot
add attention sinks to flex attention
documentation
Improvements or additions to documentation
v1
#45841
opened Jun 16, 2026 by
liangel-02
Contributor
[Perf] Skip/shrink all_token_ids copy in scheduler for non-async and V2 runner
v1
#45840
opened Jun 16, 2026 by
amanchugh89
[Frontend] Support additional sampling parameters for translation API
frontend
#45839
opened Jun 16, 2026 by
guan404ming
Contributor
3 of 4 tasks
[NVFP4 MoE/DSV4] Marlin: wire SwiGLU clamp + allow it for clamped models on non-Blackwell
#45836
opened Jun 16, 2026 by
mikekg
Contributor
[RFC][Core][Model] Voxtral realtime: unbounded-duration streaming via RoPE re-anchoring (experimental, default-off)
documentation
Improvements or additions to documentation
mistral
Related to Mistral models
multi-modality
Related to multi-modality (#4194)
performance
Performance-related issues
v1
#45833
opened Jun 16, 2026 by
damienlaine
Draft
[Bugfix][Gemma4] Fix parsing when thinking is disabled
bug
Something isn't working
ready
ONLY add when PR is ready to merge/full CI is needed
tool-calling
#45832
opened Jun 16, 2026 by
m4r1k
Contributor