forked from vllm-project/vllm
Pull requests: HabanaAI/vllm-fork
Pull requests list
[WIP][TC][FP8] Enable dynamo to create floating point data-dependent fxgraphs
#1340 opened May 29, 2025 by jczaja
[deepseek_r1] Enable StaticMoE for decoding phase of static activation quant path
#1338 opened May 29, 2025 by yangulei
[V1] Add new block scheduling queue to select blocks with lowest id
#1329 opened May 28, 2025 by jkaniecki
[Gaudi][Intel] Update Dockerfile.hpu for Gaudi 1.20.1
#1316 opened May 26, 2025 by vinayK34
[draft] Optimize RotaryEmbedding for reuse in t.compile with dynamic shapes
#1315 opened May 26, 2025 by anko-intel
Split gate and up projections in LLamaMLP for models with bias
#1310 opened May 23, 2025 by kdamaszk
optimize transfer time. use mooncake put/get_unsafe.
#1297 opened May 22, 2025 by jikunshang
[deepseek_r1] add scripts for benchmark throughput and serving
#1288 opened May 21, 2025 by yangulei
Increase the default value of VLLM_MOE_SLICE_LENGTH to 100k
#1287 opened May 21, 2025 by czhu15