HabanaAI / vllm-fork Public

forked from vllm-project/vllm

Fork 100
Star 75

Pull requests: HabanaAI/vllm-fork

Labels 17 Milestones 0

83 Open 1,178 Closed

Pull requests list

consolidate pd/dp scripts

#1346 opened May 31, 2025 by hlin99

by default disable contiguous_pa on Gaudi2.

#1345 opened May 30, 2025 by ccrhx4

[WIP] Enable interleaved sliding_window for gemma3

#1344 opened May 30, 2025 by jiminha • Draft

[draft] merged_prefill for V1

#1342 opened May 29, 2025 by madamczyk-intel • Draft

[WIP][TC][FP8] Enable dynamo to create floating point data-dependent fxgraphs

#1340 opened May 29, 2025 by jczaja

Revise DeepSeek-R1 README and update start scripts

#1339 opened May 29, 2025 by taotod

[deepseek_r1] Enable StaticMoE for decoding phase of static activation quant path

#1338 opened May 29, 2025 by yangulei

fix requirements/hpu.txt for hpu extension

#1336 opened May 29, 2025 by ranzhejiang

Fix prefill warm up issue

#1335 opened May 29, 2025 by yeonsily • Draft

Upgrade to HPU docker 1.21.0 and update run_cluster.sh

#1331 opened May 28, 2025 by tvoas

[V1] Add new block scheduling queue to select blocks with lowest id

#1329 opened May 28, 2025 by jkaniecki

Fix vllm crash when running with lm-eval

#1321 opened May 27, 2025 by ccrhx4

Add Flag to speed up Qwen3 fp8 warmup issue

#1319 opened May 27, 2025 by Yanli2190

[Gaudi][Intel] Update Dockerfile.hpu for Gaudi 1.20.1

#1316 opened May 26, 2025 by vinayK34

[draft] Optimize RotaryEmbedding for reuse in t.compile with dynamic shapes

#1315 opened May 26, 2025 by anko-intel

[Torch compile] Torch compilation on Sampler

#1314 opened May 26, 2025 by jczaja

Split gate and up projections in LLamaMLP for models with bias

#1310 opened May 23, 2025 by kdamaszk

Enabled MoE for both BF16 and INC based FP8.

#1309 opened May 23, 2025 by gyou2021

parallel compile for fast warm up

#1304 opened May 22, 2025 by inkcherry

optimize transfer time. use mooncake put/get_unsafe.

#1297 opened May 22, 2025 by jikunshang

Qwen2.5 Omni

#1296 opened May 22, 2025 by wenbinc-Bin

[PoC][CI] Introduce test templates

#1293 opened May 21, 2025 by kzawora-intel • Draft

Add torch.compile tests into test_config.yaml

#1289 opened May 21, 2025 by kzawora-intel

[deepseek_r1] add scripts for benchmark throughput and serving

#1288 opened May 21, 2025 by yangulei

Increase the default value of VLLM_MOE_SLICE_LENGTH to 100k

#1287 opened May 21, 2025 by czhu15
