Pull requests: HabanaAI/vllm-fork (forked from vllm-project/vllm)

[PoC] Add max padding ratio to padding aware scheduler
#407 (Draft) · habana (Issues or PRs submitted by Habana Labs) · opened Oct 18, 2024 by kzawora-intel

Add models-tiny CI step with Llama3.2-1B
#440 (Draft) · habana (Issues or PRs submitted by Habana Labs) · opened Oct 28, 2024 by kzawora-intel

Add in Dockerfile.hpu.ubi
#602 · external (Issues or PRs submitted by external users) · opened Dec 9, 2024 by Xaenalt

[WIP] Add HPU support to vLLM v1 - cont.
#609 · stale · opened Dec 10, 2024 by kzawora-intel · 21 of 23 tasks

[DO NOT MERGE][PoC] Mark dynamic shapes in torch.compile mode
#755 (Draft) · opened Jan 29, 2025 by kzawora-intel

[DEEPSEEK_V3/R1] includes features of fp8 dequant, MLA, Expert parallelism
#792 · opened Feb 6, 2025 by xuechendi

Update documentation to reflect current bucket defaults
#817 · opened Feb 12, 2025 by nngokhale

enable multi-modal embedding for TIGER-Lab/VLM2Vec-Full T+I on HPU
#854 · opened Feb 20, 2025 by libinta

Update requirements-hpu.txt for open telemetry tracing support
#857 · opened Feb 21, 2025 by louie-tsai

Synchronize vLLM flags to support cross-node inference
#897 · opened Mar 7, 2025 by IT-Forrest

ProTip! To list everything that has not been updated in the past month, filter with: updated:<2025-03-01
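
The same qualifier works programmatically. Below is a minimal sketch, assuming the public GitHub REST search endpoint (api.github.com/search/issues) and the third-party requests library; the query string itself is the same filter as in the ProTip, scoped to this repository's open pull requests:

import requests

# Search this fork's open PRs that have not been updated since 2025-03-01,
# using the same "updated:<2025-03-01" qualifier as the ProTip above.
query = "repo:HabanaAI/vllm-fork is:pr is:open updated:<2025-03-01"
resp = requests.get(
    "https://api.github.com/search/issues",
    params={"q": query},
    headers={"Accept": "application/vnd.github+json"},
    timeout=30,
)
resp.raise_for_status()

# Print each stale PR as "#number title", mirroring the listing above.
for item in resp.json()["items"]:
    print(f"#{item['number']} {item['title']}")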