Pull requests: HabanaAI/vllm-fork


#407 [PoC] Add max padding ratio to padding aware scheduler (Draft) · label: habana (Issues or PRs submitted by Habana Labs) · opened Oct 18, 2024 by kzawora-intel
#440 Add models-tiny CI step with Llama3.2-1B (Draft) · label: habana · opened Oct 28, 2024 by kzawora-intel
#487 [WIP] Add HPU support to vLLM v1 (Draft) · opened Nov 12, 2024 by kzawora-intel · 19 of 23 tasks
#602 Add in Dockerfile.hpu.ubi · label: external (Issues or PRs submitted by external users) · opened Dec 9, 2024 by Xaenalt
#609 [WIP] Add HPU support to vLLM v1 - cont. · label: stale · opened Dec 10, 2024 by kzawora-intel · 21 of 23 tasks
#642 Add exponential bucketing integration · opened Dec 17, 2024 by kzawora-intel
#786 Enable roberta embedding · opened Feb 5, 2025 by yeonsily
#823 Resolve Speculative Decode RTE · opened Feb 13, 2025 by tannervoas742
#866 [CI] Add APC tests · opened Feb 25, 2025 by kzawora-intel
#906 Fix spec decoding warmup · opened Mar 11, 2025 by yangw1234