Pull requests: HabanaAI/vllm-fork

Pull requests list

Fix multi-node for HPU
#981 opened Mar 28, 2025 by afierka-intel (Draft)

enable VLLM_MLA_PERFORM_MATRIX_ABSORPTION=0
#980 opened Mar 28, 2025 by yangulei

add torch profiler for the LLM engine
#979 opened Mar 28, 2025 by yangulei

remove slicing in static-quant + dynamic-MoE mode
#978 opened Mar 28, 2025 by yangulei

Deepseek r1 fp8matmul
#977 opened Mar 27, 2025 by xuechendi (Draft)
Fix APC on v0 (label: habana)
#975 opened Mar 27, 2025 by adobrzyn
Enable torchrun on Gaudi
#974 opened Mar 27, 2025 by czhu15

enable fp32 softmax in flat_pa_mla
#972 opened Mar 27, 2025 by yangulei

Optimize rope if head_size == rotary_dim
#968 opened Mar 26, 2025 by kdamaszk

Move torch.compile to HPUModelRunnerBase
#966 opened Mar 26, 2025 by anko-intel

[SW-222977] Fix for test_lora_manager_hpu.py
#965 opened Mar 25, 2025 by rsshaik1

Update linear.py
#964 opened Mar 25, 2025 by michalkuligowski (Draft)

Update layers.py
#957 opened Mar 25, 2025 by michalkuligowski (Draft)

Rebase - 2025.03.24
#947 opened Mar 24, 2025 by kzawora-intel

Update hpu_worker.py
#943 opened Mar 21, 2025 by michalkuligowski

merged_prefill+ - initial cleanup
#942 opened Mar 21, 2025 by madamczykhabana

add ScaleToHwAligned for loading fp8 vllm model
#941 opened Mar 21, 2025 by changwangss