Commit
Merge pull request #160 from runpod-workers/up-0.7.2
update vllm
pandyamarut authored Feb 11, 2025
2 parents 6fc7704 + c9791f1 commit d7e9c49
Showing 3 changed files with 5 additions and 5 deletions.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -12,7 +12,7 @@ RUN --mount=type=cache,target=/root/.cache/pip \
 python3 -m pip install --upgrade -r /requirements.txt

 # Install vLLM (switching back to pip installs since issues that required building fork are fixed and space optimization is not as important since caching) and FlashInfer
-RUN python3 -m pip install vllm==0.7.0 && \
+RUN python3 -m pip install vllm==0.7.2 && \
 python3 -m pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3

 # Setup for Option 2: Building the Image with the Model included
6 changes: 3 additions & 3 deletions README.md
@@ -18,9 +18,9 @@ Deploy OpenAI-Compatible Blazing-Fast LLM Endpoints powered by the [vLLM](https:
 ### 1. UI for Deploying vLLM Worker on RunPod console:
 ![Demo of Deploying vLLM Worker on RunPod console with new UI](media/ui_demo.gif)

-### 2. Worker vLLM `v1.9.0` with vLLM `0.7.0` now available under `stable` tags
+### 2. Worker vLLM `v2.0.0` with vLLM `0.7.2` now available under `stable` tags

-Update v1.9.0 is now available, use the image tag `runpod/worker-v1-vllm:v1.9.0stable-cuda12.1.0`.
+Update v2.0.0 is now available, use the image tag `runpod/worker-v1-vllm:v2.0.0stable-cuda12.1.0`.

 ### 3. OpenAI-Compatible [Embedding Worker](https://github.com/runpod-workers/worker-infinity-embedding) Released
 Deploy your own OpenAI-compatible Serverless Endpoint on RunPod with multiple embedding models and fast inference for RAG and more!
@@ -82,7 +82,7 @@ Below is a summary of the available RunPod Worker images, categorized by image s

 | CUDA Version | Stable Image Tag | Development Image Tag | Note |
 |--------------|-----------------------------------|-----------------------------------|----------------------------------------------------------------------|
-| 12.1.0 | `runpod/worker-v1-vllm:v1.9.0stable-cuda12.1.0` | `runpod/worker-v1-vllm:v1.9.0dev-cuda12.1.0` | When creating an Endpoint, select CUDA Version 12.3, 12.2 and 12.1 in the filter. |
+| 12.1.0 | `runpod/worker-v1-vllm:v2.0.0stable-cuda12.1.0` | `runpod/worker-v1-vllm:v2.0.0dev-cuda12.1.0` | When creating an Endpoint, select CUDA Version 12.3, 12.2 and 12.1 in the filter. |

2 changes: 1 addition & 1 deletion docker-bake.hcl
@@ -7,7 +7,7 @@ variable "REPOSITORY" {
 }

 variable "BASE_IMAGE_VERSION" {
-default = "stable"
+default = "v2.0.0stable"
 }

 group "all" {
