Commit b7d2fb8

HarshaRamayanam and tylertitsworth authored

Adds Huggingface GenAI container build specs into pytorch/Dockerfile and pytorch/docker-compose.yaml (#146)

Signed-off-by: tylertitsworth <tyler.titsworth@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Co-authored-by: tylertitsworth <tyler.titsworth@intel.com>
Co-authored-by: Tyler Titsworth <titswortht@gmail.com>

1 parent 3ecd2cd commit b7d2fb8

File tree: 5 files changed, +61 −0 lines changed

pytorch/Dockerfile (+7)

```diff
@@ -135,6 +135,13 @@
 
 ENTRYPOINT ["/usr/local/bin/dockerd-entrypoint.sh"]
 
+FROM multinode AS hf-genai
+
+COPY hf-genai-requirements.txt .
+
+RUN python -m pip install --no-cache-dir -r hf-genai-requirements.txt && \
+    rm -rf hf-genai-requirements.txt
+
 FROM ${PYTHON_BASE} AS ipex-xpu-base
 
 RUN apt-get update && \
```

pytorch/README.md (+24)

````diff
@@ -239,6 +239,27 @@ Additionally, if you have a [DeepSpeed* configuration](https://www.deepspeed.ai/
 
 ---
 
+#### Hugging Face Generative AI Container
+
+The image below is an extension of the IPEX Multi-Node Container designed to run Hugging Face Generative AI scripts. The container has the typical installations needed to run and fine-tune PyTorch generative text models from Hugging Face. It can be used to run multi-node jobs using the same instructions from the [IPEX Multi-Node container](#setup-and-run-ipex-multi-node-container).
+
+| Tag(s)                                | Pytorch | IPEX         | oneCCL               | transformers | Dockerfile    |
+| ------------------------------------- | ------- | ------------ | -------------------- | ------------ | ------------- |
+| `2.3.0-pip-multinode-hf-4.41.2-genai` | [v2.3.1](https://github.com/pytorch/pytorch/releases/tag/v2.3.1) | [v2.3.0+cpu] | [v2.3.0][ccl-v2.3.0] | [v4.41.2]    | [v0.4.0-Beta] |
+
+Below is an example that shows a single-node job with the existing [`finetune.py`](../workflows/charts/huggingface-llm/scripts/finetune.py) script.
+
+```bash
+# Change into the home directory first, then run the command
+docker run -it \
+    -v $PWD/workflows/charts/huggingface-llm/scripts:/workspace/scripts \
+    -w /workspace/scripts \
+    intel/intel-extension-for-pytorch:2.3.0-pip-multinode-hf-4.41.2-genai \
+    bash -c 'python finetune.py <script-args>'
+```
+
+---
+
 The images below are [TorchServe*] with CPU Optimizations:
 
 | Tag(s) | Pytorch | IPEX | Dockerfile |
@@ -373,6 +394,9 @@ It is the image user's responsibility to ensure that any use of The images below
 [ccl-v2.1.0]: https://github.com/intel/torch-ccl/releases/tag/v2.1.0%2Bcpu
 [ccl-v2.0.0]: https://github.com/intel/torch-ccl/releases/tag/v2.1.0%2Bcpu
 
+<!-- HuggingFace transformers releases -->
+[v4.41.2]: https://github.com/huggingface/transformers/releases/tag/v4.41.2
+
 [803]: https://dgpu-docs.intel.com/releases/LTS_803.29_20240131.html
 [736]: https://dgpu-docs.intel.com/releases/stable_736_25_20231031.html
 [647]: https://dgpu-docs.intel.com/releases/stable_647_21_20230714.html
````

pytorch/docker-compose.yaml (+12)

```diff
@@ -189,3 +189,15 @@ services:
       - 8082:8082
       - 7070:7070
       - 7071:7071
+  hf-genai:
+    build:
+      args:
+        HF_VERSION: ${HF_VERSION:-4.41.2}
+      labels:
+        dependency.python.pip: hf-genai-requirements.txt
+        org.opencontainers.base.name: "intel/intel-optimized-pytorch:${IPEX_VERSION:-2.3.0}-${PACKAGE_OPTION:-pip}-multinode"
+        org.opencontainers.image.title: "Intel® Extension for PyTorch MultiNode Huggingface Generative AI Image"
+        org.opencontainers.image.version: "${IPEX_VERSION:-2.3.0}-${PACKAGE_OPTION:-pip}-multinode-hf-${HF_VERSION:-4.41.2}-genai"
+      target: hf-genai
+    extends: multinode
+    image: ${REGISTRY}/${REPO}:b-${GITHUB_RUN_NUMBER:-0}-${BASE_IMAGE_NAME:-ubuntu}-${BASE_IMAGE_TAG:-22.04}-${PACKAGE_OPTION:-pip}-py${PYTHON_VERSION:-3.10}-ipex-${IPEX_VERSION:-2.3.0}-hf-${HF_VERSION:-4.41.2}
```

(Note: the `org.opencontainers.image.version` value had an unbalanced trailing quote; it is quoted consistently here.)

pytorch/hf-genai-requirements.txt (+15, new file)

```diff
@@ -0,0 +1,15 @@
+accelerate==0.28.0
+datasets==2.19.0
+einops==0.7.0
+evaluate==0.4.1
+nltk==3.8.1
+onnxruntime-extensions==0.10.1
+onnxruntime==1.17.3
+peft==0.10.0
+protobuf==4.24.4
+py-cpuinfo==9.0.0
+rouge_score==0.1.2
+scikit-learn==1.5.0
+SentencePiece==0.2.0
+tokenizers==0.19.1
+transformers==4.41.2
```
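Every dependency above is pinned with `==`, which makes the file easy to audit mechanically. A small hypothetical checker that parses such pins (it assumes the simple `name==version` form used in this file, with no extras or environment markers):

```python
import re

# Matches only strict "name==version" pins, as used in hf-genai-requirements.txt.
PIN = re.compile(r"^([A-Za-z0-9._-]+)==([A-Za-z0-9.]+)$")

def parse_pins(text: str) -> dict[str, str]:
    """Map package name -> pinned version; reject any unpinned line."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        m = PIN.match(line)
        if m is None:
            raise ValueError(f"unpinned or malformed requirement: {line!r}")
        pins[m.group(1)] = m.group(2)
    return pins

pins = parse_pins("transformers==4.41.2\ntokenizers==0.19.1\npeft==0.10.0\n")
print(pins["transformers"])  # 4.41.2
```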

pytorch/tests/tests.yaml (+3)

```diff
@@ -27,6 +27,9 @@ import-xpu-jupyter-${PACKAGE_OPTION:-pip}:
 import-cpu-oneccl-${PACKAGE_OPTION:-pip}:
   img: ${REGISTRY}/${REPO}:b-${GITHUB_RUN_NUMBER:-0}-${BASE_IMAGE_NAME:-ubuntu}-${BASE_IMAGE_TAG:-22.04}-${PACKAGE_OPTION:-pip}-py${PYTHON_VERSION:-3.10}-ipex-${IPEX_VERSION:-2.3.0}-oneccl-inc-${INC_VERSION:-2.6}
   cmd: python -c "'import oneccl_bindings_for_pytorch as oneccl;print(oneccl.__version__)'"
+import-cpu-transformers-${PACKAGE_OPTION:-pip}:
+  img: ${REGISTRY}/${REPO}:b-${GITHUB_RUN_NUMBER:-0}-${BASE_IMAGE_NAME:-ubuntu}-${BASE_IMAGE_TAG:-22.04}-${PACKAGE_OPTION:-pip}-py${PYTHON_VERSION:-3.10}-ipex-${IPEX_VERSION:-2.3.0}-hf-${HF_VERSION:-4.41.2}
+  cmd: python -c "import transformers;print(f'transformers {transformers.__version__}');assert transformers.utils.import_utils.is_ipex_available()"
 import-cpu-inc-${PACKAGE_OPTION:-pip}:
   img: ${REGISTRY}/${REPO}:b-${GITHUB_RUN_NUMBER:-0}-${BASE_IMAGE_NAME:-ubuntu}-${BASE_IMAGE_TAG:-22.04}-${PACKAGE_OPTION:-pip}-py${PYTHON_VERSION:-3.10}-ipex-${IPEX_VERSION:-2.3.0}-oneccl-inc-${INC_VERSION:-2.6}
   cmd: python -c "'import neural_compressor as inc;print(inc.__version__)'"
```
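Each entry in `tests.yaml` follows the same import-smoke-test shape: run `python -c` inside the image, print the version, and assert a capability. The same pattern is shown below against a stdlib module (`sqlite3`) so it runs without the container; inside the image, the real check imports `transformers` and asserts IPEX availability:

```python
# Same smoke-test shape as the tests.yaml cmd entries, demonstrated with a
# stdlib module since transformers may not be installed on the host.
import sqlite3

print(f"sqlite3 {sqlite3.sqlite_version}")   # report the version
assert sqlite3.sqlite_version_info >= (3, 0)  # assert a minimal capability
```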
