
Commit 5d4ed87

Re-submit Intel TorchServe Chart (#103)

Authored by Tyler Titsworth
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
1 parent 8279154 commit 5d4ed87

File tree: 12 files changed, +504 -60 lines

pytorch/serving/README.md (-60 lines)

@@ -103,66 +103,6 @@ As demonstrated in the above example, models must be registered before they can

Removed:

### Simple MaaS on K8s

Using the provided [helm chart](../charts/inference), your model can scale to multiple nodes in Kubernetes (K8s). Once you have set your `KUBECONFIG` environment variable and can access your cluster, follow the instructions below to deploy your model as a service.

1. Install [Helm](https://helm.sh/docs/intro/install/)

    ```bash
    curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 && \
        chmod 700 get_helm.sh && \
        ./get_helm.sh
    ```

2. (Optional) Push TorchServe Image to a Private Registry

    If you added layers to an existing TorchServe container image in a [previous step](#test-model), use `docker push` to push that image to a private registry that your cluster can access.

3. Set up Model Storage

    Your model archive file will no longer be accessible from your local environment, so it needs to be added to a [PVC](https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/) backed by a network storage solution like [NFS](https://kubernetes.io/docs/concepts/storage/volumes/#nfs).
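
    As a minimal sketch, a claim like the one below could hold the model archive; the claim name `squeezenet` matches the install example in the next step, while `storageClassName: nfs-client` is an assumption and must name a storage class that actually exists in your cluster.

    ```yaml
    # Hypothetical PVC for model storage; adjust storageClassName to a
    # class available in your cluster (e.g. one backed by NFS).
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: squeezenet
    spec:
      accessModes:
        - ReadWriteMany        # multiple serving pods can read the same models
      storageClassName: nfs-client
      resources:
        requests:
          storage: 1Gi
    ```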

4. Install TorchServe Chart

    Using the provided [Chart README](../charts/inference/README.md), set the variables found in the table to match the expected model storage, cluster type, and model configuration for your service. The example below assumes that a PVC has been created with the squeezenet model in the root directory of the volume.

    ```bash
    helm install \
        --namespace=<namespace> \
        --set deploy.image=intel/intel-optimized-pytorch:2.2.0-serving-cpu \
        --set deploy.models='squeezenet=squeezenet1_1.mar' \
        --set deploy.storage.pvc.enable=true \
        --set deploy.storage.pvc.claimName=squeezenet \
        ipex-serving \
        ../charts/inference
    ```

5. Test Service

    By default the service is a `NodePort` service, accessible from the IP address of any node in your cluster. Find a node IP with `kubectl get node -o wide` and attempt to communicate with the service using the commands below:

    ```bash
    curl -X GET http://<your-node-ip>:30000/ping
    curl -X GET http://<your-node-ip>:30001/models
    ```

> [!NOTE]
> If you are behind a network proxy, you may need to unset `http_proxy` or add your nodes' IP addresses to `no_proxy` so that `curl` can reach the nodes in your cluster.

#### Next Steps

There are some additional steps that can be taken to prepare your service for your users:

- Enable [Autoscaling](https://github.com/pytorch/serve/blob/master/kubernetes/autoscale.md#autoscaler) via Prometheus.
- Enable [Intel GPU](https://github.com/intel/intel-device-plugins-for-kubernetes/blob/main/cmd/gpu_plugin/README.md#install-to-nodes-with-intel-gpus-with-fractional-resources).
- Enable [Metrics](https://pytorch.org/serve/metrics.html) and [Metrics API](https://pytorch.org/serve/metrics_api.html).
- Enable [Profiling](https://github.com/pytorch/serve/blob/master/docs/performance_guide.md#profiling).
- Export an [INT8 Model for IPEX](https://github.com/pytorch/serve/blob/f7ae6f8281ac6e26404a6ae4d210535c9dc96d9a/examples/intel_extension_for_pytorch/README.md#creating-and-exporting-int8-model-for-intel-extension-for-pytorch).
- Integrate an [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) with your service to serve to a hostname rather than an IP address.
- Integrate [MLflow](https://github.com/mlflow/mlflow-torchserve).
- Integrate an [SSL Certificate](https://pytorch.org/serve/configuration.html#enable-ssl) in your model config file to serve models securely.

### KServe

Apply Intel Optimizations to KServe by patching the serving runtimes to use Intel Optimized Serving Containers with `kubectl apply -f patch.yaml`; a sketch of such a patch follows.
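
The commit does not show `patch.yaml` itself, so the following is only a hedged sketch of what it might contain. The runtime name `kserve-torchserve` and the container args are assumptions based on KServe's stock TorchServe runtime; check what is actually installed with `kubectl get clusterservingruntimes` before applying.

```yaml
# Hypothetical patch.yaml: re-declare the TorchServe runtime so it pulls
# the Intel-optimized serving image instead of the stock one.
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: kserve-torchserve          # assumed name of the stock runtime
spec:
  supportedModelFormats:
    - name: pytorch
      version: "1"
      autoSelect: true
  containers:
    - name: kserve-container
      image: intel/intel-optimized-pytorch:2.3.0-serving-cpu
      args:                        # assumed to mirror the stock runtime args
        - torchserve
        - --start
        - --model-store=/mnt/models/model-store
        - --ts-config=/mnt/models/config/config.properties
```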

workflows/charts/torchserve/.helmignore (+23 lines)

@@ -0,0 +1,23 @@

# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

workflows/charts/torchserve/Chart.yaml (+42 lines)

@@ -0,0 +1,42 @@

# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v2
name: intel-torchserve
description: Intel TorchServe is a performant, flexible, and easy-to-use tool for serving PyTorch models in production.

# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
maintainers:
  - name: tylertitsworth
    email: tyler.titsworth@intel.com
    url: https://github.com/tylertitsworth
type: application

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.16.0"

workflows/charts/torchserve/README.md (+38 lines)

@@ -0,0 +1,38 @@

# Intel TorchServe

Intel TorchServe is a performant, flexible, and easy-to-use tool for serving PyTorch models in production.

![Version: 0.1.0](https://img.shields.io/badge/Version-0.1.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.16.0](https://img.shields.io/badge/AppVersion-1.16.0-informational?style=flat-square)

## Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| deploy.env | object | `{"configMapName":"intel-proxy-config","enabled":true}` | Add environment mapping |
| deploy.image | string | `"intel/intel-optimized-pytorch:2.3.0-serving-cpu"` | Intel-optimized TorchServe image |
| deploy.modelConfig | string | `"/home/model-server/config.properties"` | Model server configuration file location |
| deploy.models | string | `"all"` | Models to be loaded |
| deploy.replicas | int | `1` | Number of pods |
| deploy.resources.limits | object | `{"cpu":"4000m","memory":"1Gi"}` | Maximum resources per pod |
| deploy.resources.requests | object | `{"cpu":"1000m","memory":"512Mi"}` | Minimum resources per pod |
| deploy.storage.nfs | object | `{"enabled":false,"path":"nil","readOnly":true,"server":"nil","subPath":"nil"}` | Network File System (NFS) storage for models |
| fullnameOverride | string | `""` | Fully qualified name override |
| nameOverride | string | `""` | Name of the serving service |
| pvc.size | string | `"1Gi"` | Size of the model storage PVC |
| service.type | string | `"NodePort"` | Type of service |

## Next Steps

There are some additional steps that can be taken to prepare your service for your users:

- Enable [Autoscaling](https://github.com/pytorch/serve/blob/master/kubernetes/autoscale.md#autoscaler) via Prometheus.
- Enable [Intel GPU](https://github.com/intel/intel-device-plugins-for-kubernetes/blob/main/cmd/gpu_plugin/README.md#install-to-nodes-with-intel-gpus-with-fractional-resources).
- Enable [Metrics](https://pytorch.org/serve/metrics.html) and [Metrics API](https://pytorch.org/serve/metrics_api.html).
- Enable [Profiling](https://github.com/pytorch/serve/blob/master/docs/performance_guide.md#profiling).
- Export an [INT8 Model for IPEX](https://github.com/pytorch/serve/blob/f7ae6f8281ac6e26404a6ae4d210535c9dc96d9a/examples/intel_extension_for_pytorch/README.md#creating-and-exporting-int8-model-for-intel-extension-for-pytorch).
- Integrate an [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) with your service to serve to a hostname rather than an IP address.
- Integrate [MLflow](https://github.com/mlflow/mlflow-torchserve).
- Integrate an [SSL Certificate](https://pytorch.org/serve/configuration.html#enable-ssl) in your model config file to serve models securely.

----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.13.1](https://github.com/norwoodj/helm-docs/releases/v1.13.1)

workflows/charts/torchserve/README.md.gotmpl (+24 lines)

@@ -0,0 +1,24 @@

# Intel TorchServe

{{ template "chart.description" . }}

{{ template "chart.versionBadge" . }}{{ template "chart.typeBadge" . }}{{ template "chart.appVersionBadge" . }}

{{ template "chart.requirementsSection" . }}

{{ template "chart.valuesSection" . }}

## Next Steps

There are some additional steps that can be taken to prepare your service for your users:

- Enable [Autoscaling](https://github.com/pytorch/serve/blob/master/kubernetes/autoscale.md#autoscaler) via Prometheus.
- Enable [Intel GPU](https://github.com/intel/intel-device-plugins-for-kubernetes/blob/main/cmd/gpu_plugin/README.md#install-to-nodes-with-intel-gpus-with-fractional-resources).
- Enable [Metrics](https://pytorch.org/serve/metrics.html) and [Metrics API](https://pytorch.org/serve/metrics_api.html).
- Enable [Profiling](https://github.com/pytorch/serve/blob/master/docs/performance_guide.md#profiling).
- Export an [INT8 Model for IPEX](https://github.com/pytorch/serve/blob/f7ae6f8281ac6e26404a6ae4d210535c9dc96d9a/examples/intel_extension_for_pytorch/README.md#creating-and-exporting-int8-model-for-intel-extension-for-pytorch).
- Integrate an [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) with your service to serve to a hostname rather than an IP address.
- Integrate [MLflow](https://github.com/mlflow/mlflow-torchserve).
- Integrate an [SSL Certificate](https://pytorch.org/serve/configuration.html#enable-ssl) in your model config file to serve models securely.

{{ template "helm-docs.versionFooter" . }}

workflows/charts/torchserve/templates/NOTES.txt (+16 lines)

@@ -0,0 +1,16 @@

1. Get the application URL by running these commands:
{{- if contains "NodePort" .Values.service.type }}
  export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "torchserve.fullname" . }})
  export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
  echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.service.type }}
  NOTE: It may take a few minutes for the LoadBalancer IP to be available.
        You can watch its status by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "torchserve.fullname" . }}'
  export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "torchserve.fullname" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}")
  echo http://$SERVICE_IP:30000/ping
{{- else if contains "ClusterIP" .Values.service.type }}
  export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "torchserve.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}")
  export CONTAINER_PORT=$(kubectl get pod --namespace {{ .Release.Namespace }} $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
  echo "Visit http://127.0.0.1:8080 to use your application"
  kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:$CONTAINER_PORT
{{- end }}

workflows/charts/torchserve/templates/_helpers.tpl (+62 lines)

@@ -0,0 +1,62 @@

{{/*
Expand the name of the chart.
*/}}
{{- define "torchserve.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "torchserve.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "torchserve.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "torchserve.labels" -}}
helm.sh/chart: {{ include "torchserve.chart" . }}
{{ include "torchserve.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "torchserve.selectorLabels" -}}
app.kubernetes.io/name: {{ include "torchserve.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Create the name of the service account to use
*/}}
{{- define "torchserve.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "torchserve.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
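
As a quick illustration of those helpers: for a hypothetical release named `ipex-serving` with no `nameOverride` set, `torchserve.name` falls back to the chart name, and `torchserve.labels` would render roughly as follows.

```yaml
# Assumed rendering for release "ipex-serving" of chart intel-torchserve 0.1.0
helm.sh/chart: intel-torchserve-0.1.0
app.kubernetes.io/name: intel-torchserve
app.kubernetes.io/instance: ipex-serving
app.kubernetes.io/version: "1.16.0"
app.kubernetes.io/managed-by: Helm
```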

workflows/charts/torchserve/templates/deployment.yaml (+92 lines)

@@ -0,0 +1,92 @@

# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "torchserve.fullname" . }}
  labels:
    {{- include "torchserve.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.deploy.replicas }}
  selector:
    matchLabels:
      {{- include "torchserve.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "torchserve.selectorLabels" . | nindent 8 }}
    spec:
      containers:
        - name: torchserve
          image: {{ .Values.deploy.image }}
          args:
            - 'torchserve'
            - '--start'
            - '--ts-config'
            - {{ .Values.deploy.modelConfig }}
            - '--model-store'
            - 'model-store'
            - '--workflow-store'
            - 'model-store'
            - '--models'
            - {{ .Values.deploy.models }}
          {{- if eq .Values.deploy.env.enabled true }}
          envFrom:
            - configMapRef:
                name: {{ .Values.deploy.env.configMapName }}
          {{- end }}
          ports:
            - name: rest-1
              containerPort: 8080
            - name: rest-2
              containerPort: 8081
            - name: rest-3
              containerPort: 8082
            - name: grpc-1
              containerPort: 7070
            - name: grpc-2
              containerPort: 7071
          volumeMounts:
            {{- if .Values.deploy.storage.nfs.enabled }}
            - name: model
              mountPath: /home/model-server/model-store
              subPath: {{ .Values.deploy.storage.nfs.subPath }}
            {{- else }}
            - name: model
              mountPath: /home/model-server/model-store
            {{- end }}
          resources:
            requests:
              cpu: {{ .Values.deploy.resources.requests.cpu }}
              memory: {{ .Values.deploy.resources.requests.memory }}
            limits:
              cpu: {{ .Values.deploy.resources.limits.cpu }}
              memory: {{ .Values.deploy.resources.limits.memory }}
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      volumes:
        {{- if .Values.deploy.storage.nfs.enabled }}
        - name: model
          nfs:
            server: {{ .Values.deploy.storage.nfs.server }}
            path: {{ .Values.deploy.storage.nfs.path }}
            readOnly: {{ .Values.deploy.storage.nfs.readOnly }}
        {{- else }}
        - name: model
          persistentVolumeClaim:
            claimName: {{ include "torchserve.fullname" . }}-model-dir
        {{- end }}
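
Given that template, switching from the default PVC mount to NFS is a matter of flipping `deploy.storage.nfs.enabled` and pointing the chart at an export. A minimal, hypothetical values snippet (the server hostname and export path are placeholders, not defaults from the chart):

```yaml
# Hypothetical NFS storage configuration for this deployment template;
# with enabled=false the chart instead mounts the "<fullname>-model-dir" PVC.
deploy:
  storage:
    nfs:
      enabled: true
      server: nfs.example.internal   # assumed cluster-reachable NFS server
      path: /exports/models          # assumed export containing the model store
      subPath: model-store           # directory inside the export to mount
      readOnly: true
```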
