A Rust, Python, and gRPC server for text generation inference by Hugging Face, running on Intel GPUs.
For more information about using Hugging Face text-generation-inference with Intel optimizations, check out Hugging Face's documentation.
> **Tip**
>
> For Gaudi-related documentation, check out tgi-gaudi.
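The chart installs with Helm in the usual way. A minimal sketch, assuming a local checkout of the chart and a release named `tgi` (both names are illustrative, not documented by the chart):

```bash
# Minimal install sketch. The release name "tgi" and the chart path
# ./text-generation-inference are assumptions; substitute your own.
helm install tgi ./text-generation-inference \
  --namespace tgi \
  --create-namespace
```

Any of the values in the table below can then be overridden per release with `--set` or a values file.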
Key | Type | Default | Description |
---|---|---|---|
deploy.configMap | object | `{"enabled":true,"name":"tgi-config"}` | ConfigMap of environment variables |
deploy.image | string | `"ghcr.io/huggingface/text-generation-inference:latest-intel"` | Intel TGI image |
deploy.replicaCount | int | `1` | Number of pods |
deploy.resources | object | `{"limits":{"cpu":"4000m","gpu.intel.com/i915":1},"requests":{"cpu":"1000m","memory":"1Gi"}}` | Resource configuration |
deploy.resources.limits."gpu.intel.com/i915" | int | `1` | Intel GPU device configuration |
fullnameOverride | string | `""` | Fully qualified domain name |
ingress | object | `{"annotations":{},"className":"","enabled":false,"hosts":[{"host":"chart-example.local","paths":[{"path":"/","pathType":"ImplementationSpecific"}]}],"tls":[]}` | Ingress configuration |
nameOverride | string | `""` | Name of the serving service |
pvc.size | string | `"15Gi"` |  |
pvc.storageClassName | string | `"nil"` |  |
secret.encodedToken | string | `""` | Base64-encoded Hugging Face Hub API token (see the example below the table) |
securityContext | object | `{}` | Security context configuration |
service | object | `{"port":80,"type":"NodePort"}` | Service configuration |
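As a worked example of overriding values at install time, the sketch below base64-encodes a Hugging Face Hub token and changes a few defaults. The release name, chart path, and `HF_TOKEN` variable are assumptions for illustration:

```bash
# Override sketch: encode a plain-text Hub token for secret.encodedToken
# and adjust replicas and service type. HF_TOKEN is a placeholder
# environment variable; "tgi" and the chart path are assumed names.
helm upgrade --install tgi ./text-generation-inference \
  --set secret.encodedToken="$(printf '%s' "$HF_TOKEN" | base64)" \
  --set deploy.replicaCount=2 \
  --set service.type=ClusterIP
```

`printf '%s'` is used instead of `echo` so that no trailing newline ends up in the encoded token; the same keys can equally be set in a values file passed with `-f`.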
Autogenerated from chart metadata using helm-docs v1.14.2