Commit 92cd749

Browse files
authored
Document GPU Sharing Example (#261)
examples/gpu-sharing/README.md

# GPU Sharing Example
This is an example of a truss that shares a GPU across multiple models. This truss serves
three models: a text-to-image model, an image-to-image model, and an in-painting model.
A specific model can be targeted by providing `model_sub_name` in the request dictionary. Please
refer to `examples.yaml` for examples.

A model is loaded into GPU memory when needed. When a different model is invoked, the previous
one is offloaded and replaced with the new one. All models remain in regular (CPU) memory, so
moving them back to GPU memory later is faster.
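The swap-on-demand behavior described above can be sketched as follows. This is a minimal illustration of the idea, not the repo's actual implementation: the `DummyModel` class and its `.to(device)` method are hypothetical stand-ins for real GPU models.

```python
# Minimal sketch of swap-on-demand GPU sharing. All models stay resident
# in CPU memory; only the requested one occupies the GPU at a time.

class DummyModel:
    """Hypothetical stand-in for a real model with device placement."""

    def __init__(self, name):
        self.name = name
        self.device = "cpu"  # every model starts in regular memory

    def to(self, device):
        self.device = device  # stand-in for moving weights between devices
        return self


class ModelSwapper:
    def __init__(self, models):
        self._models = models  # name -> model, all kept in CPU memory
        self._active = None    # name of the model currently on the GPU

    def predict(self, request):
        sub_name = request.pop("model_sub_name")
        if self._active != sub_name:
            if self._active is not None:
                # Offload the previously active model back to CPU memory.
                self._models[self._active].to("cpu")
            # Bring the requested model into GPU memory.
            self._models[sub_name].to("cuda")
            self._active = sub_name
        # Remaining kwargs are the input for the selected model.
        return self._models[sub_name], request


swapper = ModelSwapper(
    {n: DummyModel(n) for n in ("text_img", "img_img", "inpaint")}
)
model, kwargs = swapper.predict({"model_sub_name": "text_img", "prompt": "red dog"})
```

Because the offloaded models keep their weights in CPU memory, swapping a model back in avoids re-reading weights from disk.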
## Usage
Like any other truss, you can load this truss and test it in-memory, then deploy it to a cloud provider as follows:

```python
import truss

models = truss.load("./")  # This loads all of the models
models.predict({
    "model_sub_name": "text_img",  # Name of the model to sub into GPU
    "prompt": "red dog",  # The rest of the kwargs are passed as input to that model
})

import baseten
baseten.deploy(models, model_name="Combo GPU Model", publish=True)
```
To use the included examples, please run `git lfs pull` before loading them via `models.examples()`.
## Adding another model
To add another model, do the following:
1. Create a new file with your model class under the `model/` directory. We recommend that you copy [an existing model](./model/text_img_model.py) and modify it to suit your needs.
2. Add the model to the registry in [`model/model.py`](./model/model.py#L16). The key you use here is what needs to be passed in as `model_sub_name` to invoke the model.
3. Update `config.yaml` to include any additional system or Python requirements.
4. Follow the Usage section above for testing and deploying.
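The registry step above might look like the following sketch. The class names and registry shape here are illustrative; check [`model/model.py`](./model/model.py#L16) for the actual structure used in this repo.

```python
# Hypothetical sketch of the model registry after adding a new model.
# The dictionary key is what callers pass as `model_sub_name`.

class TextImgModel:
    """Existing text-to-image model (stand-in for the real class)."""


class MyNewModel:
    """Your newly added model class (stand-in)."""


MODEL_REGISTRY = {
    "text_img": TextImgModel,        # existing entry
    "my_new_model": MyNewModel,      # new entry: key becomes its model_sub_name
}


def resolve(sub_name):
    # Look up the requested class and instantiate it.
    return MODEL_REGISTRY[sub_name]()
```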
## Custom Weights
All the examples in this section load the weights at runtime, inside `load`. You may want to bundle custom weights with the model instead for a variety of reasons. To do so, do the following:
1. Create a `data/` directory parallel to `model/`.
2. Add the weights there for any of the models. You can make different sub-directories or organize as you wish.
3. In the model file for a single model, use the following line to get a reference to this folder:
```python
self._data_dir = kwargs["data_dir"]
```
4. In the `load` function, use this variable to load the weights. Here is an example using the diffusers library:
```python
self._model = StableDiffusionPipeline.from_pretrained(str(self._data_dir / "path" / "to" / "weights")).to("cuda")
```
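Putting the steps above together, a single model class wired up for bundled weights might look like this sketch. The class name, the `sd-weights` sub-directory, and the stubbed-out pipeline call are all illustrative assumptions, not the repo's actual code.

```python
# Sketch of a model class that loads bundled weights from data/.
from pathlib import Path


class TextImgModel:
    def __init__(self, **kwargs):
        # Reference to the truss's data/ directory, provided at construction.
        self._data_dir = Path(kwargs["data_dir"])
        self._model = None

    def load(self):
        # Hypothetical layout: weights were placed under data/sd-weights/.
        weights_path = self._data_dir / "sd-weights"
        # In a real model this would be something like:
        #   self._model = StableDiffusionPipeline.from_pretrained(
        #       str(weights_path)).to("cuda")
        # Stand-in so the sketch runs without a GPU or the diffusers library:
        self._model = str(weights_path)
```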
