# GPU Sharing Example

This is an example of a truss that shares a GPU across multiple models. This truss serves
three models: a text-to-image model, an image-to-image model, and an in-painting model.
A specific model can be targeted by providing `model_sub_name` in the request dictionary. Please
refer to `examples.yaml` for examples.
A model is loaded into GPU memory when needed. When a different model is invoked, the previous
one is offloaded and replaced with the new one. All models remain in regular (CPU) memory though, so that
moving them back to GPU memory later is fast.
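The offloading scheme described above can be sketched as follows. This is a minimal illustration, not the actual implementation in this truss: the `GPUModelSwapper` class, the duck-typed `to(device)` convention, and the device names are all assumptions for the sake of the example.

```python
class GPUModelSwapper:
    """Keeps every model resident in CPU memory and moves only the
    currently requested model onto the GPU, offloading the previous one."""

    def __init__(self, models):
        self._models = models      # name -> model object, all starting on CPU
        self._active_name = None   # name of the model currently on the GPU

    def predict(self, model_sub_name, **inputs):
        if model_sub_name != self._active_name:
            if self._active_name is not None:
                # Offload the previously active model back to CPU memory.
                self._models[self._active_name].to("cpu")
            # Moving the requested model to the GPU is fast because its
            # weights are already loaded in regular (CPU) memory.
            self._models[model_sub_name].to("cuda")
            self._active_name = model_sub_name
        return self._models[model_sub_name](**inputs)
```

Only one model occupies GPU memory at a time, which is what lets several large models share a single GPU at the cost of a swap on each model change.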

## Usage
Like any other truss, you can load this truss, test it in memory, and then deploy it to a cloud provider as follows:
```
import truss

models = truss.load("./")  # This loads all of the models
models.predict({
    "model_sub_name": "text_img",  # Name of the model to sub into the GPU
    "prompt": "red dog",  # The rest of the kwargs are passed as input to that model
})

import baseten
baseten.deploy(models, model_name="Combo GPU Model", publish=True)
```

To use the included examples, please run `git lfs pull` before loading them via `models.examples()`.

## Adding another model
To add another model, do the following:
1. Create a new file with your model class under the `model/` directory. We recommend that you copy [an existing model](./model/text_img_model.py) and modify it to suit your needs.
2. Add the model to the registry in [`model/model.py`](./model/model.py#L16). The key you use here is what needs to be passed as `model_sub_name` to invoke that model.
3. Update `config.yaml` to include any additional system or Python requirements.
4. Follow the usage section above for testing and deploying.
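The registry pattern in step 2 might look like the sketch below. The class names and `resolve` helper are hypothetical; only the dictionary-of-models idea and the `model_sub_name` keys reflect this truss.

```python
# Hypothetical stand-ins for the model classes under model/;
# the real classes in this repo may be named differently.
class TextImgModel:
    pass

class ImgImgModel:
    pass

class InpaintModel:
    pass

# The registry maps each `model_sub_name` accepted in requests
# to the class that implements it.
MODEL_REGISTRY = {
    "text_img": TextImgModel,
    "img_img": ImgImgModel,
    "in_paint": InpaintModel,
}

def resolve(model_sub_name):
    """Look up a model class by its request key, failing loudly on typos."""
    try:
        return MODEL_REGISTRY[model_sub_name]
    except KeyError:
        raise ValueError(f"Unknown model_sub_name: {model_sub_name!r}")
```

Adding a fourth model is then just a new class file plus one new registry entry.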

## Custom Weights
All the example models load their weights at runtime, inside `load`. You may instead want to bundle custom weights with the model for a variety of reasons. To do so, do the following:
1. Create a `data/` directory parallel to `model/`.
2. Add the weights there for any of the models. You can make different sub-directories or organize them as you wish.
3. In the model file for a single model, use the following line to get a reference to this directory:
   ```
   self._data_dir = kwargs["data_dir"]
   ```
4. In the `load` function, use this variable to load the weights. Here is an example using the diffusers library:
   ```
   self._model = StableDiffusionPipeline.from_pretrained(str(self._data_dir / "path" / "to" / "weights")).to("cuda")
   ```
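Putting steps 1 through 4 together, a single model class might look like this sketch. The class name and the `path/to/weights` layout are illustrative, and the diffusers call is left as a comment because it needs a GPU and real weight files to run.

```python
from pathlib import Path

class CustomWeightsModel:
    """Sketch of a model that loads bundled weights from the truss data/ dir."""

    def __init__(self, **kwargs):
        # Truss passes the path of the data/ directory in kwargs (step 3).
        self._data_dir = Path(kwargs["data_dir"])
        self._model = None

    def load(self):
        # Resolve the bundled weights relative to data/ (step 4).
        weights_dir = self._data_dir / "path" / "to" / "weights"
        # With the diffusers library, loading would look like:
        # self._model = StableDiffusionPipeline.from_pretrained(str(weights_dir)).to("cuda")
        return weights_dir
```

Because the weights ship inside the truss, `load` never touches the network, which makes cold starts more predictable.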