From 2881f330f9ac61bf7bc02b08247262dcf1c1be77 Mon Sep 17 00:00:00 2001
From: Alvaro Moran
Date: Wed, 18 Dec 2024 16:21:08 +0000
Subject: [PATCH] doc(v6e): mention initial v6e support

---
 README.md                      | 6 +++---
 docs/source/howto/training.mdx | 5 ++++-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 70b38aa2..63aa887e 100644
--- a/README.md
+++ b/README.md
@@ -39,7 +39,7 @@ We currently support a few LLM models targeting text generation scenarios:
 ## Inference
 
 `optimum-tpu` provides a set of dedicated tools and integrations in order to leverage Cloud TPUs for inference, especially
-on the latest TPU version `v5e`.
+on the latest TPU versions `v5e` and `v6e`.
 
 Other TPU versions will be supported along the way.
 
@@ -64,8 +64,8 @@ To enable the support, export the environment variable `JETSTREAM_PT=1`.
 Fine-tuning is supported and tested on the TPU `v5e`. We have tested so far:
 
-- 🦙 Llama-2 7B and Llama-3 8B
-- 💎 Gemma 2B and 7B
+- 🦙 Llama-2 7B, Llama-3 8B, and newer
+- 💎 Gemma 2B and 7B
 
 You can check the examples:

diff --git a/docs/source/howto/training.mdx b/docs/source/howto/training.mdx
index 15c33c26..fe7a3d7f 100644
--- a/docs/source/howto/training.mdx
+++ b/docs/source/howto/training.mdx
@@ -4,15 +4,18 @@ Welcome to the 🤗 Optimum-TPU training guide! This section covers how to fine-
 ## Currently Supported Models
 
-The following models have been tested and validated for fine-tuning on TPU v5e:
+The following models have been tested and validated for fine-tuning on TPU `v5e` and `v6e`:
 
 - 🦙 LLaMA Family
   - LLaMA-2 7B
   - LLaMA-3 8B
+  - LLaMA-3.2 1B
 - 💎 Gemma Family
   - Gemma 2B
   - Gemma 7B
 
+Larger models are supported but have not yet been tested.
+
 ## Getting Started
 
 ### Prerequisites