diff --git a/docs/api-inference/supported-models.md b/docs/api-inference/supported-models.md
index b8c82c207..26b45fe8f 100644
--- a/docs/api-inference/supported-models.md
+++ b/docs/api-inference/supported-models.md
@@ -13,9 +13,9 @@ You can find:

 In addition to thousands of public models available in the Hub, PRO and Enterprise users get higher [rate limits](./rate-limits) and free access to the following models:

-| Model                   | Size | Context Length | Use |
+| Model                   | Size | Supported Context Length | Use |
 |--------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------|--------------------------------------------------------------|
-| Meta Llama 3.1 Instruct | [8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), [70B](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct) | 128k tokens | High quality multilingual chat model with large context length |
+| Meta Llama 3.1 Instruct | [8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), [70B](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct) | 70B: 32k tokens / 8B: 8k tokens | High quality multilingual chat model with large context length |
 | Meta Llama 3 Instruct   | [8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), [70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | 8k tokens | One of the best chat models |
 | Llama 2 Chat            | [7B](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), [13B](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf), [70B](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf) | 4k tokens | One of the best conversational models |
 | Bark                    | [0.9B](https://huggingface.co/suno/bark) | - | Text to audio generation |
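
For context, the table this hunk edits describes models reachable through the Serverless Inference API. Below is a minimal sketch of querying one of the listed models, assuming the `huggingface_hub` Python client (`InferenceClient.chat_completion`) and a valid User Access Token; the token value and prompt are placeholders, not part of the doc.

```python
# Minimal sketch: calling one of the models from the table above via the
# Serverless Inference API using huggingface_hub's InferenceClient.
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # model id from the table
    token="hf_xxx",  # placeholder; use your own PRO/Enterprise token
)

# Chat-style request; max_tokens must stay within the supported context length.
response = client.chat_completion(
    messages=[{"role": "user", "content": "Give me a one-sentence summary of the Llama 3.1 release."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```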