From 429f7bdd5b296a11f9da23120dfe33fd24293334 Mon Sep 17 00:00:00 2001
From: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com>
Date: Thu, 19 Sep 2024 10:57:37 +0200
Subject: [PATCH] Set context length in supported-models.md to the length that
 we actually serve in the API (#1417)

* Change the listed context length to the context length the API actually serves
---
 docs/api-inference/supported-models.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/api-inference/supported-models.md b/docs/api-inference/supported-models.md
index b8c82c207..26b45fe8f 100644
--- a/docs/api-inference/supported-models.md
+++ b/docs/api-inference/supported-models.md
@@ -13,9 +13,9 @@ You can find:
 In addition to thousands of public models available in the Hub, PRO and Enterprise users get higher [rate limits](./rate-limits) and free access to the following models:
 
 
-| Model                          | Size                                                                                                                                                                                       | Context Length | Use                                                          |
+| Model                          | Size                                                                                                                                                                                       | Supported Context Length | Use                                                          |
 |--------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------|--------------------------------------------------------------|
-| Meta Llama 3.1 Instruct  | [8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), [70B](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)                                                      | 128k tokens      | High quality multilingual chat model with large context length |
+| Meta Llama 3.1 Instruct  | [8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), [70B](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)                                                      | 70B: 32k tokens / 8B: 8k tokens | High quality multilingual chat model with large context length |
 | Meta Llama 3 Instruct          | [8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), [70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)                                                       | 8k tokens      | One of the best chat models                                  |
 | Llama 2 Chat                   | [7B](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), [13B](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf), [70B](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf) | 4k tokens      | One of the best conversational models                        |
 | Bark                           | [0.9B](https://huggingface.co/suno/bark)                                                                                                                                                   | -              | Text to audio generation                                     |