Llama 3.1: The output text is truncated #1153
Comments
While there are many possible issues with the environment that I cannot diagnose remotely, one common cause is the model configuration: the maximum token limit set for generation. If the max_gen_len parameter is set to a low value, the output is cut off once that limit is reached, and the example source code sets this parameter to 64 by default.
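For anyone hitting this with the reference repo's example scripts, here is a minimal sketch of raising that limit. It assumes the repo's `Llama` API and the defaults from example_text_completion.py; the checkpoint paths and prompt are placeholders:

```python
# Minimal sketch: raise the generation cap via the reference repo's API.
# Run under torchrun as in the repo README, e.g.:
#   torchrun --nproc_per_node 1 extend_gen_len.py
from llama import Llama

generator = Llama.build(
    ckpt_dir="Meta-Llama-3.1-8B/",                        # placeholder path
    tokenizer_path="Meta-Llama-3.1-8B/tokenizer.model",   # placeholder path
    max_seq_len=1024,    # prompt tokens + generated tokens must fit here
    max_batch_size=1,
)

results = generator.text_completion(
    ["Explain the difference between a process and a thread."],
    max_gen_len=512,     # raise from the examples' 64-token default
    temperature=0.6,
    top_p=0.9,
)
print(results[0]["generation"])
```

Note that max_gen_len only controls how many tokens are generated; the total of prompt plus generation is still bounded by max_seq_len, so both may need to be raised.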
Running on a CPU may also be a reason for truncated output. A large language model is resource-intensive on a CPU, and if the system runs out of memory or CPU headroom, the output may be truncated to avoid crashes or excessive lag.
Anyone managed to solve this?
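If you are loading the model through the Hugging Face transformers library (the Hub id meta-llama/Meta-Llama-3.1-8B listed below) rather than the reference repo, the equivalent knob is max_new_tokens. A minimal sketch, assuming that library; the prompt and sampling values are placeholders:

```python
# Minimal sketch: extend generation length with Hugging Face transformers.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B",  # gated model; requires HF access
    device_map="auto",                     # uses GPU(s) if available, else CPU
)

out = pipe(
    "Explain the difference between a process and a thread.",
    max_new_tokens=512,  # output is cut off once this many new tokens exist
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(out[0]["generated_text"])
```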
Describe the bug
Found a similar issue with Llama 2 #717, but this is for Llama 3.1.
The output text is cut off, so I cannot see the entire result.
Is there a way to extend the max length of the output text? What is the default max length?
Minimal reproducible example
Output
Runtime Environment
Model: meta-llama/Meta-Llama-3.1-8B