Fine tuning pre-trained LLM for language translation and to build ChatGPT like application #334

Glad you liked the book!

Regarding your first question: It depends on the LLM and the languages involved, but I'd say this is relatively straightforward. The key is that the languages need to have been present in the pretraining dataset used to create the tokenizer and the pretrained LLM. Then, finetuning it for language translation (using the technique from Ch07) is relatively easy. The reason is that, otherwise, the tokenizer will break up a word into too many subtokens -- it will still work, but it's not ideal. Base models that support multiple languages are, for example, Qwen 2 (~20 languages) and Llama 3.1 (~8 languages). (It's also possible to extend existing tokenizers with new tokens, but this is a separate…
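
To make the subtoken point concrete, here is a minimal sketch (assuming the tiktoken GPT-2 tokenizer used in the book; the example words are just illustrative) that prints how many subtokens each word gets split into:

```python
# Minimal sketch: compare how many subtokens the GPT-2 BPE tokenizer
# (used in the book via tiktoken) produces for words from different languages.
# Words well covered by the tokenizer's vocabulary map to few subtokens;
# words from underrepresented languages tend to get fragmented.
import tiktoken

tokenizer = tiktoken.get_encoding("gpt2")

for word in ["translation", "Übersetzung", "翻訳"]:  # illustrative examples
    token_ids = tokenizer.encode(word)
    print(f"{word!r}: {len(token_ids)} subtokens -> {token_ids}")
```

And for the finetuning step itself, a translation pair can be phrased as an instruction-finetuning example in the same instruction/input/output format used in Ch07 (the concrete sentences below are made-up placeholders):

```python
# Hypothetical translation example in the instruction/input/output
# format from Ch07; the sentences are placeholders.
entry = {
    "instruction": "Translate the following sentence into German.",
    "input": "The weather is nice today.",
    "output": "Das Wetter ist heute schön.",
}
```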
