Best settings for speed in transformer/NER model in cloud GPU #9039

Sorry for the delayed reply on this, it's kind of a tricky question to answer.

The easiest way to make training on GPU faster is to reduce the number of training iterations. Obviously that also decreases the quality of your model.
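In spaCy's training config, the length of training is controlled in the `[training]` block. A minimal sketch, with illustrative values rather than recommendations:

```ini
[training]
# 0 = no epoch limit; training length is governed by max_steps instead
max_epochs = 0
# lower this to train faster, at some cost in model quality
max_steps = 20000
# stop early if the dev score hasn't improved for this many steps
patience = 1600
```

With `patience` set, training can also end early on its own once the score plateaus, which often recovers much of the speed without an explicit cut to `max_steps`.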

Without sacrificing training iterations, your best bet is to maximize memory use at any given time, so that you cover more training examples with fewer batches. For that, use the largest batch size you can get away with. Other parameters can make a difference too, but how they interact with speed is harder to pin down.
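In a spaCy training config, batch size is set in the `[training.batcher]` block. A sketch assuming the default word-count batcher with a compounding size schedule (the numbers here are the stock defaults, not tuned values): raise `stop` (and `start`) until you approach your GPU's memory limit.

```ini
[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2

[training.batcher.size]
@schedules = "compounding.v1"
start = 100
stop = 1000
compound = 1.001
```

The schedule grows the batch size from `start` toward `stop` as training progresses, so the `stop` value is what determines peak memory use.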

An SSD should not make a significant difference - usually the volume of data is small and it'll just…

Answer selected by svlandeg
Labels: training (Training and updating models), feat / ner (Feature: Named Entity Recognizer), perf / speed (Performance: speed), feat / transformer (Feature: Transformer)