Performance Regression in Whisper models when timestamp generation is enabled #1783

Open
MahmoudAshraf97 opened this issue Sep 18, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@MahmoudAshraf97
Contributor

Hello
Several reports mention that WER improves greatly when <|notimestamps|> is added to the initial prompt in Whisper decoding, i.e. when timestamp generation is disabled. I tested this using This and This. See mobiusml/faster-whisper#18 (comment) for an example of how decoding differs for the same encoder output.
There are several other reports including but not limited to:
SYSTRAN/faster-whisper#1010
SYSTRAN/faster-whisper#985
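
For context, the two decoding modes differ only in the initial prompt. The following is a minimal sketch of that difference, working on token strings rather than ids. `build_prompt` is a hypothetical helper written for illustration and is not part of CTranslate2 or faster-whisper.

```python
from __future__ import annotations

# Whisper special tokens that make up the initial decoder prompt.
SOT = "<|startoftranscript|>"
TRANSCRIBE = "<|transcribe|>"
NO_TIMESTAMPS = "<|notimestamps|>"


def build_prompt(language: str = "en", with_timestamps: bool = True) -> list[str]:
    """Return the initial decoder prompt as a list of token strings.

    Hypothetical helper: the only difference between the two modes is
    whether <|notimestamps|> is appended to the prompt.
    """
    prompt = [SOT, f"<|{language}|>", TRANSCRIBE]
    if not with_timestamps:
        # Appending this token disables timestamp generation.
        prompt.append(NO_TIMESTAMPS)
    return prompt


if __name__ == "__main__":
    print(build_prompt("en", with_timestamps=True))
    print(build_prompt("en", with_timestamps=False))
```

In practice the token strings would be converted to ids by the tokenizer and passed as the prompt to the Whisper model's generate call together with the encoder features, so both runs can share the same encoder output.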

Also, generation with timestamps has a lower throughput (tokens/s), and the slowdown grows as the batch size increases.
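
One way such a throughput comparison could be measured is sketched below. This is an assumption about methodology, not the script used for the numbers in this issue: `tokens_per_second` is a hypothetical helper, and `generate` stands for any callable that runs decoding for a given batch size and returns one list of token ids per sequence.

```python
import time
from typing import Callable, List


def tokens_per_second(
    generate: Callable[[int], List[List[int]]],
    batch_size: int,
    timer: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one call to `generate` and return generated tokens per second.

    `generate` is a stand-in for a real decoding call (hypothetical here);
    it receives the batch size and returns the generated token ids.
    """
    start = timer()
    outputs = generate(batch_size)
    elapsed = timer() - start
    total_tokens = sum(len(seq) for seq in outputs)
    # Guard against a zero interval on very coarse timers.
    return total_tokens / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Dummy generator for demonstration only: 10 tokens per sequence.
    def dummy_generate(batch_size: int) -> List[List[int]]:
        time.sleep(0.01)
        return [[0] * 10 for _ in range(batch_size)]

    for bs in (1, 4, 16):
        print(bs, round(tokens_per_second(dummy_generate, bs), 1))
```

Running the same measurement with and without <|notimestamps|> in the prompt, across several batch sizes, would show whether the gap widens with batch size.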

On a side note, we have several PRs waiting for review by @trungkienbkhn, but he seems to be out of office. It would be great if one of his colleagues could share when he might return.

minhthuc2502 added the enhancement label on Sep 20, 2024
@x86Gr

x86Gr commented Sep 23, 2024

FYI, when using "without_timestamps=True" on faster-whisper 1.0.3, I get faster transcription but a lot of skipped sentences.

@MahmoudAshraf97
Contributor Author

I tested this completely independently of faster-whisper to make sure the issue is purely related to CT2. faster-whisper v1.0.3 uses a batch size of 1, so you shouldn't notice slowdowns related to this option, as they only start to appear at larger batch sizes. The missing sentences are therefore probably caused by something else.
