GRU layer should have the batch_first=True flag #62
Comments
Hi @ra1995. Thank you for raising your concern. It has been more than 3 years since I developed this model, so I don't remember exactly how I was doing things. But after a brief look at the code, I agree with you. It does seem that batch_size was the first dimension, which seems common to me. Since the code did not break, I assume that this was probably the default in the PyTorch version used at the time, but I am not sure.
Yes, the model was not converging correctly on my custom dataset without the batch_first argument. After making the necessary changes, it's performing much better.
Oh, that's very interesting! @ra1995, can you please make a PR with these changes? (I could do it myself, but if you make a pull request, you will get the credit for finding this.)
Hi, I was going through the gesticulator codebase and using the GRU for speech feature encoding. I noticed that before the curr_speech input is passed to the GRU, its first dimension is the batch size and its second dimension is the temporal size. So, in my opinion, the GRU layer should be initialized with the batch_first=True flag. Please let me know if this is the case. Thank you for sharing your awesome work :)
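For anyone who wants to check this behaviour in isolation, here is a minimal sketch. It is not the gesticulator code: the tensor sizes and the variable names other than curr_speech are made up for illustration. It assumes an input laid out as (batch, time, features) and compares an nn.GRU built with the default batch_first=False against one built with batch_first=True. With the default, the first dimension is treated as time, so no error is raised, but the samples in a batch influence each other.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only (assumptions, not taken from the repository)
batch_size, seq_len, n_features, hidden_size = 4, 10, 26, 8

# Hypothetical speech features laid out as (batch, time, features)
curr_speech = torch.randn(batch_size, seq_len, n_features)

# Default: batch_first=False, so the input is read as (time, batch, features)
gru_default = nn.GRU(n_features, hidden_size)
# Proposed fix: the input is read as (batch, time, features)
gru_batch_first = nn.GRU(n_features, hidden_size, batch_first=True)

with torch.no_grad():
    _, h_default = gru_default(curr_speech)
    _, h_batch_first = gru_batch_first(curr_speech)

# h_n has shape (num_layers, batch, hidden). The default layer takes
# seq_len as the batch dimension, which reveals the misinterpretation.
print(h_default.shape)      # second dimension equals seq_len
print(h_batch_first.shape)  # second dimension equals batch_size

# Perturb only sample 0 and check whether the output for sample 1 changes.
perturbed = curr_speech.clone()
perturbed[0] += 1.0

def sample1_changes(gru):
    """Return True if the output for sample 1 depends on sample 0."""
    with torch.no_grad():
        out_a, _ = gru(curr_speech)
        out_b, _ = gru(perturbed)
    return not torch.allclose(out_a[1], out_b[1])

leaks_default = sample1_changes(gru_default)
leaks_batch_first = sample1_changes(gru_batch_first)
print(leaks_default, leaks_batch_first)
```

If the layer reads the dimensions correctly, the output for one sample should not depend on another sample in the same batch. This kind of cross-sample mixing could plausibly explain the convergence problems reported above, although that has not been verified here.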