How do CUDA weights get transferred to PyTorch? #432
Replies: 2 comments
-
Maybe the overloading of the persistent weights causes some trouble, but in principle, as long as you set the weights in `update_once`, they should be reflected in the Python code. Using the persistent-weight functionality for another purpose can be troublesome, though, since you need to carefully override all base methods that assume the persistent weight is something else. Maybe you forgot something there? Also, having 60+ global parameters is maybe not a good idea; there is actually a limit on the number. Do you really need that many? They will all be transferred to the GPU and will slow down the calculation. I am not sure whether you want to make your new device a public contribution to the code base, but if you do, you could open a PR and I could take a look at where there might be problems and give some suggestions. CUDA programming can be tricky.
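A quick way to check this from the Python side is a round-trip test: read the weights, run one update, and read them again. Here is a minimal sketch using a plain `torch.nn.Linear` as a stand-in for the analog tile (nothing aihwkit-specific; the layer and optimizer choices are illustrative only):

```python
import torch

# Stand-in for the analog layer; in aihwkit you would read the weights
# back from the tile instead of from the nn.Parameter directly.
layer = torch.nn.Linear(4, 4, bias=False)
before = layer.weight.detach().clone()

# One update step; if the update path works, the weights read back
# from Python afterwards must differ from the snapshot taken before.
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
loss = layer(torch.ones(1, 4)).sum()
loss.backward()
opt.step()

after = layer.weight.detach().clone()
print(bool(torch.any(before != after)))  # True: the update is visible
```

If the equivalent check against your device returns `False`, the update is happening in a buffer that the Python-visible weights are not read back from.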
-
Dear @maljoras We do intend to make my new device publicly available, but ideally after we publish it at a conference in October. (It’s a shame that the window for adding authors has passed, I would have gladly added you for the contribution.) I have created a mirror repo on GitHub and sent you an invite. Could you please help me check on it?
I override these functions and make these arrays protected so that I can modify them in the child device class.
I also double-checked the unnecessary global parameters and shrank the number to 53. They are still needed, though, since this is a very complex model. Regarding this, I also see that in the PiecewiseStep device the global_params_count is required to be a multiple of two. Is that a CUDA requirement, or is it just for the convenience of the PiecewiseStep look-up tables?
In the long term, I do wish to merge this into the master branch of aihwkit to promote ease of use, but there might be a few legal issues to sort out first. We are also trying to go through the steps to establish a collaboration with the Neuromorphic Devices and Systems group at IBM Research Zürich. We are planning to use aihwkit with this new model to fit their RRAM devices and explore better programming algorithms at the edge. I hope the paperwork goes through and all of us can benefit from this eventually. Thanks for all the help along the way!
-
Dear @maljoras
Thanks for answering all my questions.
I was debugging my CUDA device when I noticed a weird behavior: my weights are not being updated.
And I got:
So I went to my UpdateFunctor and tried to print out the weights there:
And I was surprised to see that my weights do get updated internally.
How could it behave this way? Is there an intermediate step to transfer the weights from CUDA to PyTorch that I am missing?
The CPU version of the exact same PyTorch setup works just fine, so as far as I understand, the config parameters should be correct as well.
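To illustrate the kind of behavior I am seeing, here is a minimal plain-PyTorch sketch (no aihwkit involved; every name here is illustrative) of a copy-vs-view mismatch, where an update done in a backend buffer is visible through a shared view but not through a snapshot taken earlier:

```python
import torch

# Stand-in for the C++/CUDA weight buffer held by the backend.
backend = torch.zeros(2, 2)
snapshot = backend.clone()    # a copy taken earlier, e.g. at construction
view = backend.view(2, 2)     # shares storage with the backend buffer

# The "update kernel" modifies the backend buffer in place.
backend.add_(1.0)

print(snapshot.sum().item())  # 0.0 -- copies never see later updates
print(view.sum().item())      # 4.0 -- views share storage and see them
```

Is my Python-side print perhaps looking at a stale copy like `snapshot` here, rather than at the tile's live buffer?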
Thank you so much for your patience.
Zhenming