_validate_snapshot_available() failing although torchsnapshot is available #876

Open
nubertj opened this issue Aug 6, 2024 · 1 comment

Comments

nubertj commented Aug 6, 2024

🐛 Describe the bug

When running my code with torchtnt and the TorchSnapshotSaver (torchsnapshot_saver.py), I get the following error after construction of the class:

RuntimeError: TorchSnapshotSaver support requires torchsnapshot. Please make sure ``torchsnapshot`` is installed. Installation: https://github.com/pytorch/torchsnapshot#install

The error is raised by _validate_snapshot_available() in torchsnapshot_saver.py.
However, torchsnapshot itself can be imported without any error.

Versions

I tried installing torchsnapshot and torchtnt from conda, pypi, and directly from the github repos. I always get this result.
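Since the result is the same across install sources, it may help to record exactly which versions end up in the environment. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical, and the distribution names are assumed to match the package names mentioned in this issue.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions taken from the package names in this issue.
for name in ("torchsnapshot", "torchtnt"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Posting this output alongside the error makes it easier to tell whether a release or a development build is being picked up.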

elrnv commented Sep 21, 2024

I also ran into this.
It seems that torchsnapshot_saver.py is importing override_max_per_rank_io_concurrency from torchsnapshot.knobs, which is only available on the main branch and not in the 0.1.0 release.
Perhaps the simplest solution is to release another version of torchsnapshot and constrain torchtnt to depend on it.
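To check whether this is what is happening in a given environment, one can probe the two imports separately. This is only a diagnostic sketch: the probe helper is hypothetical, and it assumes the availability check fails because of the missing name in torchsnapshot.knobs rather than because torchsnapshot itself is absent.

```python
import importlib


def probe(module_name, attr=None):
    """Import module_name and optionally look up attr; report which step fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return f"cannot import {module_name}: {exc}"
    if attr is not None and not hasattr(module, attr):
        return f"{module_name} imported, but it has no attribute {attr!r}"
    return "ok"


# The top-level package is reported as importable in this issue.
print(probe("torchsnapshot"))
# The name that, per the comment above, is missing from the 0.1.0 release.
print(probe("torchsnapshot.knobs", "override_max_per_rank_io_concurrency"))
```

If the first line prints ok and the second reports a missing module or attribute, the installed torchsnapshot predates the symbol that torchsnapshot_saver.py imports.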

Edit: In the short term, installing torchsnapshot with pip install --pre torchsnapshot-nightly worked for me.
