#8552 migrated all 3.8 jobs to 3.9. It took a while and required a bunch of fixes. To avoid blocking it indefinitely, the PR was merged while the Windows CUDA unittests jobs were still failing (see #8552 (comment)).
So, the Windows CUDA unittests jobs are failing. And I don't know why.
File "C:\actions-runner\_work\vision\vision\pytorch\vision\test\smoke_test.py", line 113, in <module>
main()
File "C:\actions-runner\_work\vision\vision\pytorch\vision\test\smoke_test.py", line 85, in main
print(f"{torch.ops.image._jpeg_version() = }")
File "C:\Jenkins\Miniconda3\envs\ci\lib\site-packages\torch\_ops.py", line 1225, in __getattr__
raise AttributeError(
AttributeError: '_OpNamespace' 'image' object has no attribute '_jpeg_version'
More detailed error (modifying the import code to be more verbose):
torchvision\io\__init__.py:24: in <module>
from .image import (
torchvision\io\image.py:11: in <module>
_load_library("image")
torchvision\extension.py:89: in _load_library
torch.ops.load_library(lib_path)
C:\Jenkins\Miniconda3\envs\ci\lib\site-packages\torch\_ops.py:1350: in load_library
ctypes.CDLL(path)
C:\Jenkins\Miniconda3\envs\ci\lib\ctypes\__init__.py:374: in __init__
self._handle = _dlopen(self._name, mode)
E FileNotFoundError: Could not find module 'C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.
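The "(or one of its dependencies)" part of that message is ambiguous, so it can help to reproduce the load outside of torchvision and separate "file missing" from "file present but not loadable". A minimal sketch using only the standard library; `try_load` is a hypothetical helper name, not something from torch or torchvision:

```python
import ctypes
import os


def try_load(lib_path):
    """Attempt to load a shared library directly and report what happened.

    Returns (True, "") on success, or (False, message) when loading fails.
    """
    if not os.path.isfile(lib_path):
        return False, f"file does not exist: {lib_path}"
    try:
        ctypes.CDLL(lib_path)
    except OSError as exc:  # FileNotFoundError is a subclass of OSError
        return False, f"file exists but could not be loaded: {exc}"
    return True, ""
```

If `try_load` is given the `image.pyd` path from the traceback and reports that the file exists but could not be loaded, that would point at a dependency of `image.pyd` rather than at the file itself.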
Checking the dependencies of image.pyd gives:
runneruser@EC2AMAZ-HAC74MP /c/actions-runner/_work/vision/vision/pytorch/vision ((efd36a4...))
$ cygcheck.exe torchvision/image.pyd
C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\image.pyd C:\Jenkins\Miniconda3\envs\ci\Library\bin\libpng16.dll
C:\Jenkins\Miniconda3\envs\ci\zlib.dll C:\Windows\system32\VCRUNTIME140.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-runtime-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-heap-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-string-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-stdio-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-convert-l1-1-0.dll C:\Windows\system32\KERNEL32.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-rtlsupport-l1-1-0.dll C:\Windows\system32\ntdll.dll
C:\Windows\system32\KERNELBASE.dll C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\api-ms-win-eventing-provider-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-processthreads-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-processthreads-l1-1-1.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-heap-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-memory-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-handle-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-synch-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-synch-l1-2-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-file-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-file-l1-2-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-namedpipe-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-datetime-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-sysinfo-l1-2-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-sysinfo-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-timezone-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-localization-l1-2-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-processenvironment-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-string-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-debug-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-errorhandling-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-fibers-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-util-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-profile-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-file-l2-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-console-l1-1-0.dll C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-console-l1-2-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-math-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-filesystem-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-time-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\Library\bin\jpeg8.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-environment-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libwebp.dll
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libsharpyuv.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-utility-l1-1-0.dll
cygcheck: track_down: could not find nvjpeg64_11.dll
cygcheck: track_down: could not find c10.dll
cygcheck: track_down: could not find torch_cpu.dll
cygcheck: track_down: could not find cudart64_110.dll
cygcheck: track_down: could not find c10_cuda.dll
cygcheck: track_down: could not find torch_cuda.dll
C:\Windows\system32\MSVCP140.dll
C:\Windows\system32\VCRUNTIME140_1.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-locale-l1-1-0.dll
So it seems that nvjpeg and other CUDA dependencies cannot be found. In b4c05786e6a7f8f6e1a01d3f9c7ccaf7de1c6830 I removed building with nvjpeg support, and could confirm that the import failure went away.
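The cygcheck output is long, so pulling out only the unresolved names makes it easier to compare runs. A small sketch, assuming the `cygcheck: track_down: could not find ...` line format shown above; `missing_dlls` is a hypothetical helper, and the sample is a few lines copied from that output:

```python
import re

# Matches the "could not find" lines emitted by cygcheck in the output above.
MISSING_RE = re.compile(r"cygcheck: track_down: could not find (\S+)")


def missing_dlls(cygcheck_output):
    """Return the DLL names cygcheck could not find, in order, without duplicates."""
    seen = []
    for name in MISSING_RE.findall(cygcheck_output):
        if name not in seen:
            seen.append(name)
    return seen


sample = r"""
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-utility-l1-1-0.dll
cygcheck: track_down: could not find nvjpeg64_11.dll
cygcheck: track_down: could not find c10.dll
cygcheck: track_down: could not find torch_cpu.dll
cygcheck: track_down: could not find cudart64_110.dll
cygcheck: track_down: could not find c10_cuda.dll
cygcheck: track_down: could not find torch_cuda.dll
C:\Windows\system32\MSVCP140.dll
"""

print(missing_dlls(sample))
```

On this sample it lists the six names reported above: the nvjpeg and cudart DLLs plus the four torch libraries.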
This seems to suggest that adding "/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.8/bin" to the PATH could prevent the problem. But I just tried that by ssh-ing into the machine, and I'm still getting the same error. I confirmed the PATH was OK by running cygcheck again, which now finds nvjpeg64_11.dll:
...
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libwebp.dll
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libsharpyuv.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-utility-l1-1-0.dll
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\nvjpeg64_11.dll
cygcheck: track_down: could not find c10.dll
cygcheck: track_down: could not find torch_cpu.dll
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\cudart64_110.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-interlocked-l1-1-0.dll
cygcheck: track_down: could not find c10_cuda.dll
cygcheck: track_down: could not find torch_cuda.dll
C:\Windows\system32\MSVCP140.dll
C:\Windows\system32\VCRUNTIME140_1.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-locale-l1-1-0.dll
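One possible explanation, not verified on this runner: since Python 3.8, stock CPython on Windows no longer consults PATH when resolving the dependencies of extension modules or of DLLs loaded through ctypes. Only system directories, the directory of the loaded file, and directories registered with `os.add_dll_directory` are searched. cygcheck does use PATH, so the two can disagree. A minimal sketch of registering the CUDA directory before the import, assuming that this is the cause; `register_dll_dirs` is a hypothetical helper, and the directory is the one from the cygcheck output above:

```python
import os


def register_dll_dirs(dirs):
    """Register existing directories as DLL search locations on Windows.

    Returns the directories that were registered. On platforms without
    os.add_dll_directory (anything but Windows), nothing is registered.
    """
    registered = []
    add_dir = getattr(os, "add_dll_directory", None)  # Windows-only, Python 3.8+
    if add_dir is None:
        return registered
    for d in dirs:
        if os.path.isdir(d):  # add_dll_directory raises on missing directories
            add_dir(os.path.abspath(d))
            registered.append(d)
    return registered


# Assumed location, taken from the cygcheck output above.
CUDA_BIN = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin"
print(register_dll_dirs([CUDA_BIN]))
# import torchvision  # would follow here on the CI machine
```

If the import still fails after this, PATH handling is probably not the culprit and the missing dependency is something else.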
logs: https://github.com/pytorch/vision/actions/runs/10699721178/job/29661914922?pr=8623
CC @atalman @malfet