🚀 The feature
Optimized video decoding and frame reading at regular time intervals or via timestamp seeking, for CPU- and GPU-accelerated video preprocessing.
Motivation, pitch
There are many use cases in ML where it is beneficial to read specific frames from a video rather than every frame. One such case is running an image classification model on a video, where sampling one frame every x seconds is far more efficient than processing every frame. Currently, acquiring and resizing frames from a video at regular time intervals is very slow with all of the CPU and GPU options offered by PyTorch or torchvision. I've tried using NVIDIA's NVDEC as described in this tutorial and other examples: https://pytorch.org/audio/main/tutorials/nvdec_tutorial.html
NVDEC is only optimized for consecutive frame reads, though, and is slower than CPU decoding with a more optimized open-source library, decord: https://github.com/dmlc/decord
Decord on CPU is significantly faster than the CPU or GPU options offered by PyTorch, NVIDIA, or torchvision. However, it has not been updated in over three years, and its GPU support has not kept up with modern hardware. The community badly needs efficient video decoding options with access patterns optimized for timestamp seeking and regular time-interval frame reading; the only fast option available right now is a CPU-only library that has been unmaintained for three years.
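To make the interval-sampling pattern concrete, here is a minimal sketch. The `interval_frame_indices` helper is hypothetical (not part of any library); the commented decord usage follows decord's README API, but assumes decord is installed and a `video.mp4` file exists:

```python
def interval_frame_indices(fps, num_frames, interval_s):
    """Return the frame indices closest to t = 0, interval_s, 2*interval_s, ...

    fps: average frames per second of the video
    num_frames: total frame count of the video
    interval_s: desired sampling interval in seconds
    """
    step = fps * interval_s  # frames between consecutive samples
    indices = []
    t = 0.0
    while round(t) < num_frames:
        indices.append(int(round(t)))
        t += step
    return indices

# Hedged usage with decord (https://github.com/dmlc/decord), per its README:
#   from decord import VideoReader
#   vr = VideoReader("video.mp4")
#   idx = interval_frame_indices(vr.get_avg_fps(), len(vr), interval_s=2.0)
#   frames = vr.get_batch(idx)  # decodes only the sampled frames
```

The point of computing indices up front is that a decoder can then seek to keyframes near each target instead of decoding every intermediate frame, which is where the speedup over naive sequential reading comes from.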
Alternatives
A no-longer-maintained library (decord) that everyone is using but that doesn't work with modern GPUs.
Additional context
No response
skier233 changed the title from "GPU accelerated video loading with optimizations for reading at specific timestamps" to "GPU accelerated video loading with optimizations for reading at specific timestamps or time intervals" on Aug 17, 2024
Hi @skier233 ,
Thanks for opening this issue.
We have recently started https://github.com/pytorch/torchcodec, which is where we want to consolidate PyTorch's video decoding capabilities. The library is currently in beta, so there may still be rough edges, but once it matures we'll start deprecating the video decoders in torchvision in favor of torchcodec.
For now, torchcodec only supports CPU decoding and Linux. GPU support and macOS binaries should come soon.
If you try it, we'd love to hear any feedback you may have!
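For anyone wanting to try timestamp-based sampling with torchcodec, here is a rough sketch. The `sample_timestamps` helper is hypothetical, and the commented decoder calls are an assumption based on torchcodec's documentation; the exact API may differ from what is shown, so check the torchcodec docs:

```python
def sample_timestamps(duration_s, interval_s):
    """Timestamps 0, interval_s, 2*interval_s, ... strictly below duration_s."""
    n = int(duration_s // interval_s) + 1
    ts = [i * interval_s for i in range(n)]
    return [t for t in ts if t < duration_s]

# Hedged torchcodec usage (assumes a local "video.mp4"; API may differ):
#   from torchcodec.decoders import VideoDecoder
#   decoder = VideoDecoder("video.mp4")
#   frames = decoder.get_frames_played_at(
#       seconds=sample_timestamps(60.0, interval_s=2.0)
#   )
```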