🚀 The feature
Optimized video decoding and frame reading at regular time intervals or via timestamp seeking, for CPU- and GPU-accelerated video preprocessing.
Motivation, pitch
There are many use cases in ML where it is beneficial to read specific frames from a video rather than every frame. One such case is running an image classification model on a video, where sampling one frame every x seconds is far more efficient than processing every frame. Currently, acquiring and resizing frames from a video at regular time intervals is very slow with all of the CPU and GPU options offered by PyTorch or torchvision. I've tried using NVIDIA's NVDEC as described in this tutorial and other examples: https://pytorch.org/audio/main/tutorials/nvdec_tutorial.html
NVDEC is only optimized for consecutive frame reads, though, and is slower than CPU decoding with a more optimized open-source library, decord: https://github.com/dmlc/decord
Decord on CPU is significantly faster than the CPU or GPU options offered by PyTorch, NVIDIA, or torchvision. However, it has not been updated in over three years, and its GPU support has not kept up with modern hardware. The community badly needs efficient video decoding options with access patterns optimized for timestamp seeking and regular time-interval frame reading; the only fast option available right now is a CPU-only library that has been unmaintained for three years.
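To make the interval-sampling pattern concrete, here is a minimal sketch. The `interval_frame_indices` helper is hypothetical (not part of any library); the commented decord usage follows decord's README API, but assumes decord is installed and a `video.mp4` file exists:

```python
def interval_frame_indices(fps, num_frames, interval_s):
    """Return the frame indices closest to t = 0, interval_s, 2*interval_s, ...

    fps: average frames per second of the video
    num_frames: total frame count of the video
    interval_s: desired sampling interval in seconds
    """
    step = fps * interval_s  # frames between consecutive samples
    indices = []
    t = 0.0
    while round(t) < num_frames:
        indices.append(int(round(t)))
        t += step
    return indices

# Hedged usage with decord (https://github.com/dmlc/decord), per its README:
#   from decord import VideoReader
#   vr = VideoReader("video.mp4")
#   idx = interval_frame_indices(vr.get_avg_fps(), len(vr), interval_s=2.0)
#   frames = vr.get_batch(idx)  # decodes only the sampled frames
```

The point of computing indices up front is that a decoder can then seek to keyframes near each target instead of decoding every intermediate frame, which is where the speedup over naive sequential reading comes from.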
Alternatives
A no-longer-maintained library (decord) that everyone is using but that doesn't work with modern GPUs.
Additional context
No response
skier233 changed the title from "GPU accelerated video loading with optimizations for reading at specific timestamps" to "GPU accelerated video loading with optimizations for reading at specific timestamps or time intervals" on Aug 17, 2024
Hi @skier233 ,
Thanks for opening this issue.
We have recently started https://github.com/pytorch/torchcodec, which is where we want to consolidate PyTorch's video decoding capabilities. The library is currently in beta, so there may still be rough edges, but once it matures we'll start deprecating the video decoders in torchvision in favor of torchcodec.
For now, torchcodec only supports CPU decoding and Linux. GPU support and macOS binaries should come soon.
If you try it, we'd love to hear any feedback you may have!
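For anyone wanting to try timestamp-based sampling with torchcodec, here is a rough sketch. The `sample_timestamps` helper is hypothetical, and the commented decoder calls are an assumption based on torchcodec's documentation; the exact API may differ from what is shown, so check the torchcodec docs:

```python
def sample_timestamps(duration_s, interval_s):
    """Timestamps 0, interval_s, 2*interval_s, ... strictly below duration_s."""
    n = int(duration_s // interval_s) + 1
    ts = [i * interval_s for i in range(n)]
    return [t for t in ts if t < duration_s]

# Hedged torchcodec usage (assumes a local "video.mp4"; API may differ):
#   from torchcodec.decoders import VideoDecoder
#   decoder = VideoDecoder("video.mp4")
#   frames = decoder.get_frames_played_at(
#       seconds=sample_timestamps(60.0, interval_s=2.0)
#   )
```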