
performance degradation in to_pil_image after v0.17 #8669

Open

seymurkafkas opened this issue Oct 2, 2024 · 5 comments

Comments

@seymurkafkas

seymurkafkas commented Oct 2, 2024

🐛 Describe the bug

Based on my benchmarks (serializing 360 images), torchvision.transforms.functional.to_pil_image is much slower at converting torch.float16 image tensors to PIL Images:

Dependencies:

Python 3.11
Pillow 10.4.0

Before (torch 2.0.1, torchvision v0.15.2): 23 seconds
After (torch 2.2.0, torchvision v0.17): 53 seconds

How to reproduce:

import time

import torch
from torchvision.transforms.functional import to_pil_image

rand_img_tensor = torch.rand(3, 512, 512, dtype=torch.float16)

start_time = time.time()
for _ in range(50):
    pil_img = to_pil_image(rand_img_tensor)

end_time = time.time()
print(end_time - start_time)  # seconds

Run the script above with each set of dependencies listed, and the time difference is apparent.

The cause seems to be this PR

@seymurkafkas
Author

seymurkafkas commented Oct 2, 2024

Most of the extra time is spent on this line:

    if np.issubdtype(npimg.dtype, np.floating) and mode != "F":
        npimg = (npimg * 255).astype(np.uint8)

I think it's due to the multiplication using NumPy primitives rather than torch (and also astype instead of torch.Tensor.byte()).
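A possible fix along those lines, sketched below (hypothetical helper name, assuming a CHW float tensor with values in [0, 1]), would do the scale-and-cast with torch ops before handing the data to NumPy/PIL:

import torch
import numpy as np

def float_chw_to_uint8_hwc(pic: torch.Tensor) -> np.ndarray:
    # Hypothetical sketch, not the current torchvision implementation:
    # scale and cast on the torch side (fast for float16), and only
    # convert to a NumPy array once the data is already uint8.
    out = (pic * 255).byte()  # torch equivalent of (npimg * 255).astype(np.uint8)
    return out.permute(1, 2, 0).contiguous().numpy()  # CHW -> HWC for PIL

On the user side, the same idea works as a workaround today: passing (img * 255).byte() to to_pil_image skips the slow float branch entirely.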

@NicolasHug
Member

Thanks for the report @seymurkafkas .

> I think it's due to the multiplication using NumPy primitives rather than torch (and also astype instead of torch.Tensor.byte()).

Ah, if that's the case then the fix might be non-trivial, since it means we'd have to go from unified NumPy logic to unified PyTorch logic. I'm happy to consider a PR if we can keep the code simple enough.

Out of curiosity, why do you need to convert tensors back to PIL, and more specifically, why do you need that part to be fast?

Out of curiosity, why do you need to convert tensors back to PIL, and more specifically, why do you need that part to be fast?

@seymurkafkas
Author

seymurkafkas commented Oct 25, 2024

> Thanks for the report @seymurkafkas .
>
> Ah, if that's the case then the fix might be non-trivial, since it means we'd have to go from unified NumPy logic to unified PyTorch logic. I'm happy to consider a PR if we can keep the code simple enough.

Thanks for the response! I will take a look and submit a PR if possible.

> why do you need to convert tensors back to PIL and why do you need that part to be fast?

This is to reduce inference costs for our ML app; less time spent on serialization means more GPU utilization. We convert to PIL because our pipeline uses it before serializing to disk.

@NicolasHug
Member

Thanks for replying! In case it's helpful: you may be able to use the encode_jpeg() or encode_png() utilities of torchvision! https://pytorch.org/vision/stable/io.html#image-encoding
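A minimal sketch of that route (arbitrary filename; encode_png and write_png expect a uint8 CHW tensor on the CPU):

import torch
from torchvision.io import encode_png, write_png

# Scale and cast the float16 image first, since the io utilities
# take uint8 CHW tensors.
img = (torch.rand(3, 512, 512, dtype=torch.float16) * 255).byte()

write_png(img, "out.png")  # encode and write to disk in one call
data = encode_png(img)     # or keep the encoded bytes as a 1-D uint8 tensor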

@seymurkafkas
Author

> Thanks for replying! In case it's helpful: you may be able to use the encode_jpeg() or encode_png() utilities of torchvision! https://pytorch.org/vision/stable/io.html#image-encoding

Thanks a lot for the tip :) I will experiment with those too.
