
RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension for v2 transforms #8622

Open
lxr2 opened this issue Sep 2, 2024 · 5 comments

@lxr2

lxr2 commented Sep 2, 2024

🐛 Describe the bug

It seems that v2.Pad does not support cases where the padding size is greater than the image size, but v1.Pad does support this. I hope that v2.Pad will allow this in the future as well.

import torch
from torchvision.transforms import v2
import torchvision.transforms as T
from torchvision.transforms import functional as F

orig_img = torch.rand([3,32,32])
orig_img = F.to_pil_image(orig_img)

# Not supported
trans_img = v2.Compose([v2.ToImage(), T.Pad(padding=36, padding_mode='reflect')])(orig_img)

# Supported
trans_img = T.Compose([T.Pad(padding=36, padding_mode='reflect')])(orig_img)

Versions

Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35

Python version: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU
Nvidia driver version: 546.80
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
...
[conda] torch                     2.4.0                    pypi_0    pypi
[conda] torchmetrics              1.4.0.post0              pypi_0    pypi
[conda] torchvision               0.19.0                   pypi_0    pypi
[conda] triton                    3.0.0                    pypi_0    pypi
@venkatram-dev
Contributor

venkatram-dev commented Sep 2, 2024

Not sure of the reason to combine the v1 and v2 APIs in v2.Compose([v2.ToImage(), T.Pad(padding=36, ...)]).

The code below works (tested in Google Colab). Please try this.


from torchvision.transforms import v2 as T2
import torchvision.transforms.functional as F
import torch

orig_img = torch.rand([3,32,32])
orig_img = F.to_pil_image(orig_img)


# Using v2 API for padding
transform = T2.Compose([
    T2.Pad(padding=36, padding_mode='reflect'),  # Use v2.Pad directly
    #T2.ToTensor()
])

transform
# Apply transformation
trans_img = transform(orig_img)
trans_img

@lxr2
Author

lxr2 commented Sep 3, 2024

It works, but following the docs, it seems that the standard steps should include v2.ToImage() if the image is in PIL format. I am confused about this.


This is what a typical transform pipeline could look like:

from torchvision.transforms import v2
transforms = v2.Compose([
    v2.ToImage(),  # Convert to tensor, only needed if you had a PIL image
    v2.ToDtype(torch.uint8, scale=True),  # optional, most input are already uint8 at this point
    # ...
    v2.RandomResizedCrop(size=(224, 224), antialias=True),  # Or Resize(antialias=True)
    # ...
    v2.ToDtype(torch.float32, scale=True),  # Normalize expects float input
    v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@venkatram-dev
Contributor

venkatram-dev commented Sep 3, 2024

Below is my understanding; others can chime in as needed :)

Yeah, that is a good point. In my opinion, the docs should be clearer about the difference between padding a PIL image and padding a tensor.

If we look at other docs for padding, they use PIL images: https://pytorch.org/vision/main/auto_examples/transforms/plot_transforms_illustrations.html#sphx-glr-auto-examples-transforms-plot-transforms-illustrations-py

Anyway, this is my understanding:

Extra padding (a padding size greater than the image size) works on a PIL image, but it does not work on a tensor.

So, if we need extra padding, it has to be applied while the image is still a PIL image. We can then do the other tensor operations after it, as shown in the sketch below.
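
A minimal sketch of that ordering, assuming the goal is the typical v2 pipeline quoted above (the ToDtype step is illustrative and not required for the padding itself):

import torch
from torchvision.transforms import v2
from torchvision.transforms import functional as F

orig_img = F.to_pil_image(torch.rand(3, 32, 32))

transforms = v2.Compose([
    v2.Pad(padding=36, padding_mode='reflect'),  # pad while the input is still a PIL image
    v2.ToImage(),                                # convert to a tensor after padding
    v2.ToDtype(torch.float32, scale=True),       # continue with the usual tensor transforms
])

print(transforms(orig_img).shape)  # torch.Size([3, 104, 104])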

Root cause analysis:

Padding a PIL image goes through Pillow and NumPy functions, which do not check the padding size against the image dimensions:

https://github.com/pytorch/vision/blob/main/torchvision/transforms/_functional_pil.py#L144-L220

Padding a tensor goes through PyTorch code, which does strict dimension checks:

https://github.com/pytorch/pytorch/blob/d14fe3ffeddff743af09ce7c8d91127940ddf7ed/aten/src/ATen/native/ReflectionPad.cpp#L241-L249

My understanding is that PyTorch does these internal checks to prevent padding operations from exceeding the dimensions of a tensor, ensuring that all computations stay within the allocated memory bounds and avoiding errors such as crashes or data corruption.
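
A minimal sketch of that check, calling torch.nn.functional.pad directly (which, as far as I understand, is what the tensor kernel ends up hitting for 'reflect'):

import torch
import torch.nn.functional as F

x = torch.rand(1, 3, 32, 32)
pad = (36, 36, 36, 36)  # left, right, top, bottom

# 'reflect' requires each padding to be smaller than the corresponding input dimension
try:
    F.pad(x, pad, mode='reflect')
except RuntimeError as e:
    print(e)  # Padding size should be less than the corresponding input dimension ...

# 'constant' has no such restriction, so the oversized padding is accepted
print(F.pad(x, pad, mode='constant', value=0).shape)  # torch.Size([1, 3, 104, 104])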

Scenario 1: extra padding (padding size greater than image size) works on a PIL image.

from torchvision.transforms import v2 as T2
import torchvision.transforms.functional as F
import torch

orig_img = torch.rand([3,32,32])
orig_img = F.to_pil_image(orig_img)
print ('orig type',type(orig_img))
print ('orig shape',orig_img.size)


# Using v2 API for padding
transform = T2.Compose([
    T2.Pad(padding=36, padding_mode='reflect'),  # Use v2.Pad directly
    T2.ToImage(),
    #T2.ToTensor()
])

#transform
# Apply transformation
trans_img = transform(orig_img)
print('trans_img type',type(trans_img))

print('trans_img shape',trans_img.shape)
trans_img

Above code works

Scenario 2: extra padding does not work on a tensor.

from torchvision.transforms import v2 as T2
import torchvision.transforms.functional as F
import torch

orig_img = torch.rand([3,32,32])
orig_img = F.to_pil_image(orig_img)
print ('orig type',type(orig_img))
print ('orig shape',orig_img.size)

# Using v2 API for padding
transform = T2.Compose([
    T2.ToImage(), 
    T2.Pad(padding=36, padding_mode='reflect'),  # Use v2.Pad directly
])

#transform
# Apply transformation
trans_img = transform(orig_img)
trans_img.shape
print('trans_img type',type(trans_img))

print('trans_img shape',trans_img.shape)
trans_img

RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (36, 36) at dimension 3 of input [1, 3, 32, 32]

from torchvision.transforms import v2 as T2
import torch

# Create a random image tensor
orig_img = torch.rand([3, 32, 32])  # This is a tensor
print ('orig type',type(orig_img))
print ('orig shape',orig_img.shape)


# Define a transformation pipeline with v2 API
transform = T2.Compose([
    T2.Pad(padding=36, padding_mode='reflect'),  # Check if T2.Pad accepts tv_tensors.Image
])

# Apply the transformation
trans_img = transform(orig_img)
print('trans_img type',type(trans_img))

print('trans_img shape',trans_img.shape)
trans_img

RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (36, 36) at dimension 3 of input [1, 3, 32, 32]


from torchvision.transforms import v2 as T2
import torch

# Create a random image tensor
orig_img = torch.rand([3, 32, 32])  # This is a tensor
print ('orig type',type(orig_img))
print ('orig shape',orig_img.shape)


# Define a transformation pipeline with v2 API
transform = T2.Compose([
    T2.ToImage(),  # Convert tensor to tv_tensors.Image
    T2.Pad(padding=36, padding_mode='reflect'),  # Check if T2.Pad accepts tv_tensors.Image
])

# Apply the transformation
trans_img = transform(orig_img)
print('trans_img type',type(trans_img))

print('trans_img shape',trans_img.shape)
trans_img

RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (36, 36) at dimension 3 of input [1, 3, 32, 32]

Scenario 3: padding with a size less than the input dimension works on a tensor.

from torchvision.transforms import v2 as T2
import torchvision.transforms.functional as F
import torch

orig_img = torch.rand([3,32,32])
orig_img = F.to_pil_image(orig_img)
print ('orig type',type(orig_img))
print ('orig shape',orig_img.size)

# Using v2 API for padding
transform = T2.Compose([
    T2.ToImage(), 
    T2.Pad(padding=30, padding_mode='reflect'),  # Use v2.Pad directly
])


#transform
# Apply transformation
trans_img = transform(orig_img)
trans_img.shape
print('trans_img type',type(trans_img))

print('trans_img shape',trans_img.shape)
trans_img

Above code works


from torchvision.transforms import v2 as T2
import torch

# Create a random image tensor
orig_img = torch.rand([3, 32, 32])  # This is a tensor
print ('orig type',type(orig_img))
print ('orig shape',orig_img.shape)


# Define a transformation pipeline with v2 API
transform = T2.Compose([
    T2.Pad(padding=31, padding_mode='reflect'),  # Check if T2.Pad accepts tv_tensors.Image
])

# Apply the transformation
trans_img = transform(orig_img)
print('trans_img type',type(trans_img))

print('trans_img shape',trans_img.shape)
trans_img

Above code works

@lxr2
Author

lxr2 commented Sep 3, 2024

Many thanks, very clear explanations and instructions!

@NicolasHug
Member

Thanks for the report @lxr2, and @venkatram-dev for the help.

Just to summarize: this isn't a v1 vs v2 issue. This is a difference in behavior between the PIL backend and the tensor backend (and this difference can be observed on both v1 and v2).

PIL supports padding sizes larger than the image dimensions, while torchvision / pytorch doesn't.

simple reproducer:

import torch
from torchvision.transforms import functional as F
from torchvision.transforms.v2 import functional as F2

t = torch.rand(3,32,32)
pil_img = F.to_pil_image(t)

padding = 31  # fails for 32+ on tensors

trans_img = F.pad(pil_img, padding=padding, padding_mode='reflect')
print(trans_img.size)
trans_img = F2.pad(pil_img, padding=padding, padding_mode='reflect')
print(trans_img.size)
trans_img = F.pad(t, padding=padding, padding_mode='reflect')
print(trans_img.shape)
trans_img = F2.pad(t, padding=padding, padding_mode='reflect')
print(trans_img.shape)

Unfortunately, this isn't something we can directly address in torchvision, because the behavior is dictated by torch's pad. Note that there are similar discussions in pytorch/pytorch#18413 but at the time, it was suggested that the existing torch behavior is expected.
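
One possible workaround (just a sketch, not a torchvision API): the PIL code path linked above appears to fall back on np.pad for 'reflect', which reflects repeatedly and therefore accepts pad widths larger than the axis size, so the same trick can be applied to a tensor through NumPy:

import numpy as np
import torch

t = torch.rand(3, 32, 32)
padding = 36  # larger than the 32-pixel spatial dims

# np.pad's 'reflect' mode reflects repeatedly, so oversized pad widths are accepted;
# only the two spatial axes are padded here.
padded = np.pad(t.numpy(), ((0, 0), (padding, padding), (padding, padding)), mode='reflect')
padded_t = torch.from_numpy(padded)
print(padded_t.shape)  # torch.Size([3, 104, 104])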
