[Bug]: Transformations behaviour possibly not expected when predicting with fresh dataloader #2254

blaz-r opened this issue Aug 18, 2024 · 4 comments
blaz-r commented Aug 18, 2024

Describe the bug

Inside the engine code, the transforms from the datamodule and dataloader are taken before the ones from the model:

```python
# get transform
if datamodule and datamodule.transform:
    # a transform passed explicitly to the datamodule takes precedence
    transform = datamodule.transform
elif dataloaders and any(getattr(dl.dataset, "transform", None) for dl in dataloaders):
    # if dataloaders are provided, we use the transform from the first dataloader that has a transform
    transform = next(dl.dataset.transform for dl in dataloaders if getattr(dl.dataset, "transform", None))
elif ckpt_path is not None:
    # if a checkpoint path is provided, we can load the transform from the checkpoint
    checkpoint = torch.load(ckpt_path, map_location=model.device)
    transform = checkpoint["transform"]
elif model.transform is None:
    # if no transform is provided, we use the default transform from the model
    image_size = datamodule.image_size if datamodule else None
    transform = model.configure_transforms(image_size)
else:
    transform = model.transform
```

This can cause problems if the datamodule was never used inside a trainer. In that case, looking at the following code, the transform returned is just a resize (since there is no trainer to take the correct model transforms from), which means the model's potential normalization is missing:

```python
def train_transform(self) -> Transform:
    """Get the transforms that will be passed to the train dataset.

    If the train_transform is not set, the engine will request the transform from the model.
    """
    if self._train_transform:
        return self._train_transform
    if getattr(self, "trainer", None) and self.trainer.model and self.trainer.model.transform:
        return self.trainer.model.transform
    if self.image_size:
        return Resize(self.image_size, antialias=True)
    return None
```

This happens only if you call setup() and the dataloader on the datamodule outside the trainer (since the trainer is not set in that case), like this:

```python
datamodule = MVTec(..)
datamodule.setup()            # this calls self.train_transform mentioned above
datamodule.test_dataloader()  # only a resize transform here
```
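
Predicting with such a fresh dataloader then looks roughly like this (a sketch of how I call it; treat the exact Engine.predict usage as my assumption, `model` stands for any anomalib model with its own transform):

```python
# Sketch of the predict call that hits the dataloader branch above (exact
# Engine.predict usage is my assumption; `model` is a placeholder anomalib model).
from anomalib.engine import Engine

engine = Engine()
predictions = engine.predict(model=model, dataloaders=datamodule.test_dataloader())
# The dataloader's dataset carries only a Resize, so that transform wins and
# model.transform (including any normalization) is never applied.
```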

There is also code below to reproduce this.

A workaround I have at the moment is doing this:

```python
dataloader.dataset.transform = None
```

This way the dataloader transforms are ignored and the ones from the model are used instead. Another possibility I see is to manually set the transforms of the datamodule with model.configure_transforms(image_size).
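
A rough sketch of both options (`model` and `datamodule` are placeholders for an anomalib model and an already set-up datamodule; this is my reading of the behaviour, not a tested recipe):

```python
# Sketch of the two workarounds mentioned above (placeholder names, untested recipe).
dataloader = datamodule.test_dataloader()

# Option 1: clear the dataloader's transform so the engine ignores it and
# falls back to the model's own transform.
dataloader.dataset.transform = None

# Option 2: set the transform explicitly to the model's default one, which
# already includes the model's normalization.
dataloader.dataset.transform = model.configure_transforms(datamodule.image_size)
```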

Dataset

N/A

Model

N/A

Steps to reproduce the behavior

```python
from anomalib.data import MVTec

#### CONFIGURE THIS ####
mvtec_path = "../datasets/MVTec"
#####

data = MVTec(root=mvtec_path, image_size=(42, 42), category="bottle", num_workers=0)
data.setup()

print(data.test_dataloader().dataset.transform)
```
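
For reference, the printed transform here is just the resize, which a quick check like this would confirm (my own sketch, assuming the 1.2.0dev defaults):

```python
# Quick check of what the print above shows (my sketch, assuming 1.2.0dev defaults):
# the fresh test dataloader only carries the Resize from the datamodule, without any
# normalization a model would add via configure_transforms().
from torchvision.transforms.v2 import Resize

transform = data.test_dataloader().dataset.transform
assert isinstance(transform, Resize), "expected only the datamodule's Resize transform"
```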

OS information

OS information:

  • OS: Win11
  • Python version: 3.10.3
  • Anomalib version: 1.2.0dev
  • PyTorch version: 2.3.0
  • CUDA/cuDNN version: x
  • GPU models and configuration: x
  • Any other relevant information: x

Expected behavior

I would expect the model transforms to take priority over the ones in the dataloader in this case, but I can see how that could cause trouble when custom transforms are set on the datamodule.

Screenshots

No response

Pip/GitHub

GitHub

What version/branch did you use?

1.2.0dev

Configuration YAML

/

Logs

/

Code of Conduct

  • I agree to follow this project's Code of Conduct
@samet-akcay

@djdameln any thoughts?


blaz-r commented Sep 25, 2024

Are there any updates on this one?

@samet-akcay

@blaz-r, not to address this one specifically, but @djdameln is working on some changes which might address this.


blaz-r commented Sep 25, 2024

Okay. I'll use the workaround for now and we'll see when those changes are added.
