Hi. I am having trouble running Grad-CAM on my custom VitModel. I followed the Vision Transformer tutorial here and tried to adapt it to my model. I managed to get DFF working, but Grad-CAM throws an error. The model structure in the tutorial is different, so I am unsure whether I am choosing the wrong layer or whether something is wrong with the input tensor. I also tried the solution here, but it doesn't work. I am using google/vit-base-patch16-224-in21k.
Thank you.
This is the error:
An exception occurred in CAM with block: <class 'numpy.exceptions.AxisError'>. Message: axis 2 is out of bounds for array of dimension 0
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[149], line 27
17 tensor_resized1 = tensor_resized1
19 display(Image.fromarray(run_dff_on_image(model=vit_model.model,
20 target_layer=target_layer_dff,
21 classifier=vit_model.classifier,
(...)
25 n_components=3,
26 top_k=3)))
---> 27 display(Image.fromarray(run_grad_cam_on_image(model=vit_model.model,
28 target_layer=target_layer_gradcam,
29 targets_for_gradcam=targets_for_gradcam,
30 input_tensor=tensor_resized1,
31 input_image=image_resized1,
32 reshape_transform=reshape_transform_vit_huggingface)))
33 print_top_categories(model, tensor_resized1)
File ~/anaconda3/envs/env-pytorch/lib/python3.10/site-packages/PIL/Image.py:3119, in fromarray(obj, mode)
3072 def fromarray(obj, mode=None):
3073 """
3074 Creates an image memory from an object exporting the array interface
3075 (using the buffer protocol)::
(...)
3117 .. versionadded:: 1.1.6
3118 """
-> 3119 arr = obj.__array_interface__
3120 shape = arr["shape"]
3121 ndim = len(shape)
AttributeError: 'NoneType' object has no attribute '__array_interface__'
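The two messages above look related: the first line is printed rather than raised, and the later AttributeError says the helper returned None. One plausible reading (an assumption, not confirmed from the library source here) is that the CAM object is used as a context manager that suppresses the AxisError, so the function falls through without returning an image. A minimal sketch of that mechanism in plain Python, with the hypothetical names `SwallowingContext` and `run_in_context`:

```python
class SwallowingContext:
    """Hypothetical stand-in for a context manager that suppresses IndexError."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_tb):
        if isinstance(exc_value, IndexError):
            print(f"An exception occurred in block: {exc_type}. Message: {exc_value}")
            return True  # a truthy return value suppresses the exception
        return False


def run_in_context():
    with SwallowingContext():
        values = []
        result = values[2]  # raises IndexError inside the with block
        return result       # never reached
    # After a suppressed exception the function falls through and returns None.


output = run_in_context()
print(output is None)
```

If this is what happens, the AttributeError from `Image.fromarray` is only a follow-on symptom, and the AxisError in the first line is the failure to investigate.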
Code for grad-cam:
def reshape_transform(tensor, height=14, width=14):
    result = tensor[:, 1:, :].reshape(tensor.size(0),
                                      height, width, tensor.size(2))
    # Bring the channels to the first dimension,
    # like in CNNs.
    result = result.transpose(2, 3).transpose(1, 2)
    return result

target_layer_dff = vit_model.model.layernorm
target_layer_gradcam = vit_model.model.encoder.layer[-1].layernorm_before

image_resized1 = pil_img.resize((224, 224))
tensor_resized1 = transforms.ToTensor()(image_resized1)
tensor_resized1 = tensor_resized1

display(Image.fromarray(run_dff_on_image(model=vit_model.model,
                                         target_layer=target_layer_dff,
                                         classifier=vit_model.classifier,
                                         img_pil=image_resized1,
                                         img_tensor=tensor_resized1,
                                         reshape_transform=reshape_transform,
                                         n_components=3,
                                         top_k=3)))
display(Image.fromarray(run_grad_cam_on_image(model=vit_model.model,
                                              target_layer=target_layer_gradcam,
                                              targets_for_gradcam=targets_for_gradcam,
                                              input_tensor=tensor_resized1,
                                              input_image=image_resized1,
                                              reshape_transform=reshape_transform)))
print_top_categories(model, tensor_resized1)
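For reference, the `height=14, width=14` defaults in `reshape_transform` follow from the patch arithmetic of a patch16, 224x224 ViT. The sketch below only checks that arithmetic. The helper name `vit_token_grid` is hypothetical, and it assumes a square input, non-overlapping patches, and a single class token at position 0 (the token that `tensor[:, 1:, :]` drops).

```python
def vit_token_grid(image_size: int, patch_size: int, num_prefix_tokens: int = 1):
    """Return (height, width, num_tokens) for a square ViT input.

    Assumes non-overlapping square patches and `num_prefix_tokens`
    extra tokens (for example a class token) before the patch tokens.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    num_tokens = side * side + num_prefix_tokens
    return side, side, num_tokens


height, width, num_tokens = vit_token_grid(image_size=224, patch_size=16)
print(height, width, num_tokens)  # 14 14 197
```

Under these assumptions the target layer's output should have shape (batch, 197, hidden_size). If the sequence length reaching `reshape_transform` differs, the reshape to (batch, 14, 14, hidden_size) cannot succeed.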