Hello,
I’m currently exploring how to visualize heatmaps for LLaVA and other multimodal large language models, to understand what the model focuses on during text generation. I am familiar with using Grad-CAM for single-target classification tasks. However, because LLaVA generates complete sentences rather than a single class score, I’m unsure how to obtain a heatmap for each individual word. Could you provide any guidance on how to approach this?