Using ViTMatte-B for inference on Distinctions-646 runs out of memory on an A100 (80G) #10

Open
tenzinOvO opened this issue Jul 19, 2023 · 9 comments

@tenzinOvO

No description provided.

@JingfengYao
Member

We use a grid-sampling strategy to reduce the computational burden at inference. Please see the last section of our paper for details. The code can be found here.
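To get a rough feel for why splitting the token grid helps, here is a back-of-the-envelope sketch. It assumes that "grid sample" means dividing the token map into `grid x grid` sub-grids that are attended independently; the paper's last section remains the authoritative description. The patch size of 16, the 12 heads, and fp32 scores are assumptions typical of a ViT-B backbone, and `attention_scores_gib` is a hypothetical helper, not part of the repo.

```python
def attention_scores_gib(height, width, patch=16, heads=12, bytes_per_el=4, grid=1):
    """Estimate the memory of the attention score matrices in one global-attention layer.

    All defaults are assumptions (ViT-B-like backbone, fp32 scores).
    """
    tokens = (height // patch) * (width // patch)
    groups = grid * grid                   # number of independent sub-grids
    tokens_per_group = tokens // groups
    # each group builds its own (tokens_per_group x tokens_per_group) matrix per head
    elements = groups * heads * tokens_per_group ** 2
    return elements * bytes_per_el / 2**30


full = attention_scores_gib(2048, 2048)
sampled = attention_scores_gib(2048, 2048, grid=2)
print(f"{full:.2f} GiB -> {sampled:.2f} GiB")  # prints "12.00 GiB -> 3.00 GiB"
```

The score matrix grows with the square of the token count, so a `grid` factor of g cuts this term by roughly g squared. Activations, weights, and the decoder are not counted here.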

@tenzinOvO
Author

tenzinOvO commented Jul 19, 2023

Thanks for your response. So if I want to reproduce the Distinctions-646 results reported in the paper, I need to replace the vit.py in ViTMatte with the vit.py in MatteAnything?

@JingfengYao
Member

Yes. Alternatively, you can replace only the forward function in the Block. BTW, when you reproduce the results on Distinctions-646, they will depend on the trimaps you use.
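For readers who want to see what a grid-sampled forward pass might look like before diffing the two vit.py files, here is a minimal sketch. It is not the MatteAnything code: `grid_partition`, `grid_merge`, and `grid_sampled` are hypothetical names, the strided layout is an assumption, and `attn` stands for any callable that maps a `(B, H, W, C)` token map to a tensor of the same shape. H and W must be divisible by `g`, so padding would be needed otherwise.

```python
import torch


def grid_partition(x, g):
    """(B, H, W, C) -> (B*g*g, H//g, W//g, C), one strided sub-grid per batch entry."""
    B, H, W, C = x.shape
    x = x.reshape(B, H // g, g, W // g, g, C)
    # move the two within-cell offsets next to the batch axis
    return x.permute(0, 2, 4, 1, 3, 5).reshape(B * g * g, H // g, W // g, C)


def grid_merge(x, g, B):
    """Inverse of grid_partition: put every token back at its original position."""
    _, h, w, C = x.shape
    x = x.reshape(B, g, g, h, w, C)
    return x.permute(0, 3, 1, 4, 2, 5).reshape(B, h * g, w * g, C)


def grid_sampled(attn, x, g=2):
    """Run `attn` on each sub-grid separately instead of on the full token map."""
    B = x.shape[0]
    return grid_merge(attn(grid_partition(x, g)), g, B)


x = torch.randn(1, 8, 12, 4)
roundtrip = grid_merge(grid_partition(x, 2), 2, B=1)
print(torch.equal(roundtrip, x))
```

Because the sub-grids are folded into the batch dimension, the attention module itself does not need to change. Only the call site inside the block's forward function would.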

@tenzinOvO
Author

Thanks for the reminder. I was curious whether the pseudo trimaps in MatteAnything (Tables 1, 2, 3, and 4) were obtained through a real user study or through a code-based simulation of user interactions derived from the ground truth.

@JingfengYao
Member

It's a real user study.

@shiwanlin

shiwanlin commented Oct 25, 2023

The matting outcome is markedly better than MatteFormer's, and the big model is also visibly better than the small one. However, the memory requirement for inference is also markedly larger. With the forward function in vit.py from this repo, it is 16x more for the big model and 8x more for the small model. After changing the forward function to the one in MatteAnything, the big model still demands more memory than MatteFormer: MatteFormer can run inference on the same (hi-res) image on the same machine with about 80 GiB of GPU memory, while ViTMatte cannot even start the job (see the error message below). Any other suggestions for further reducing the memory footprint?

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 53.78 GiB (GPU 0; 79.10 GiB total capacity; 68.61 GiB already allocated; 8.59 GiB free; 69.09 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
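The message itself hints at whether `max_split_size_mb` is worth trying. The sketch below just pulls the numbers out of the text quoted above and compares them; the helper `gib` and the variable names are made up for illustration.

```python
import re

msg = (
    "CUDA out of memory. Tried to allocate 53.78 GiB (GPU 0; 79.10 GiB total capacity; "
    "68.61 GiB already allocated; 8.59 GiB free; 69.09 GiB reserved in total by PyTorch)"
)


def gib(pattern, text):
    """Return the first number captured by `pattern`, as a float (in GiB)."""
    return float(re.search(pattern, text).group(1))


requested = gib(r"allocate ([\d.]+) GiB", msg)
allocated = gib(r"([\d.]+) GiB already allocated", msg)
free = gib(r"([\d.]+) GiB free", msg)
reserved = gib(r"([\d.]+) GiB reserved", msg)

slack = reserved - allocated     # memory cached by PyTorch but not in use
shortfall = requested - free     # how far the single request overshoots

print(f"cached but unused: {slack:.2f} GiB")
print(f"request exceeds free memory by {shortfall:.2f} GiB")
```

Here reserved memory is only about half a GiB above allocated memory, so fragmentation is probably not the issue and `max_split_size_mb` is unlikely to help. The single 53.78 GiB request is simply far larger than what is left on the card.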

@JingfengYao
Member

ViTMatte's high memory requirement comes mainly from the attention mechanism in the ViT backbone. I would try replacing the original attention in ViT with memory-efficient attention or flash attention to further reduce the computational burden. (NOTE: Using a different attention implementation at inference may degrade performance because of the inconsistency between training and inference.)
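One way to experiment with this, assuming PyTorch 2.0 or newer, is `torch.nn.functional.scaled_dot_product_attention`, which can dispatch to FlashAttention or memory-efficient kernels on supported GPUs. The sketch below only compares it with a plain softmax attention on random tensors; it is not wired into ViTMatte. The shapes are arbitrary, and `naive_attention` and `fused_attention` are illustrative names. If the backbone adds a relative-position bias to the attention logits, that bias would have to be passed through `attn_mask`, which is itself an N x N tensor and may rule out the fastest kernels.

```python
import torch
import torch.nn.functional as F


def naive_attention(q, k, v):
    """Plain attention: materializes the full (N, N) score matrix for every head."""
    scores = (q @ k.transpose(-2, -1)) * q.shape[-1] ** -0.5
    return scores.softmax(dim=-1) @ v


def fused_attention(q, k, v):
    """Same math through PyTorch's fused entry point (default scale is 1/sqrt(head_dim))."""
    return F.scaled_dot_product_attention(q, k, v)


# (batch, heads, tokens, head_dim); sizes chosen only to keep the check fast
q, k, v = (torch.randn(1, 12, 64, 64) for _ in range(3))
same = torch.allclose(naive_attention(q, k, v), fused_attention(q, k, v), atol=1e-4)
print(same)
```

On a CPU this only checks that the two paths agree numerically. Any memory saving shows up on a GPU where a fused kernel is actually selected.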

@felix-ky

Can you share the Distinctions-646 dataset? For some reason, the author is not reachable right now.

@almorozovv


5 participants