I am trying to run inference with llama2+13b on 4 RTX 3090s, each with 24 GB of memory. However, when I use the sample inference code, it only uses one GPU, which causes an out-of-memory error. Any suggestions? (I have tried using accelerate, but it didn't work; I suspect I was using it incorrectly.)
Thanks a lot!
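(For reference, a minimal sketch of what splitting the model across all four GPUs can look like with Hugging Face transformers' device_map="auto". The checkpoint name and the generation call below are assumptions for illustration, not the repo's actual sample inference code.)

```python
# Minimal sketch: shard a 13B model across all visible GPUs via transformers/accelerate.
# The model id below is an assumption; substitute whatever checkpoint the sample code loads.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-13b-hf"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 so the 13B weights (~26 GB) fit across 4x24 GB cards
    device_map="auto",          # let accelerate place layers on all available GPUs
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In fp16 the 13B weights are roughly 26 GB, so they cannot fit on a single 24 GB card but split comfortably across four.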
yunbinmo changed the title from "How could I utilise multiple GPUs" to "Using multiple GPUs for inference" on Jul 6, 2024.
Hi, I am aware that you also have a vlm-evaluation repo, but it seems to support only a fixed set of datasets, while I want to evaluate on my own datasets. Could you advise how to do that on multiple GPUs using the scripts given in the README?
I have tried running accelerate config followed by accelerate launch --num_processes=4 infer.py, and I have also set export CUDA_VISIBLE_DEVICES=0,1,2,3, but I get the following error:
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
Root Cause (first observed failure):
[0]:
time : 2024-07-13_21:21:29
host : xxx
rank : 3 (local_rank: 3)
exitcode : -9 (pid: xxx)
error_file: <N/A>
traceback : Signal 9 (SIGKILL) received by PID xxx
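For context: exitcode -9 means the process was killed with SIGKILL, which on Linux usually points to the kernel's out-of-memory killer rather than a CUDA error. If infer.py loads the model in the usual single-device way, accelerate launch --num_processes=4 starts four ranks that each load the full checkpoint, multiplying host-RAM and GPU-memory pressure instead of sharding the model. A single-process sketch that shards the weights and caps per-GPU usage instead (the checkpoint name and the 22 GiB limits are assumptions):

```python
# Sketch: one process, weights sharded across the four GPUs with explicit
# per-GPU memory caps. Checkpoint name and limits are assumptions.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",                 # assumed checkpoint
    torch_dtype=torch.float16,
    device_map="auto",                           # place layers on cuda:0..3 automatically
    max_memory={i: "22GiB" for i in range(4)},   # leave headroom on each 24 GB card
    low_cpu_mem_usage=True,                      # stream weights instead of building a full in-RAM copy
)
```

A script loading the model this way can be run with plain python infer.py; accelerate launch with multiple processes is meant for data-parallel setups where each rank holds its own full copy of the model.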