I am trying to run inference with llama2+13b on 4 RTX 3090s, each with 24 GB of memory. However, when I use the sample inference code, it only uses one GPU, which causes an out-of-memory error. Any suggestions? (I have tried using accelerate, but it didn't work; I suspect I was using it incorrectly.)
Thanks a lot!
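(For reference, a minimal sketch of what splitting the model across all four GPUs can look like with Hugging Face transformers' device_map="auto". The checkpoint name and the generation call below are assumptions for illustration, not the repo's actual sample inference code.)

```python
# Minimal sketch: shard a 13B model across all visible GPUs via transformers/accelerate.
# The model id below is an assumption; substitute whatever checkpoint the sample code loads.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-13b-hf"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 so the 13B weights (~26 GB) fit across 4x24 GB cards
    device_map="auto",          # let accelerate place layers on all available GPUs
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In fp16 the 13B weights are roughly 26 GB, so they cannot fit on a single 24 GB card but split comfortably across four.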
yunbinmo changed the title from "How could I utilise multiple GPUs" to "Using multiple GPUs for inference" on Jul 6, 2024.
Hi, I am aware that you also have a vlm-evaluation repo, but it seems to support only a fixed set of datasets, while I want to evaluate on my own datasets. Could you advise how to do that on multiple GPUs using the scripts given in the README?
I have tried running accelerate config followed by accelerate launch --num_processes=4 infer.py, and I have also set export CUDA_VISIBLE_DEVICES=0,1,2,3, but I get the following error:
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
Root Cause (first observed failure):
[0]:
time : 2024-07-13_21:21:29
host : xxx
rank : 3 (local_rank: 3)
exitcode : -9 (pid: xxx)
error_file: <N/A>
traceback : Signal 9 (SIGKILL) received by PID xxx
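For context: exitcode -9 means the process was killed with SIGKILL, which on Linux usually points to the kernel's out-of-memory killer rather than a CUDA error. If infer.py loads the model in the usual single-device way, accelerate launch --num_processes=4 starts four ranks that each load the full checkpoint, multiplying host-RAM and GPU-memory pressure instead of sharding the model. A single-process sketch that shards the weights and caps per-GPU usage instead (the checkpoint name and the 22 GiB limits are assumptions):

```python
# Sketch: one process, weights sharded across the four GPUs with explicit
# per-GPU memory caps. Checkpoint name and limits are assumptions.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",                 # assumed checkpoint
    torch_dtype=torch.float16,
    device_map="auto",                           # place layers on cuda:0..3 automatically
    max_memory={i: "22GiB" for i in range(4)},   # leave headroom on each 24 GB card
    low_cpu_mem_usage=True,                      # stream weights instead of building a full in-RAM copy
)
```

A script loading the model this way can be run with plain python infer.py; accelerate launch with multiple processes is meant for data-parallel setups where each rank holds its own full copy of the model.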