Using multiple GPUs for inference #43

Open

yunbinmo opened this issue Jul 6, 2024 · 1 comment

Comments


yunbinmo commented Jul 6, 2024

Hi,

I am trying to run inference with llama2+13b. I have 4 RTX 3090s, each with 24 GB of memory, but I noticed that the sample inference code only uses one GPU, which causes an out-of-memory error. Do you have any suggestions? (I have tried using accelerate, but it didn't work; I suspect I was using it incorrectly.)

Thanks a lot!
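
One common way to avoid the single-GPU OOM is to shard the model itself across all visible GPUs at load time with Accelerate's device_map="auto". A minimal sketch, assuming the checkpoint can be loaded through transformers' AutoModelForCausalLM (the repo's own loading helper may expose a different interface, and the model name below is only illustrative):

```python
# Minimal sketch: shard one model across all visible GPUs via device_map="auto".
# Assumes `transformers` and `accelerate` are installed; the checkpoint name is
# a placeholder, not necessarily what this repo loads under the hood.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-13b-hf"  # illustrative checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # fp16 keeps a 13B model around ~26 GB in total
    device_map="auto",           # splits layers across the 4 visible GPUs
)

inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With this approach the script is run as a single plain process (no accelerate launch needed), since one model instance spans all four cards.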

yunbinmo changed the title from "How could I utilise multiple GPUs" to "Using multiple GPUs for inference" on Jul 6, 2024

yunbinmo commented Jul 13, 2024

Hi, I am aware that you also have a vlm-evaluation repo, but it seems to support only a fixed set of datasets, while I want to evaluate on my own datasets. Could you advise how to do that on multiple GPUs using the scripts given in the README?

I have tried accelerate config followed by accelerate launch --num_processes=4 infer.py, and I have also set export CUDA_VISIBLE_DEVICES=0,1,2,3, but I get the following error:

torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
Root Cause (first observed failure):
[0]:
  time      : 2024-07-13_21:21:29
  host      : xxx
  rank      : 3 (local_rank: 3)
  exitcode  : -9 (pid: xxx)
  error_file: <N/A>
  traceback : Signal 9 (SIGKILL) received by PID xxx
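
For what it's worth, exit code -9 (SIGKILL) usually means the operating system's OOM killer terminated a worker, which can easily happen when accelerate launch --num_processes=4 starts four processes that each materialize a full copy of the checkpoint in CPU RAM. If the goal is to split a custom dataset across the four GPUs rather than to shard one model, a data-parallel sketch like the following might work (assumptions: the model loads through plain transformers rather than the repo's own loader, bitsandbytes is installed so a 13B model fits on a single 24 GB card in 8-bit, and the prompt list stands in for your dataset):

```python
# Minimal sketch of data-parallel evaluation on a custom dataset: each of the
# four processes started by `accelerate launch --num_processes=4 infer.py`
# owns one GPU and a slice of the examples. The checkpoint name and prompt
# list are illustrative placeholders, not the repo's evaluation interface.
import torch
from accelerate import PartialState
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

state = PartialState()
model_id = "meta-llama/Llama-2-13b-hf"  # illustrative checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # ~13 GB per replica
    device_map={"": state.process_index},  # pin each replica to its own GPU
)
model.eval()

prompts = ["example 1", "example 2", "example 3", "example 4"]  # your own dataset here

# Each process receives roughly len(prompts) / num_processes examples.
with state.split_between_processes(prompts) as shard:
    for prompt in shard:
        inputs = tokenizer(prompt, return_tensors="pt").to(state.device)
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=32)
        print(f"[rank {state.process_index}] {tokenizer.decode(out[0], skip_special_tokens=True)}")
```

Loading the quantized weights directly onto each process's GPU also keeps the peak CPU-RAM footprint down, which is what the SIGKILL above most likely points to.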
