One question about running inference #40

Li-private · 2024-07-02T08:15:05Z

Thanks for your great work. I'm not very familiar with this, so what is the 'hf_token' in your inference code, and how can I get it?

yunbinmo · 2024-07-06T13:41:31Z

You can create a token at https://huggingface.co/settings/tokens, save it in a file named hf_token.txt, and then replace .hf_token in the code with that file name.
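For reference, here is a minimal sketch of reading the token from that file in Python. The helper name `read_hf_token` and the default path are my own assumptions for illustration, not something taken from the repository's code.

```python
from pathlib import Path


def read_hf_token(token_path: str = "hf_token.txt") -> str:
    """Return the Hugging Face access token stored in a plain-text file.

    Hypothetical helper: the name and default path are illustrative only.
    """
    path = Path(token_path)
    if not path.is_file():
        raise FileNotFoundError(f"Token file not found: {path}")

    # Strip surrounding whitespace, including the trailing newline
    # that most editors append when saving the file.
    token = path.read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"Token file is empty: {path}")
    return token
```

You would then call `hf_token = read_hf_token("hf_token.txt")` and pass the result wherever the inference script expects `hf_token`. Keep the file out of version control, since the token grants access to your account.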

Li-private · 2024-07-06T13:58:46Z

Thank you very much. I tried running inference with llama2+7b on a single A100 with 80 GB of memory. With the inference code from GitHub, however, I noticed that loading the pre-trained weights for SigLIP, DINOv2, llama2-7b, and this model is quite time-consuming, especially the llama2-7b weights. Do you have any good solutions?

yunbinmo · 2024-07-06T14:00:42Z

I noticed that too. I am using an RTX 3090 with 24 GB of memory, and it is pretty slow for me as well.

Li-private · 2024-07-06T14:06:45Z

It's so sad.

Li-private closed this as completed Jul 3, 2024

Li-private reopened this Jul 3, 2024
