Model not able to quantize #1354

Open

alielfilali01 opened this issue Sep 12, 2024 · 0 comments

### System Info

- Accelerate version: 0.34.2
- Platform: Linux-5.15.0-91-generic-x86_64-with-glibc2.35
- accelerate bash location: ~/miniconda3/envs/trl/bin/accelerate
- Python version: 3.10.14
- Numpy version: 2.1.1
- PyTorch version (GPU?): 2.4.1+cu121 (False)
- PyTorch XPU available: False
- PyTorch NPU available: False
- PyTorch MLU available: False
- PyTorch MUSA available: False
- System RAM: 377.69 GB
- Accelerate default config:
  Not found
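
One thing worth flagging in the environment report above: `PyTorch version (GPU?): 2.4.1+cu121 (False)` means torch did not see a CUDA device when `accelerate env` was run (possibly because it ran on a login node rather than a GPU node). bitsandbytes' 8-bit kernels require a visible CUDA GPU, so a quick sanity check on the actual training node (my sketch, not part of the original report) is:

```python
# Sanity check that the environment bitsandbytes will run in actually sees a GPU.
import torch

print(torch.cuda.is_available())   # must be True for --load_in_8bit to work
print(torch.cuda.device_count())   # GPUs visible to this process
if torch.cuda.is_available():
    # Older bitsandbytes releases only supported int8 matmul on
    # compute capability >= 7.5 (Turing and newer).
    print(torch.cuda.get_device_capability(0))
```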

### Reproduction

```bash
#!/bin/bash

conda activate trl
cd trl-test/
pip install huggingface-hub accelerate
huggingface-cli login --token hf_xxx
git clone https://github.com/huggingface/trl.git

#### Multi GPU
yes "y" | ACCELERATE_LOG_LEVEL=info accelerate launch \
  --config_file ./accelerate_configs/multi_gpu.yaml \
  trl/examples/scripts/sft.py \
  --model_name_or_path inceptionai/Jais-family-256m \
  --trust_remote_code \
  --dataset_name AbderrahmanSkiredj1/ArQuAD_train14k_test_1k6 \
  --dataset_text_field context \
  --output_dir ./jais256m-sft-ArQuAD \
  --load_in_8bit true \
  --use_peft true \
  --lora_r 16 \
  --lora_alpha 32 \
  --lora_dropout 0.05 \
  --lora_target_modules "all-linear" \
  --lora_task_type 'CAUSAL_LM' \
  --learning_rate 3e-5 \
  --per_device_train_batch_size 1 \
  --gradient_accumulation_steps 4
```
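
To isolate whether the failure comes from TRL or from the 8-bit load itself, here is a minimal sketch (not part of the original run; same model, plain transformers API) that exercises the same `MatMul8bitLt` path directly:

```python
# Minimal 8-bit load + forward pass, bypassing TRL and accelerate entirely.
# A sketch to isolate the bitsandbytes failure, not the original training script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "inceptionai/Jais-family-256m"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

inputs = tokenizer("a short test prompt", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model(**inputs)  # the cuBLAS failure below fires inside this matmul
print(out.logits.shape)
```

If this snippet reproduces the same error, the problem is in the bitsandbytes/quantization layer rather than in the SFT script.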

This gives the following error:

```text
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
cuBLAS API failed with status 15
error detected/nfs_users/users/ali.filali/miniconda3/envs/trl/lib/python3.10/site-packages/bitsandbytes/autograd/_functions.py:316: UserWarning: MatMul8bitLt: inputs will be cast from torch.float32 to float16 during quantization
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
A: torch.Size([263, 1088]), B: torch.Size([3264, 1088]), C: (263, 3264); (lda, ldb, ldc): (c_int(8416), c_int(104448), c_int(8416)); (m, n, k): (c_int(263), c_int(3264), c_int(1088))
cuBLAS API failed with status 15
error detectedcuBLAS API failed with status 15
error detectedA: torch.Size([195, 1088]), B: torch.Size([3264, 1088]), C: (195, 3264); (lda, ldb, ldc): (c_int(6240), c_int(104448), c_int(6240)); (m, n, k): (c_int(195), c_int(3264), c_int(1088))
A: torch.Size([216, 1088]), B: torch.Size([3264, 1088]), C: (216, 3264); (lda, ldb, ldc): (c_int(6912), c_int(104448), c_int(6912)); (m, n, k): (c_int(216), c_int(3264), c_int(1088))
cuBLAS API failed with status 15
error detectedA: torch.Size([125, 1088]), B: torch.Size([3264, 1088]), C: (125, 3264); (lda, ldb, ldc): (c_int(4000), c_int(104448), c_int(4000)); (m, n, k): (c_int(125), c_int(3264), c_int(1088))
[rank3]: Traceback (most recent call last):
```
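
For context (my note, not from the log itself): cuBLAS status 15 is `CUBLAS_STATUS_NOT_SUPPORTED`, i.e. the int8 GEMM that `MatMul8bitLt` dispatches is not supported by the cuBLAS build or the GPU in use. Recent bitsandbytes releases ship a self-diagnostic that reports the detected CUDA setup and compute capability; assuming such a release is installed, it can be invoked like this:

```python
# Runs bitsandbytes' built-in environment diagnostic, equivalent to
# `python -m bitsandbytes` from the shell (available in recent releases).
import runpy

runpy.run_module("bitsandbytes", run_name="__main__")
```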

### Expected behavior

I expect the training to start and finish in about 5 minutes, as it does when I run the same command without the `--load_in_8bit true` flag:

yes "y" | ACCELERATE_LOG_LEVEL=info accelerate launch  \
  --config_file ./accelerate_configs/multi_gpu.yaml \
  trl/examples/scripts/sft.py \
  --model_name_or_path inceptionai/Jais-family-256m \
  --trust_remote_code \
  --dataset_name AbderrahmanSkiredj1/ArQuAD_train14k_test_1k6 \
  --dataset_text_field context \
  --output_dir ./jais256m-sft-ArQuAD \
  --use_peft true \
  --lora_r 16 \
  --lora_alpha 32 \
  --lora_dropout 0.05 \
  --lora_target_modules "all-linear" \
  --lora_task_type 'CAUSAL_LM' \
  --learning_rate 3e-5 \
  --per_device_train_batch_size 1 \
  --gradient_accumulation_steps 4