Two separate DoRA bugs I just noticed:

(1) Llama 3.2 1B config with DoRA errors on state dict load. Repro:
```
tune run lora_finetune_single_device --config llama3_2/1B_lora_single_device \
  gradient_accumulation_steps=1 max_steps_per_epoch=5 model.use_dora=True
```
```
...
Exception: Error converting the state dict. Found unexpected key: "layers.0.attn.q_proj.magnitude". Please make sure you're loading a checkpoint with the right format.
```
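For context on why this key appears: DoRA decomposes each adapted weight into a learnable magnitude vector and a direction, so a DoRA state dict carries per-layer `magnitude` parameters that a LoRA-only checkpoint key mapping won't recognize. A minimal sketch of the layer shape involved (hypothetical class and names, not torchtune's actual implementation):

```python
import torch
import torch.nn as nn

class DoRALinearSketch(nn.Module):
    """Minimal DoRA-style linear: W' = m * (W0 + BA) / ||W0 + BA||,
    with the norm taken per output row. Illustration only."""

    def __init__(self, in_dim: int, out_dim: int, rank: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_dim, in_dim), requires_grad=False)
        self.lora_a = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_dim, rank))
        # This extra learnable parameter is the source of the
        # "layers.0.attn.q_proj.magnitude" key the converter rejects.
        self.magnitude = nn.Parameter(self.weight.norm(p=2, dim=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        adapted = self.weight + self.lora_b @ self.lora_a
        direction = adapted / adapted.norm(p=2, dim=1, keepdim=True)
        return x @ (self.magnitude.unsqueeze(-1) * direction).t()
```

Presumably the fix is for the checkpoint conversion path to map (or merge out) these magnitude keys rather than erroring on them.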
(2) Llama 3.2 Vision 11B model with DoRA produces a NaN loss. Repro:
```
tune run lora_finetune_single_device --config llama3_2_vision/11B_lora_single_device \
  max_steps_per_epoch=5 gradient_accumulation_steps=1 model.use_dora=True
```
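To localize the NaN, one generic debugging approach (plain PyTorch, not part of the recipe) is to register forward hooks that raise at the first module emitting a non-finite value, which narrows down whether it originates in the DoRA layers or elsewhere in the vision stack:

```python
import torch
import torch.nn as nn

def attach_nan_hooks(model: nn.Module) -> list:
    """Raise on the first module whose forward output contains NaN/Inf."""
    handles = []

    def make_hook(name: str):
        def hook(module, inputs, output):
            outs = output if isinstance(output, (tuple, list)) else (output,)
            for t in outs:
                if isinstance(t, torch.Tensor) and not torch.isfinite(t).all():
                    raise RuntimeError(f"Non-finite output in module: {name}")
        return hook

    for name, module in model.named_modules():
        handles.append(module.register_forward_hook(make_hook(name)))
    return handles

# Usage: attach before a single training step, then remove.
# handles = attach_nan_hooks(model)
# ... run one forward/backward ...
# for h in handles:
#     h.remove()
```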
Once we fix them, we should add recipe test cases setting `model.use_dora=True` to catch these regressions in the future, cc @felipemello1.
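A sketch of what such a test case could look like. The `run_recipe` helper and its import path are hypothetical; torchtune's recipe tests have their own harness, so this only shows the parametrization shape:

```python
import math

import pytest

# Hypothetical harness; substitute the real recipe test runner.
from tests.recipes.utils import run_recipe

@pytest.mark.parametrize("use_dora", [False, True])
def test_lora_finetune_single_device_dora(use_dora, tmp_path):
    losses = run_recipe(
        "lora_finetune_single_device",
        config="llama3_2/1B_lora_single_device",
        overrides={
            "model.use_dora": use_dora,
            "max_steps_per_epoch": 5,
            "gradient_accumulation_steps": 1,
            "output_dir": str(tmp_path),
        },
    )
    # Guards against both reported failures: the state dict load error
    # (the recipe would raise before producing losses) and the NaN loss.
    assert losses and all(math.isfinite(l) for l in losses)
```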