Hello,
I have been trying to fine-tune functionary-2.5-small on my own custom dataset, formatted according to the provided test format. I only have 24 GB of VRAM available, so I trained with DeepSpeed using stage 3 CPU offloading.
The training params are as follows:
The following is the ds_config I used:
After training, the following appeared in my checkpoint directory, but I could not merge the adapter weights into the model: merging fails with an error saying the tensor sizes differ from the current model. Running merge_lora_weight.py produces this error message:
```
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
	size mismatch for base_model.model.model.embed_tokens.modules_to_save.default.weight: copying a param with shape torch.Size([128261, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
	size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([128261, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
```
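For readers unfamiliar with this failure mode: the loader performs a strict shape comparison between each tensor in the checkpoint and the corresponding tensor in the freshly built model, and refuses to copy when they differ. A minimal sketch of that check in plain Python (illustrative names and plain tuples in place of real tensors; this is not PEFT's actual implementation):

```python
# Sketch of the strict shape check behind this RuntimeError, using
# plain dicts of shape tuples instead of real torch tensors.

def check_state_dict(checkpoint: dict, model: dict) -> list:
    """Collect size-mismatch messages the way a strict load would."""
    errors = []
    for name, ckpt_shape in checkpoint.items():
        model_shape = model.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            errors.append(
                f"size mismatch for {name}: copying a param with shape "
                f"{ckpt_shape} from checkpoint, the shape in current "
                f"model is {model_shape}."
            )
    return errors

checkpoint = {"base_model.model.lm_head.modules_to_save.default.weight": (128261, 4096)}
model = {"base_model.model.lm_head.modules_to_save.default.weight": (128256, 4096)}
print(check_state_dict(checkpoint, model)[0])
```

The key observation is that only the first dimension (the vocabulary size) differs, which points at tokens having been added on one side but not the other.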
Could you tell me what I am doing wrong?
Thank you for your time!
Hi @heesuju, I think the cause is that you didn't pass the argument --prompt_template_version v2.llama3.
The base model, meetkai/functionary-small-v2.5, uses prompt template version v2.llama3, while the default prompt template is v2, so there is a mismatch.
v2.llama3 doesn't add new tokens, but v2 does, which changes the embedding size and causes the shape mismatch when you merge. Please run the training again, adding --prompt_template_version v2.llama3.
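The arithmetic behind the explanation above can be made explicit. The Llama-3 tokenizer has a vocabulary of 128256; the checkpoint's embedding rows number 128261, so the v2 template evidently added tokens during training, while the v2.llama3 template adds none at merge time. A small sketch (the count of 5 added tokens is inferred from 128261 − 128256 in the error message, not taken from the functionary source):

```python
# Why the embedding shapes diverge: the prompt template used during
# training added special tokens, the one used at merge time did not.
# The "5 added tokens" figure is an assumption inferred from the error.

BASE_VOCAB = 128256  # Llama-3 tokenizer vocabulary size

def merged_vocab_size(base_vocab: int, added_tokens: int) -> int:
    """Embedding row count after a prompt template adds special tokens."""
    return base_vocab + added_tokens

checkpoint_vocab = merged_vocab_size(BASE_VOCAB, 5)  # trained with v2
model_vocab = merged_vocab_size(BASE_VOCAB, 0)       # merged with v2.llama3 (default build)

print(checkpoint_vocab, model_vocab)  # differing sizes -> merge fails
```

Retraining with --prompt_template_version v2.llama3 keeps both sides at the base vocabulary size, so the shapes agree.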