Model prediction issue #96

Linxuxin · 2023-08-28T06:34:15Z

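Following the steps, I fine-tuned chatglm2-6b with P-Tuning v2. After fine-tuning I switched to v0.1 and set the fine-tuned model path in predict_pt. Running it raises the following error:
raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")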
RuntimeError: Error(s) in loading state_dict for ChatGLMForConditionalGeneration:
size mismatch for transformer.prefix_encoder.embedding.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([16, 4096]).
size mismatch for transformer.prefix_encoder.trans.0.weight: copying a param with shape torch.Size([4096, 14336]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for transformer.prefix_encoder.trans.2.weight: copying a param with shape torch.Size([14336, 4096]) from checkpoint, the shape in current model is torch.Size([229376, 4096]).
size mismatch for transformer.prefix_encoder.trans.2.bias: copying a param with shape torch.Size([14336]) from checkpoint, the shape in current model is torch.Size([229376]).
You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.

Process finished with exit code 1

I hope the author can update the prediction steps.
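The shapes in the traceback can be reproduced with simple arithmetic, which suggests that the checkpoint was written by the ChatGLM2 prefix encoder while the prediction script built a first-generation ChatGLM prefix encoder. The sketch below is only an illustration: the hyperparameters (28 layers, hidden size 4096, 128 KV channels, 2 multi-query groups) are assumed from the published model configs, and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PrefixDims:
    embedding_dim: int  # second dimension of prefix_encoder.embedding.weight
    proj_out_dim: int   # first dimension of prefix_encoder.trans.2.weight


def chatglm_v1_dims(num_layers=28, hidden_size=4096):
    # Assumed ChatGLM-6B layout with prefix projection enabled:
    # the embedding has hidden_size columns, and the projection emits
    # a key and a value of full hidden size for every layer.
    return PrefixDims(hidden_size, num_layers * hidden_size * 2)


def chatglm2_dims(num_layers=28, kv_channels=128, multi_query_group_num=2):
    # Assumed ChatGLM2-6B layout: multi-query attention shrinks the
    # per-layer key/value size, and the embedding uses the same width.
    kv_size = num_layers * kv_channels * multi_query_group_num * 2
    return PrefixDims(kv_size, kv_size)


current_model = chatglm_v1_dims()
checkpoint = chatglm2_dims()
print(current_model, checkpoint)
```

Under these assumptions the "current model" sizes (4096 and 229376) and the checkpoint sizes (14336) both match the error message, so loading the fine-tuned weights with the chatglm2-6b model code rather than the v0.1 code is worth checking first.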


liucongg · 2023-09-07T06:54:21Z

I'll update the prediction code later; it looks like quite a few people need it.

micrazy · 2023-09-07T08:16:34Z

You can take a look at my PR #99.

xzdong-2019 · 2023-09-13T06:18:01Z

from transformers import AutoConfig, AutoModel, AutoTokenizer
import torch

# Load the tokenizer, config and base model
tokenizer = AutoTokenizer.from_pretrained("/sdc/pre_trained_model/chatglm2-6b/", trust_remote_code=True)
config = AutoConfig.from_pretrained("/sdc/pre_trained_model/chatglm2-6b/", trust_remote_code=True, pre_seq_len=16)
model = AutoModel.from_pretrained("/sdc/pre_trained_model/chatglm2-6b/", config=config, trust_remote_code=True)

# Load the P-Tuning v2 fine-tuned weights, which were saved as two shards
CHECKPOINT_PATH = "./output-glm/epoch-2-step-4654/pytorch_model-00002-of-00002.bin"
prefix_state_dict = torch.load(CHECKPOINT_PATH)
prefix_state_dict_v1 = torch.load("./output-glm/epoch-2-step-4654/pytorch_model-00001-of-00002.bin")
prefix_state_dict.update(prefix_state_dict_v1)

# Remove the prefix_encoder.trans.* entries before loading
for key in [
    "transformer.prefix_encoder.trans.0.weight",
    "transformer.prefix_encoder.trans.0.bias",
    "transformer.prefix_encoder.trans.2.weight",
    "transformer.prefix_encoder.trans.2.bias",
]:
    prefix_state_dict.pop(key)
model.load_state_dict(prefix_state_dict, strict=True)

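Loading succeeded for me with this method, but I don't understand why reparameterization weights such as "transformer.prefix_encoder.trans.0.weight" have to be removed, or whether removing them is correct. Could you share the prediction script you use after training with DeepSpeed?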
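Because `strict=True` requires the checkpoint keys and the model keys to match exactly, a load that only succeeds after removing four entries means the model built here has no `trans` layers. One possible cause, which is an assumption and not confirmed in this thread, is that the checkpoint was trained with prefix projection enabled while the config above leaves it disabled; in that case dropping the projection weights would change what the prefix encoder computes. A minimal sketch for listing such differences before loading is shown below. `compare_keys` is a hypothetical helper, and the key lists are shortened stand-ins for `model.state_dict().keys()` and the merged checkpoint keys.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, as seen from the model's side."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # in the model, not in the checkpoint
    unexpected = sorted(checkpoint_keys - model_keys)   # in the checkpoint, not in the model
    return missing, unexpected


# Stand-in key lists for illustration only.
model_keys = [
    "transformer.prefix_encoder.embedding.weight",
]
checkpoint_keys = [
    "transformer.prefix_encoder.embedding.weight",
    "transformer.prefix_encoder.trans.0.weight",
    "transformer.prefix_encoder.trans.0.bias",
    "transformer.prefix_encoder.trans.2.weight",
    "transformer.prefix_encoder.trans.2.bias",
]

missing, unexpected = compare_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If the unexpected keys are exactly the `trans` entries, comparing the training-time and prediction-time prefix settings is probably a better next step than deleting the weights.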

sunzhe09 · 2023-10-13T03:07:16Z

I'm also interested in this. Now that fine-tuning is done, I don't know how to actually test the inference quality.

sunzhe09 · 2023-10-13T06:45:05Z

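Quoting micrazy: "You can take a look at my PR #99."

I tried it, but at run time it cannot find the pytorch_model file. Where can I find this file? Everything I generated is named like pytorch_model-00001-of-00002.bin.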

micrazy · 2023-10-13T06:49:31Z

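Quoting sunzhe09: "I tried it, but at run time it cannot find the pytorch_model file. Where can I find this file? Everything I generated is named like pytorch_model-00001-of-00002.bin."

I changed how train.py saves and loads the model so that only the prefix weights are stored. The weights of ChatGLM itself can be taken from the original checkpoint.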
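The idea of storing only the prefix weights can be sketched as a filter over the state dict. This is not the code from PR #99: `extract_prefix_weights` is a hypothetical helper, and the key prefix is assumed from the parameter names in the traceback above. Plain integers stand in for tensors so the example runs without a model.

```python
PREFIX = "transformer.prefix_encoder."


def extract_prefix_weights(state_dict, prefix=PREFIX, strip_prefix=False):
    """Keep only the entries whose key starts with `prefix`.

    With strip_prefix=True the prefix is removed from each key, which is the
    form expected when loading directly into the prefix encoder submodule.
    """
    selected = {}
    for key, value in state_dict.items():
        if key.startswith(prefix):
            new_key = key[len(prefix):] if strip_prefix else key
            selected[new_key] = value
    return selected


# Stand-in for a full model state dict.
full_state = {
    "transformer.embedding.word_embeddings.weight": 0,
    "transformer.prefix_encoder.embedding.weight": 1,
    "transformer.prefix_encoder.trans.0.weight": 2,
}

prefix_only = extract_prefix_weights(full_state)
stripped = extract_prefix_weights(full_state, strip_prefix=True)
print(sorted(prefix_only))
print(sorted(stripped))
```

With real tensors, the filtered dict could be written with `torch.save` after training and, at prediction time, loaded on top of a base model created from the original chatglm2-6b weights.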
