AutoModelForCausalLM supports llama models now #6

Open · passaglia opened this issue May 31, 2023 · 1 comment

Comments
passaglia commented May 31, 2023

In newer versions of the transformers library, AutoModelForCausalLM can properly identify llama models.

There is therefore no longer any need for the LlamaModel class: llama models run with --model_name causal.

The only hiccup I ran into was an error about the generate function receiving an unexpected token_type_ids argument. I fixed this by adding the following lines to my tokenizer_config.json:

"model_input_names": [
    "input_ids",
    "attention_mask"
  ], 

This could instead be addressed within flan-eval by setting return_token_type_ids=False in CausalModel's call to the tokenizer:

self.tokenizer(prompt, return_tensors="pt", return_token_type_ids=False)
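
For reference, a minimal end-to-end sketch of that fix (the checkpoint path is illustrative, a recent transformers release is assumed, and flan-eval's actual CausalModel wiring may differ):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative llama checkpoint; any llama model directory or hub id should work.
model_path = "huggyllama/llama-7b"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)

prompt = "The capital of France is"
# return_token_type_ids=False keeps token_type_ids out of the encoding,
# so generate() never receives the unexpected argument.
inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))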
soujanyaporia (Contributor) commented May 31, 2023

Thanks! Could you please do a PR?
