AutoModelForCausalLM supports llama models now #6

Open · passaglia opened this issue May 31, 2023 · 1 comment

Comments
passaglia commented May 31, 2023

In newer versions of the transformers library, AutoModelForCausalLM can properly identify llama models.

There is therefore no longer any need for the LlamaModel class: llama models run with --model_name causal.

The only hiccup I ran into was an error about the generate function receiving an unexpected token_type_ids argument. I fixed this by adding the following lines to my tokenizer_config.json:

"model_input_names": [
    "input_ids",
    "attention_mask"
  ], 

This could instead be addressed within flan-eval by setting return_token_type_ids=False in CausalModel's call to the tokenizer:

self.tokenizer(prompt, return_tensors="pt", return_token_type_ids=False)
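
For reference, a minimal end-to-end sketch of that fix (the checkpoint path is illustrative, a recent transformers release is assumed, and flan-eval's actual CausalModel wiring may differ):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative llama checkpoint; any llama model directory or hub id should work.
model_path = "huggyllama/llama-7b"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)

prompt = "The capital of France is"
# return_token_type_ids=False keeps token_type_ids out of the encoding,
# so generate() never receives the unexpected argument.
inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))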
soujanyaporia (Contributor) commented May 31, 2023

Thanks! Could you please do a PR?
