Describe the Bug
The offset argument in the following code determines how many new tokens are generated in the current step, which corresponds to this line. However, q has shape b*t*h*d after the rearrange operation, so q.shape[2] is the number of heads rather than the number of tokens. offset=q.shape[1] seems reasonable.
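For concreteness, here is a minimal standalone sketch of the shape question, using einops and made-up dimension sizes; it is not the actual fla code:

```python
import torch
from einops import rearrange

b, t, h, d = 1, 17, 4, 64                      # batch, tokens, heads, head dim (illustrative values)
q = torch.randn(b, t, h * d)                   # projection output with the heads still fused
q = rearrange(q, 'b t (h d) -> b t h d', h=h)  # the b*t*h*d layout described above

print(q.shape[2])  # 4  -> number of heads, which is what offset currently picks up
print(q.shape[1])  # 17 -> number of tokens, which is what offset should arguably be
```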
Steps to Reproduce the Bug
```python
import fla  # importing fla makes its model classes available to transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

name = 'fla-hub/gla-1.3B-100B'
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).cuda()

input_prompt = "Power goes with permanence. Impermanence is impotence. And rotation is castration."
input_ids = tokenizer(input_prompt, return_tensors="pt").input_ids.cuda()
outputs = model.generate(input_ids, max_length=64)
```
Just print q.shape[2].
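As a quick cross-check, the printed value can be compared with the checkpoint's head count. The config attribute names below are assumptions based on common Hugging Face conventions, not taken from the fla source:

```python
# Assumed attribute names; adjust to whatever the GLA config actually exposes.
cfg = model.config
num_heads = getattr(cfg, "num_heads", getattr(cfg, "num_attention_heads", None))
print(num_heads)  # if this equals the printed q.shape[2], offset is picking up the head count
```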
Expected Behavior
q.shape[2] = 4
Environment Information
Same as that in README.