
[Bug] GatedLinearAttention got NaN #122

Open
980202006 opened this issue Jan 17, 2025 · 0 comments
Labels: bug (Something isn't working)

Describe the Bug

The forward pass of GatedLinearAttention (via fused_chunk_gla) produces NaN values in its output.

(Two screenshots attached, showing NaN values in the forward output.)

Steps to Reproduce the Bug

    import torch
    from fla.ops.gla import fused_chunk_gla

    o, recurrent_state = fused_chunk_gla(
        q=q,
        k=k,
        v=v,
        g=gk,
        initial_state=recurrent_state,
        output_final_state=use_cache,
        head_first=False
    )
    # Debug hook: execution reaches this branch, i.e. the output contains NaN
    if torch.isnan(o).any():
        breakpoint()
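As a sanity check not present in the original report, one can first verify that the inputs themselves are finite, ruling out a NaN/Inf entering the kernel from upstream layers (same tensor names as above):

    # Rule out non-finite values entering the kernel from upstream
    for name, t in {'q': q, 'k': k, 'v': v, 'gk': gk}.items():
        if not torch.isfinite(t).all():
            print(f'{name} is already non-finite before fused_chunk_gla')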

head = 1
seq_len = 43884
dim = 64
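For reference, a self-contained reproduction sketch in the spirit of the report. The shapes follow head_first=False, i.e. (batch, seq_len, head, dim) with the sizes listed above; the random inputs, bfloat16 dtype, and logsigmoid gate initialization are assumptions, since the original tensors are not shown, and the import path assumes the fla package layout:

    import torch
    import torch.nn.functional as F
    from fla.ops.gla import fused_chunk_gla

    B, T, H, D = 1, 43884, 1, 64  # batch, seq_len (from report), head, dim
    device, dtype = 'cuda', torch.bfloat16

    q = torch.randn(B, T, H, D, device=device, dtype=dtype)
    k = torch.randn(B, T, H, D, device=device, dtype=dtype)
    v = torch.randn(B, T, H, D, device=device, dtype=dtype)
    # Log-space forget gates; logsigmoid keeps them <= 0 (assumed initialization)
    gk = F.logsigmoid(torch.randn(B, T, H, D, device=device, dtype=dtype))

    o, final_state = fused_chunk_gla(q=q, k=k, v=v, g=gk,
                                     initial_state=None,
                                     output_final_state=True,
                                     head_first=False)
    print('NaN in output:', torch.isnan(o).any().item())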

Expected Behavior

The forward output should contain no NaN values.

Environment Information

  1. Torch: 2.5.1+cu124
  2. Triton: 3.0 (nightly)
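For reference, these version strings can be confirmed at runtime (triton exposes __version__ just like torch):

    import torch
    import triton

    print('Torch:', torch.__version__)    # reported: 2.5.1+cu124
    print('Triton:', triton.__version__)  # reported: 3.0 nightly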
980202006 added the bug label on Jan 17, 2025
980202006 changed the title from [Bug] to [Bug] GatedLinearAttention got NaN on Jan 17, 2025
yzhangcs self-assigned this on Jan 17, 2025