
A bug when running run_chunkllama_100k with flash decoding #23

Open
lwj2001 opened this issue Jul 19, 2024 · 3 comments
lwj2001 commented Jul 19, 2024

When I use Llama3 (with flash decoding) to run run_chunkllama_100k, it starts successfully. But when I input a prompt, I encounter this TypeError:

File "ChunkLlama/flash_decoding_chunkllama.py", line 140, in new_flash_attn_with_kvcache
    out, softmax_lse = flash_attn_cuda.fwd_kvcache(
TypeError: fwd_kvcache(): incompatible function arguments. The following argument types are supported:
    1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: torch.Tensor, arg3: Optional[torch.Tensor], arg4: Optional[torch.Tensor], arg5: Optional[torch.Tensor], arg6: Optional[torch.Tensor], arg7: Optional[torch.Tensor], arg8: Optional[torch.Tensor], arg9: Optional[torch.Tensor], arg10: Optional[torch.Tensor], arg11: Optional[torch.Tensor], arg12: Optional[torch.Tensor], 
    arg13: float, arg14: bool, arg15: int, arg16: int, arg17: float, arg18: bool, arg19: int) -> list[torch.Tensor]
Invoked with: tensor(xxx) , ...... , None, None, None, None, None, None, 0.08838834764831845, False, -1, -1, True, 0

emmm, this is hard for me to debug...


ChenxinAn-fdu commented Jul 21, 2024

I have tested the code again and it works: python run_chunkllama_100k.py
Please make sure you have flash_attn>=2.5.3, transformers==4.37.2, and attn_implementation="flash_attention_2".

@hsiehjackson

I think the error occurs because we should install flash_attn>=2.5.3,<2.6.0.
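That range can be verified up front, before the model loads and the prompt triggers fwd_kvcache. A minimal sketch of such a check, assuming the `>=2.5.3,<2.6.0` pin from this comment; the helper names and the version parsing are illustrative, not part of ChunkLlama or flash_attn:

```python
# Hypothetical pre-flight check for the flash_attn version range pinned in
# this thread (>=2.5.3,<2.6.0). The thread attributes the fwd_kvcache
# TypeError to versions outside this range, so 2.6.0 must be rejected.

def parse_version(v: str) -> tuple:
    # Keep only the leading numeric components, e.g. "2.5.9.post1" -> (2, 5, 9)
    parts = []
    for piece in v.split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts)

def flash_attn_version_ok(v: str) -> bool:
    # Tuple comparison implements the >=2.5.3,<2.6.0 range check.
    return (2, 5, 3) <= parse_version(v) < (2, 6, 0)

if __name__ == "__main__":
    # In practice the string would come from flash_attn.__version__ or
    # importlib.metadata.version("flash_attn").
    for candidate in ["2.5.3", "2.5.9.post1", "2.6.0", "2.4.2"]:
        print(candidate, flash_attn_version_ok(candidate))
```

Running this with the real installed version string and failing fast on a mismatch would surface the incompatibility at startup instead of at the first prompt.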

@ChenxinAn-fdu
Contributor

Thank you!!! I have updated this in the README.
