
Update model implementations to use flash attention #2046

Open
divyashreepathihalli opened this issue Jan 16, 2025 · 0 comments
Labels
team-created Issues created by Keras Hub team as part of development roadmap.

Comments

@divyashreepathihalli (Collaborator)

Flash attention support has been added to Keras 3.
https://github.com/keras-team/keras/blob/25d6d80a6ecd31f0da52c325cd16dbe4a29b7329/keras/src/layers/attention/multi_head_attention.py#L55
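As a point of reference, here is a minimal sketch of the enabling mechanism, assuming a Keras 3 release that exposes keras.config.enable_flash_attention(). With the flag on, the stock MultiHeadAttention layer routes its attention computation through a fused (flash) kernel inside _compute_attention whenever the backend, dtype, and hardware allow it:

```python
import numpy as np
import keras

# A minimal sketch, assuming a Keras 3 build that ships
# keras.config.enable_flash_attention(). With the flag on, the stock
# MultiHeadAttention layer dispatches _compute_attention to a fused
# (flash) kernel when the backend/dtype/hardware support it.
keras.config.enable_flash_attention()

layer = keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)
x = np.random.rand(2, 16, 512).astype("float32")
y = layer(x, x, x)  # takes the fused path when available
```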

However, some of the models implemented in KerasHub override the _compute_attention() method, which is where the flash-attention dispatch lives. Those overrides bypass it, so the implementations need to be updated to pick up the fused path.
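One possible direction for the update is sketched below. This is not KerasHub's actual code: the idea is that an override which still needs custom logic can route the core attention computation through keras.ops.dot_product_attention (which can dispatch to a flash kernel) instead of re-implementing softmax(QK^T / sqrt(d))V with einsums, while overrides that add nothing beyond the base behavior can simply be deleted:

```python
import keras
from keras import ops

class PatchedAttention(keras.layers.MultiHeadAttention):
    """Hypothetical example: an override that keeps the fused path."""

    def _compute_attention(
        self, query, key, value, attention_mask=None, training=None
    ):
        # Signature mirrors the base-class method linked above; query/key/
        # value arrive projected to (batch, seq_len, num_heads, key_dim).
        if attention_mask is not None:
            # dot_product_attention expects a boolean mask (True = attend).
            attention_mask = ops.cast(attention_mask, "bool")
        # scale defaults to 1/sqrt(head_dim), matching the base layer.
        output = ops.dot_product_attention(
            query, key, value, mask=attention_mask
        )
        # Fused kernels do not materialize the score matrix.
        return output, None
```

Overrides that only tweak masking or scaling should be expressible through the mask/bias/scale arguments of dot_product_attention, keeping the fused dispatch intact.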

@divyashreepathihalli divyashreepathihalli added the team-created Issues created by Keras Hub team as part of development roadmap. label Jan 16, 2025