[Bug]: KV Cache exploded #91
@rakkit Hello, could you provide a minimal script for reproduction? I didn't encounter the errors you reported by setting
Hi, I tried to reproduce this for a while, but I can't see the problem anymore. I'm not sure if I messed something up there; sorry about that.
@rakkit Great to know!
Actually, the cache update reads `window_size = cache_kwargs.get('window_size', None)`. However, `cache_kwargs` is set to `None` by default, as defined here: `cache_kwargs: Optional[Dict[str, Any]] = None`. The calling attention layer is expected to build it as `cache_kwargs = dict(window_size=self.window_size)`, though users who are unaware of this requirement and leave `cache_kwargs` as `None` will get an error.
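For context, a minimal sketch of the pitfall (the function name, signature, and argument types are simplified assumptions, not the library's exact API):

```python
from typing import Any, Dict, Optional, Tuple

import torch


def update(attn_state: Tuple[torch.Tensor, torch.Tensor],
           cache_kwargs: Optional[Dict[str, Any]] = None):
    # cache_kwargs defaults to None, so for any caller that does not
    # build it explicitly, the next line raises:
    # AttributeError: 'NoneType' object has no attribute 'get'
    window_size = cache_kwargs.get('window_size', None)
    return window_size
```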
Thank you, I will check it later. Are you willing to make some PRs?
The easiest solution here would be to guard the `window_size` lookup against a `None` `cache_kwargs`, or to normalize the argument once at the top of the update method; both options are sketched below.
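As a minimal sketch against the simplified signature above (not the library's exact code), either guard the lookup at the point of use:

```python
# Treat a missing cache_kwargs as an empty dict at the point of use.
window_size = (cache_kwargs or {}).get('window_size', None)
```

or normalize the argument once on entry:

```python
from typing import Any, Dict, Optional


def update(attn_state, cache_kwargs: Optional[Dict[str, Any]] = None):
    # Normalize once so every later lookup can assume a dict.
    if cache_kwargs is None:
        cache_kwargs = {}
    window_size = cache_kwargs.get('window_size', None)
    return window_size
```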
Describe the bug
In the case of using softmax attention, or any other attention with `window_size=None`, the KV cache update falls into this branch. This logic concatenates all historical sequence states with the new states (`attn_state[0]` and `attn_state[1]`), causing exponential growth in the KV cache, as sketched below.
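A minimal, self-contained sketch of the failure mode (the shapes, and the assumption that `attn_state` carries the full history rather than only the new tokens, are illustrative, not taken from the library):

```python
import torch

# [batch, heads, seq_len, head_dim]; start with an 8-token history.
cached_k = torch.zeros(1, 1, 8, 16)

for step in range(4):
    # If attn_state[0] already contains the full history, concatenating
    # it back onto the cache doubles the cached length every step.
    new_k = cached_k
    cached_k = torch.cat([cached_k, new_k], dim=2)
    print(step, cached_k.shape[2])  # 16, 32, 64, 128: exponential growth
```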
Steps to reproduce the bug
Inference with attention with `window_size=None`
Expected behavior
KV-cache exploded
Environment info