LAVa: Layer-wise KV Cache Eviction with Dynamic Budget Allocation

Open in new window