FlexiCache: Leveraging Temporal Stability of Attention Heads for Efficient KV Cache Management
