WeightedKV: Attention Scores Weighted Key-Value Cache Merging for Large Language Models
