Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression

Open in new window