SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
