Extreme Compression of Large Language Models via Additive Quantization
