DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs

Neural Information Processing Systems 

Quantization of large language models (LLMs) faces significant challenges, particularly due to the presence of outlier activations that impede efficient low-bit representation.
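To see why outliers are harmful, consider symmetric per-tensor uniform quantization, where a single large activation stretches the quantization scale and wastes grid resolution on the rest of the tensor. The following is a minimal illustrative sketch (not the paper's method); the 4-bit setting and the injected outlier value are assumptions chosen only to make the effect visible.

```python
import numpy as np

def quantize_dequantize(x, bits=4):
    # Symmetric uniform quantization: the scale is set by the absolute
    # maximum, values are rounded to the signed integer grid, then
    # mapped back to floats.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
acts = rng.normal(size=1024)

# Same activations with one injected outlier (hypothetical magnitude),
# mimicking the outlier channels observed in LLM activations.
acts_outlier = acts.copy()
acts_outlier[0] = 100.0

err_plain = np.mean((acts - quantize_dequantize(acts)) ** 2)
err_outlier = np.mean((acts_outlier - quantize_dequantize(acts_outlier)) ** 2)

print(f"MSE without outlier: {err_plain:.6f}")
print(f"MSE with outlier:    {err_outlier:.6f}")
```

Because the outlier inflates the scale, every other value is rounded on a much coarser grid, so the mean-squared reconstruction error grows by orders of magnitude even though only one element changed.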