Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Neural Information Processing Systems
Therefore, transformer quantization attracts wide research interest. Recent work identifies structured outliers as the critical bottleneck for quantization performance. However, the methods proposed so far add computational overhead and still leave the outliers in place. To address the problem at its root, this paper investigates what induces the outliers and how important they are.
Oct-9-2025, 15:58:45 GMT