Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models

Xiuying Wei

Neural Information Processing Systems 

Transformer quantization attracts wide research interest. Recent work identifies structured outliers as the critical bottleneck for quantization performance. However, the methods proposed there add computational overhead and still leave the outliers in place. To address the problem at its root, this paper examines what induces the outliers and how important they are.
