Overview
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Therefore, transformer quantization attracts wide research interest. Recent work recognizes that structured outliers are the critical bottleneck for quantization performance. However, the proposed methods increase computation overhead and still leave the outliers in place. To address this problem at its root, this paper investigates what induces the outliers and how important they are.
3295c76acbf4caaed33c36b1b5fc2cb1-Reviews.html
This paper presents an approach that exploits local similarity structure for the zero-shot or few-shot problem. The idea is not only to use a mid-level representation such as attributes, which helps in the zero-shot problem, but also to ensure that the unlabeled data is labeled such that similar images receive similar labels. Overall, I like the direction of the paper. I think exploiting graph structure is an interesting idea that hasn't been explored for the zero-shot problem (as far as I know).