Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing

Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
Qualcomm AI Research
Amsterdam, The Netherlands

Neural Information Processing Systems 

Due to their size, the capability of transformer networks has increased tremendously, but this has come at the cost of a significant increase in the compute they require.
