FP8 Quantization: The Power of the Exponent

Andrey Kuzmin, Mart Van Baalen

Neural Information Processing Systems 

Neural network quantization is one of the most effective ways to improve the efficiency of neural networks. Quantization allows weights and activations to be represented in low bit-width formats, e.g., 8-bit integers (INT8).
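
To make the INT8 representation mentioned above concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a generic illustration, not the method of this paper: the function names `quantize_int8` and `dequantize_int8`, the choice of a single scale per tensor, and the sample values are all assumptions made for the example.

```python
def quantize_int8(values):
    """Map a non-empty list of floats to signed 8-bit integers (hypothetical helper)."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor: the largest magnitude maps to 127.
    # Fall back to 1.0 when every value is zero to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the INT8 range [-128, 127].
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from the integers and the scale."""
    return [qi * scale for qi in q]


weights = [0.5, -1.27, 0.003, 1.0]  # illustrative values only
q, scale = quantize_int8(weights)
recovered = dequantize_int8(q, scale)
```

Each recovered value differs from the original by at most half a quantization step (`scale / 2`), which is why small values such as `0.003` collapse to zero under a uniform integer grid.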
