8-bit training


Reviews: Scalable methods for 8-bit training of neural networks

Neural Information Processing Systems

This is interesting given that most existing works are based on 16-bit precision, and people have had difficulty training 8-bit models. The paper identified that the training difficulty comes from batch normalization, and it proposed a variant called Range Batch-Norm that alleviates the numerical instability the original batch normalization exhibits with quantized models. With this simple modification, the paper shows that an 8-bit model can be trained easily using GEMMLOWP, an existing framework. The paper also analyzes and seeks to understand the proposed approach theoretically. The experiments support the paper's arguments well.
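
As a toy illustration of the instability the review refers to (this example is mine, not from the paper or the review): the variance estimate in standard batch normalization accumulates a long sum of squares, which can overflow outright at low precision, whereas the range used by Range Batch-Norm needs only comparisons.

```python
import numpy as np

# Toy illustration (not from the paper): low-precision variance vs. range.
rng = np.random.default_rng(0)
x = (rng.standard_normal(4096) * 30.0).astype(np.float16)  # activations, std ~30

# Variance path of standard BatchNorm: a long sum of squares.
sum_sq_lo = np.square(x).sum(dtype=np.float16)    # fp16 accumulator overflows to inf
sum_sq_hi = np.square(x.astype(np.float64)).sum()  # fp64 reference (~3.7e6)

# Range path: only comparisons, no accumulation, stays representable in fp16.
value_range = x.max() - x.min()

print(sum_sq_lo, sum_sq_hi, value_range)
```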


Scalable methods for 8-bit training of neural networks

Banner, Ron, Hubara, Itay, Hoffer, Elad, Soudry, Daniel

Neural Information Processing Systems

Quantized Neural Networks (QNNs) are often used to improve network efficiency during the inference phase, i.e., after the network has been trained. Extensive research in the field suggests many different quantization schemes. Still, the number of bits required, as well as the best quantization scheme, are yet unknown. Our theoretical analysis suggests that most of the training process is robust to substantial precision reduction, and points to only a few specific operations that require higher precision. Additionally, as QNNs require batch-normalization to be trained at high precision, we introduce Range Batch-Normalization (BN), which has significantly higher tolerance to quantization noise and improved computational complexity.
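
For concreteness, here is a minimal sketch of the range-based normalization the abstract introduces. The scale constant C(n) = 1/sqrt(2·ln n) follows the paper's definition of Range BN; the function signature and variable names are illustrative, not the authors' code.

```python
import numpy as np

def range_batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Range BN over a batch x of shape (n, d): normalize each feature by
    C(n) * (max - min) instead of the standard deviation."""
    n = x.shape[0]
    c_n = 1.0 / np.sqrt(2.0 * np.log(n))   # C(n) from the paper's definition
    centered = x - x.mean(axis=0)
    value_range = centered.max(axis=0) - centered.min(axis=0)
    return gamma * centered / (c_n * value_range + eps) + beta

x = np.random.randn(256, 8).astype(np.float32)
y = range_batch_norm(x)
print(y.mean(axis=0))  # per-feature means ~0; the scale comes from the range, not the variance
```

The design point is that max and min involve no accumulation, so the normalizing scale stays well-behaved under aggressive precision reduction, while for Gaussian-like activations C(n) keeps the range-based scale proportional to the standard deviation.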