Reviews: Scalable methods for 8-bit training of neural networks

Oct-8-2024, 08:27:43 GMT–Neural Information Processing Systems

This is interesting given the fact that most of the existing works are based on 16 bit and people are having some difficulties in training 8bit models. The paper identified that the training difficulty comes from batchnorm and it proposed a variant of batchnorm called range batchnorm which alleviate the numerical instability of the original batchnorm occurring with the quantized models. By such simple modification, the paper shows that a 8 bit model can be easily trained using GEMMLOWP, an existing framework. The paper also tried to analyze and understand the proposed approach in a theoretical manner. Experiments well supported the argument of the paper.

8-bit training, neural network, scalable method, (7 more...)

Neural Information Processing Systems

Oct-8-2024, 08:27:43 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.42)