Scalable methods for 8-bit training of neural networks

Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry

Neural Information Processing Systems (NeurIPS), 2018

Quantized Neural Networks (QNNs) are often used to improve network efficiency during the inference phase, i.e., after the network has been trained. Extensive research in the field has proposed many different quantization schemes. Still, the number of bits required, as well as the best quantization scheme, remain unknown. Our theoretical analysis suggests that most of the training process is robust to substantial precision reduction, and it points to only a few specific operations that require higher precision.
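
For illustration, below is a minimal sketch of the kind of uniform symmetric quantization such schemes build on. The function names and the max-based scaling rule are assumptions chosen for this example, not the paper's exact scheme.

import numpy as np

def quantize_int8(x, num_bits=8):
    """Uniformly quantize a tensor using a symmetric scale derived
    from the tensor's maximum absolute value (illustrative only)."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    scale = np.max(np.abs(x)) / qmax        # map dynamic range onto [-qmax, qmax]
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from its quantized form."""
    return q.astype(np.float32) * scale

# Example: measure the quantization error on a random weight tensor
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.max(np.abs(w - w_hat)))

The round-trip error printed here is the price paid for reduced precision; the abstract's claim is that most training operations tolerate this error, while a few do not.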