Reviews: Training Deep Neural Networks with 8-bit Floating Point Numbers

Neural Information Processing Systems 

The main goal of this work is to lower the numerical precision used to train deep neural networks, in order to better exploit new styles of hardware. In particular, the technical idea is to reduce the bit-width of the accumulator in dot products and of the representation of all weights. The main observation is that a form of chunking, well known in the optimization and mathematical programming communities, has sufficiently better error properties to train DNNs. The paper is clear, simple, and effective. The title is a bit misleading, since it is really a mixed-precision paper: many of the numbers are actually FP16, not FP8.
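To make the chunking observation concrete, here is a minimal sketch (my own illustration, not the authors' code) of chunk-based accumulation, simulated with NumPy's float16 as a stand-in for a low-precision accumulator. Summing within small chunks first, then summing the per-chunk partials, keeps each addend comparable in magnitude to the running sum, so fewer low-order bits are lost to rounding than in a single long naive accumulation. The chunk size of 64 is an arbitrary choice for the sketch.

```python
import numpy as np


def naive_sum(x):
    # Accumulate everything in one float16 register: once the running
    # total grows large, small addends are rounded away.
    acc = np.float16(0.0)
    for v in x:
        acc = np.float16(acc + v)
    return acc


def chunked_sum(x, chunk=64):
    # Chunk-based accumulation: sum each small chunk in float16,
    # then sum the per-chunk partial results. Intermediate totals
    # stay closer in magnitude to their addends.
    partials = []
    for i in range(0, len(x), chunk):
        acc = np.float16(0.0)
        for v in x[i:i + chunk]:
            acc = np.float16(acc + v)
        partials.append(acc)
    acc = np.float16(0.0)
    for p in partials:
        acc = np.float16(acc + p)
    return acc


rng = np.random.default_rng(0)
x = rng.uniform(0.5, 1.0, size=4096).astype(np.float16)
exact = x.astype(np.float64).sum()
print("naive error:  ", abs(float(naive_sum(x)) - exact))
print("chunked error:", abs(float(chunked_sum(x)) - exact))
```

Running this, the naive float16 sum stalls once the total's rounding unit exceeds the addends, while the chunked sum stays close to the float64 reference; the paper applies the same idea to the FP16 accumulators inside FP8 dot products.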