Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks
Reducing the numerical precision of data and computation is extremely effective in accelerating deep learning training workloads. Towards this end, 8-bit floating point representations (FP8) were recently proposed for DNN training. However, FP8's applicability has been demonstrated on only a few selected models, and significant degradation is observed when popular networks such as MobileNet and Transformer are trained using FP8. This degradation stems from the inherently different precision requirements of the forward and backward passes of DNN training. Using theoretical insights, we propose a hybrid FP8 (HFP8) format and a DNN end-to-end distributed training procedure. We demonstrate, using HFP8, the successful training of deep learning models across a whole spectrum of applications including image classification, object detection, language, and speech without accuracy degradation. Finally, we demonstrate that, using the new 8-bit format, we can directly quantize a pre-trained model down to 8 bits without losing accuracy by simply fine-tuning batch normalization statistics. These novel techniques enable a new generation of 8-bit hardware that is robust for building and deploying neural network models.
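To make the precision trade-off concrete, here is a minimal Python sketch of rounding a value into a sign/exponent/mantissa format with configurable field widths, comparing a 1-4-3 split (forward pass) against a 1-5-2 split (backward pass). This is an illustrative assumption, not the paper's implementation: overflow, denormals, and the exact exponent bias used by HFP8 are not modeled.

```python
import math

def quantize_fp8(x: float, exp_bits: int, man_bits: int) -> float:
    """Round x to the nearest value with a man_bits-bit mantissa, as in a
    1/exp_bits/man_bits floating-point format. Illustrative sketch only:
    the exponent is clamped to the normal range (no denormals, no inf)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    m, e = math.frexp(abs(x))            # abs(x) = m * 2**e, m in [0.5, 1)
    frac = 2.0 * m                       # mantissa in [1, 2)
    exp = e - 1
    bias = (1 << (exp_bits - 1)) - 1
    exp = max(min(exp, bias), 1 - bias)  # clamp to normal exponent range
    scale = 1 << man_bits
    frac_q = round(frac * scale) / scale # keep man_bits fractional bits
    return sign * frac_q * 2.0 ** exp

# The 1-4-3 split keeps more mantissa precision than the 1-5-2 split,
# which instead buys a wider exponent (dynamic) range:
print(quantize_fp8(0.1, 4, 3))   # 0.1015625
print(quantize_fp8(0.1, 5, 2))   # 0.09375
```

The trade-off shown here is the crux of the hybrid scheme: forward-pass activations and weights favor mantissa precision, while backward-pass gradients span a wider dynamic range and favor exponent bits.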
Reviews: Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks
Originality - This basically amounts to using two different floating-point formats: one for the forward pass and one for the backward pass. Another way to think about it is that we are allowing more freedom in the mantissa/exponent split of the floating-point representation. That's a good observation to have, theoretically, but how would a framework implement this practically? For example, maybe I missed it, but if we were to productize this, I don't see how you convert between the 1-4-3 and 1-5-2 formats when preparing for backprop. Do frameworks now have to support two more data types?
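On the conversion question the reviewer raises: since 1-5-2 has a wider exponent range but one fewer mantissa bit than 1-4-3, a cast amounts to rebiasing the exponent and rounding away one mantissa bit. A hypothetical bit-level sketch, assuming IEEE-style biases of 7 and 15 and normal numbers only (the paper's actual biases and special-value handling may differ):

```python
def fp8_143_to_152(bits: int) -> int:
    """Convert an 8-bit 1-4-3 encoding to the nearest 1-5-2 encoding.
    Sketch only: assumes normal numbers (no zero/denormal/inf/NaN)."""
    sign = (bits >> 7) & 0x1
    exp = (bits >> 3) & 0xF        # 4-bit exponent field, assumed bias 7
    man = bits & 0x7               # 3-bit mantissa
    new_exp = exp - 7 + 15         # rebias for the 5-bit exponent (bias 15)
    new_man = (man + 1) >> 1       # drop one mantissa bit, round half up
    if new_man == 4:               # rounding carried into the exponent
        new_man = 0
        new_exp += 1
    return (sign << 7) | (new_exp << 2) | new_man

def decode(bits: int, exp_bits: int, man_bits: int, bias: int) -> float:
    """Decode a sign/exponent/mantissa encoding (normal numbers only)."""
    sign = -1.0 if (bits >> (exp_bits + man_bits)) & 1 else 1.0
    exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    man = bits & ((1 << man_bits) - 1)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

x = (7 << 3) | 0b101                          # 1.625 encoded in 1-4-3
print(decode(x, 4, 3, 7))                     # 1.625
print(decode(fp8_143_to_152(x), 5, 2, 15))    # 1.75 (one mantissa bit lost)
```

The cast is a cheap bit manipulation rather than a full type system addition, though the example confirms the reviewer's underlying point: the 1-4-3 → 1-5-2 direction is lossy.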
This paper demonstrates a method for low precision 8-bit deep neural network training without loss of performance across a number of popular tasks, including classification, speech recognition, and translation. The reviewers agree that this is a strong paper with solid analysis and a number of novel contributions to low precision training.
Sun, Xiao, Choi, Jungwook, Chen, Chia-Yu, Wang, Naigang, Venkataramani, Swagath, Srinivasan, Vijayalakshmi (Viji), Cui, Xiaodong, Zhang, Wei, Gopalakrishnan, Kailash