Mixed Precision Training - Baidu Research
Figure 2: Mixed precision training for deep learning models.

Secondly, we introduce a technique called loss scaling that allows us to recover some of the small-valued gradients. During training, some weight gradients have very small exponents that become zero in FP16 format. To overcome this problem, we scale the loss by a scaling factor at the start of back-propagation. Through the chain rule, the gradients are scaled up by the same factor and become representable in FP16.
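The effect described above can be sketched numerically. The snippet below is an illustration, not the authors' implementation: the scaling factor of 1024 is a hypothetical choice, and NumPy's `float16` cast stands in for FP16 gradient storage. A gradient smaller than FP16's smallest subnormal (about 2^-24) flushes to zero, but the same gradient multiplied by the loss scale survives the cast and can be unscaled in FP32 before the weight update.

```python
import numpy as np

SCALE = 1024.0  # hypothetical loss-scaling factor (2^10)

# A gradient of 2^-27 is fine in FP32 but below FP16's
# smallest subnormal (2^-24), so the FP16 cast flushes it to zero.
grad_fp32 = np.float32(2.0 ** -27)
grad_fp16 = np.float16(grad_fp32)
assert grad_fp16 == 0.0

# Scaling the loss scales every gradient by the chain rule;
# 2^-27 * 2^10 = 2^-17, which FP16 can represent exactly.
scaled_fp16 = np.float16(grad_fp32 * SCALE)
assert scaled_fp16 != 0.0

# Before the weight update, unscale in FP32 to recover the
# original gradient value.
recovered = np.float32(scaled_fp16) / np.float32(SCALE)
assert recovered == grad_fp32
```

Because both the gradient and the scale here are powers of two, the unscaled value matches the original exactly; in general the round trip is accurate to FP16 precision.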
Feb-11-2018, 23:31:47 GMT