How Parallelization and Large Batch Size Improve the Performance of Deep Neural Networks.

Nov-13-2021, 04:25:39 GMT–#artificialintelligence

Until recently, a large batch size was viewed as a deterrent to good accuracy. However, recent studies show that increasing the batch size can significantly reduce training time while maintaining a considerable level of accuracy. In this blog, we draw inferences from four such technical papers. The RMSprop warm-up phase is used to address the optimization difficulty at the start of training. The update rule presented in the post combines Stochastic Gradient Descent (SGD) with the RMSprop optimization algorithm.
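The summary does not reproduce the update rule itself, so the following is only a minimal sketch of one plausible form of an RMSprop warm-up: early steps are scaled by an RMSprop-style running average of squared gradients, and the update gradually shifts to plain momentum SGD. The function names (`blend_weights`, `hybrid_step`, `init_state`), the linear blending schedule, and all default hyperparameters are assumptions made for illustration, not values taken from the post or the papers it cites.

```python
import math


def init_state(n):
    """Per-parameter state: running mean of squared gradients and momentum buffer."""
    return {"m": [0.0] * n, "delta": [0.0] * n}


def blend_weights(step, warmup_steps):
    """Return (alpha_sgd, alpha_rmsprop) for the given step.

    Assumption: a linear schedule that moves from pure RMSprop scaling
    at step 0 to pure SGD scaling at step >= warmup_steps.
    """
    frac = min(max(step / warmup_steps, 0.0), 1.0)
    return frac, 1.0 - frac


def hybrid_step(theta, grad, state, step, warmup_steps,
                lr=0.1, mu1=0.9, mu2=0.99, eps=1e-8):
    """Apply one blended SGD/RMSprop update and return the new parameters."""
    alpha_sgd, alpha_rms = blend_weights(step, warmup_steps)
    new_theta = []
    for i, (p, g) in enumerate(zip(theta, grad)):
        # RMSprop part: running average of squared gradients.
        state["m"][i] = mu2 * state["m"][i] + (1.0 - mu2) * g * g
        # Blend the constant SGD scale with the adaptive RMSprop scale.
        scale = alpha_sgd + alpha_rms / (math.sqrt(state["m"][i]) + eps)
        # Momentum accumulation on the scaled gradient.
        state["delta"][i] = mu1 * state["delta"][i] - scale * g
        new_theta.append(p + lr * state["delta"][i])
    return new_theta
```

In a training loop, `hybrid_step` would be called once per mini-batch with an increasing `step`; once `step` reaches `warmup_steps`, the rule reduces to momentum SGD with learning rate `lr`.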

accuracy, data parallelization, input pipeline, (13 more...)

Genre:
- Research Report (0.71)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (0.68)
  - Statistical Learning > Gradient Descent (0.56)