A Closer Look at the Generalization Gap in Large Batch Training of Neural Networks

Sep-16-2020, 21:50:17 GMT–#artificialintelligence

Deep learning architectures such as recurrent neural networks and convolutional neural networks have seen many significant improvements and have been applied in the fields of computer vision, speech recognition, natural language processing, audio recognition and more. The most commonly used optimization method for training highly complex and non-convex DNNs is stochastic gradient descent (SGD) or some variant of it. DNNs however typically have some non-convex objective functions which are a bit difficult optimize with SGD. Thus, SGD, at best, finds a local minimum of this objective function. Although the solutions of DNNs are a local minima, they have produced great end results.

artificial intelligence, batch size, machine learning, (16 more...)

#artificialintelligence

Sep-16-2020, 21:50:17 GMT

News Web Page

Add feedback

AI-Alerts:
- 2020 > 2020-09 > AAAI AI-Alert for Sep 22, 2020 (1.00)

Country:
- Asia > Middle East > Israel (0.04)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found