Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets

Rohith Kuditipudi, Xiang Wang, Holden Lee, Yi Zhang, Zhiyuan Li, Wei Hu, Sanjeev Arora, Rong Ge

arXiv.org Machine Learning 

Efforts to understand how and why deep learning works have led to a focus on the optimization landscape of the training loss. Since optimization to near-zero training loss succeeds for many choices of random initialization, it is clear that the landscape contains many global optima (or near-optima). However, the loss can become quite high when linearly interpolating between found optima, suggesting that these optima occur at the bottom of "valleys" surrounded on all sides by high walls. Therefore the phenomenon of mode connectivity (Garipov et al., 2018; Draxler et al., 2018) came as a surprise: optima (at least the ones discovered by gradient-based optimization) are connected by simple paths in parameter space, on which the loss function is almost constant. In other words, the optima are not walled off in separate valleys as hitherto believed.
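The linear-interpolation experiment the abstract alludes to can be sketched in a few lines: train two copies of a network from different random initializations, then evaluate the loss along the straight line between the two parameter vectors. The toy one-hidden-layer net, the synthetic data, and the finite-difference gradient below are all illustrative assumptions, not the setup used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (assumption: any small smooth target works here).
X = rng.normal(size=(200, 4))
y = np.sin(X @ rng.normal(size=4))

D_IN, HIDDEN = 4, 8
N_PARAMS = D_IN * HIDDEN + HIDDEN  # flattened (W1, w2)

def unpack(theta):
    """Split a flat parameter vector into the two weight arrays."""
    W1 = theta[:D_IN * HIDDEN].reshape(D_IN, HIDDEN)
    w2 = theta[D_IN * HIDDEN:]
    return W1, w2

def loss(theta):
    """Mean squared error of a one-hidden-layer tanh net."""
    W1, w2 = unpack(theta)
    pred = np.tanh(X @ W1) @ w2
    return np.mean((pred - y) ** 2)

def num_grad(theta, eps=1e-5):
    """Central finite-difference gradient (fine for this tiny net)."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (loss(theta + e) - loss(theta - e)) / (2 * eps)
    return g

def train(seed, steps=300, lr=0.1):
    """Plain gradient descent from a seed-dependent random init."""
    theta = np.random.default_rng(seed).normal(scale=0.5, size=N_PARAMS)
    for _ in range(steps):
        theta -= lr * num_grad(theta)
    return theta

# Two optima found from different random initializations.
theta_a, theta_b = train(1), train(2)

# Loss along the straight line between them. On real deep nets this
# loss is typically much higher mid-path than at the endpoints, which
# is what made the existence of low-loss *curved* paths surprising.
for t in np.linspace(0.0, 1.0, 5):
    print(f"t={t:.2f}  loss={loss((1 - t) * theta_a + t * theta_b):.4f}")
```

Mode-connectivity experiments then replace the straight line with a learned curve (e.g. a quadratic Bezier path) whose endpoints are fixed at the two optima and whose bend is trained to keep the loss low everywhere along the path.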
