Phase diagram of early training dynamics in deep networks: effect of the learning rate, depth, and width

Neural Information Processing Systems 

We systematically analyze the optimization dynamics of deep neural networks (DNNs) trained with stochastic gradient descent (SGD), studying the effect of the learning rate η, the network depth d, and the network width w.
