Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods

Neural Information Processing Systems 

We list each algorithm's effective stepsize at iteration
