2ac79356a03fe5e9250e5e77ebc76e6e-Paper-Conference.pdf

Neural Information Processing Systems 

Our results shed light on the difference between Adam and (stochastic) gradient descent from a theoretical perspective.
