Layer-wise Adaptive Step-Sizes for Stochastic First-Order Methods for Deep Learning
