Adam-family Methods with Decoupled Weight Decay in Deep Learning
