SGD: The Role of Implicit Regularization, Batch-size and Multiple Epochs
Neural Information Processing Systems
Our main contributions are threefold: 1. We show that for any regularizer, there is a stochastic convex optimization (SCO) problem for which Regularized Empirical Risk Minimization fails to learn.
Aug-18-2025, 07:45:43 GMT