Reviews: Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence
–Neural Information Processing Systems
The paper is well-written and the authors are clear about their claims. The idea of critical periods during training with reference to regularization is interesting. If true, this would give a different way to think about generalization. The authors have performed a number of experiments with different configurations. Although, there are deficiencies mentioned below.
Neural Information Processing Systems
Jan-25-2025, 09:21:03 GMT