Reviews: Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks
Neural Information Processing Systems
The paper studies the dynamics of discrete gradient descent for overparametrized two-layer neural networks and shows that, under certain conditions on the input/output covariance matrices and the initialization, the components of the input-output map are learned sequentially. The reviewers appreciated the paper's contributions, both theoretical and experimental, and found it well written. At the same time, one reviewer feels the assumptions are too strong, and another feels that some claims are misleading. Post rebuttal, the reviewer concluded that the novelty of the paper is buried in the appendix and that the paper needs to be rewritten to bring that novelty out in the body. This AC agrees with R4 that the contributions relative to Lampinen and Ganguli need to be clearly established in the body of the paper and that a citation needs to be added.
Feb-5-2025, 09:18:48 GMT