Reviews: On Lazy Training in Differentiable Programming
–Neural Information Processing Systems
The paper offers some interesting insight, but its results are not significant enough to explain the phenomena of interest in deep learning. It shows that lazy training can be caused by parameter scaling and is not specific to overparameterized neural networks. What does this tell us about overparameterized neural networks? Does the result imply that the lazy regime of overparameterized neural networks is necessarily due to parameter scaling? If not, then the lazy regime of overparameterized neural networks cannot be explained by parameter scaling alone.
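To make the scaling claim concrete, here is a minimal sketch of the effect, not the paper's experiments. It assumes a hypothetical one-parameter model h(w) = w^2, a centered output alpha * (h(w) - h(w0)), and a squared loss divided by alpha^2; the function name, the model, and all default values are illustrative choices. With a larger alpha, gradient descent fits the same target while the parameter moves far less from its initialization, and no overparameterization is involved.

```python
def lazy_displacement(alpha, steps=200, lr=0.1, w0=1.0, target=1.0):
    """Run gradient descent on a scaled toy model (illustrative assumption).

    Model output:  alpha * (h(w) - h(w0)),  with h(w) = w**2
    Objective:     0.5 * (output - target)**2 / alpha**2
    Returns (distance of w from w0, final residual output - target).
    """
    h0 = w0 ** 2
    w = w0
    for _ in range(steps):
        residual = alpha * (w ** 2 - h0) - target
        # d/dw of the objective: residual * (alpha * 2w) / alpha**2
        grad = residual * 2.0 * w / alpha
        w -= lr * grad
    residual = alpha * (w ** 2 - h0) - target
    return abs(w - w0), residual


if __name__ == "__main__":
    for alpha in (1.0, 10.0, 100.0):
        moved, residual = lazy_displacement(alpha)
        print(f"alpha={alpha:6.1f}  |w - w0|={moved:.5f}  residual={residual:.2e}")
```

In this toy setting the displacement shrinks roughly like 1/alpha while the fit stays equally good, which is the sense in which scaling alone produces lazy behavior. It does not settle the question raised above of whether scaling is the only route to the lazy regime in overparameterized networks.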
Jan-26-2025, 11:14:55 GMT