Neural Information Processing Systems
In this work, we show that this "lazy training" phenomenon is not specific to overparameterized neural networks, and is due to a choice of scaling, often implicit, that makes the model behave as its linearization around the initialization, thus yielding a model equivalent to learning with positive-definite kernels.
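The role of the scaling can be illustrated with a small numerical sketch. Everything specific below is an assumption made for illustration and is not taken from the abstract: a two-layer tanh network, a squared loss, an output multiplied by a hypothetical scale factor `alpha` and centered so that it is zero at initialization, and a gradient step divided by `alpha**2`. Under these assumptions, a larger `alpha` should leave the weights much closer to their initial values, which is the regime where the model is well approximated by its linearization.

```python
import numpy as np

def relative_weight_change(alpha, steps=200, lr=0.1, n=20, d=5, m=50, seed=0):
    """Train alpha * (h(w) - h(w0)) by gradient descent on a squared loss and
    return (relative distance of the weights from initialization, final loss).
    All names and hyperparameters here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))      # synthetic inputs
    y = np.sin(X[:, 0])                  # synthetic targets
    W0 = rng.standard_normal((m, d))     # hidden-layer weights at initialization
    a0 = rng.standard_normal(m)          # output weights at initialization
    W, a = W0.copy(), a0.copy()

    def h(W, a):
        # Unscaled two-layer network output on all inputs, shape (n,)
        return np.tanh(X @ W.T) @ a / np.sqrt(m)

    h0 = h(W0, a0)                       # subtracted so the scaled model starts at 0
    for _ in range(steps):
        H = np.tanh(X @ W.T)                             # hidden activations, (n, m)
        r = alpha * (H @ a / np.sqrt(m) - h0) - y        # residuals of the scaled model
        grad_a = alpha * (H.T @ r) / (n * np.sqrt(m))
        grad_W = alpha * ((r[:, None] * (1 - H**2) * a[None, :]).T @ X) / (n * np.sqrt(m))
        # Step size divided by alpha**2 (an assumed normalization) so that the
        # dynamics of the model output stay comparable across values of alpha.
        a -= (lr / alpha**2) * grad_a
        W -= (lr / alpha**2) * grad_W

    moved = np.sqrt(np.sum((W - W0) ** 2) + np.sum((a - a0) ** 2))
    size = np.sqrt(np.sum(W0 ** 2) + np.sum(a0 ** 2))
    loss = 0.5 * np.mean((alpha * (h(W, a) - h0) - y) ** 2)
    return moved / size, loss

if __name__ == "__main__":
    for alpha in (1.0, 10.0, 100.0):
        rel, loss = relative_weight_change(alpha)
        print(f"alpha={alpha:6.1f}  relative weight change={rel:.2e}  final loss={loss:.3e}")
```

In this sketch the model width is the same for every run; only `alpha` changes. That is one way to see that the effect does not depend on overparameterization.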
Feb-13-2026, 14:36:03 GMT