Review for NeurIPS paper: On the linearity of large non-linear models: when and why the tangent kernel is constant
–Neural Information Processing Systems
This paper clarifies the conditions under which the neural tangent kernel (NTK) remains constant. First, it points out that the NTK is constant if and only if the model is linear in its parameters. Second, it shows that the NTK is nearly constant when the spectral norm of the Hessian is small. A bound on the Hessian norm is obtained under three conditions: a linear output layer, sparse dependence of the activation functions, and no bottleneck layers. Overall, the paper is well written.
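The two claims in the summary can be checked numerically on a toy model. The sketch below is not taken from the paper: the one-hidden-layer model f(w; x) = v · tanh(w x) / sqrt(m), the function names, the perturbation radius, and the seed are all illustrative assumptions. It measures how much the tangent kernel K(x, x) = ||∇_w f(w; x)||² changes when the weights move a fixed distance, for a narrow and a wide network, and contrasts this with a model that is linear in w.

```python
import numpy as np


def tangent_kernel(w, v, x):
    """K(x, x) = ||grad_w f(w; x)||^2 for the toy model f(w; x) = v . tanh(w * x) / sqrt(m)."""
    m = w.size
    # d f / d w_i = v_i * (1 - tanh(w_i * x)^2) * x / sqrt(m)
    grad = v * (1.0 - np.tanh(w * x) ** 2) * x / np.sqrt(m)
    return float(grad @ grad)


def kernel_change(m, radius=1.0, x=1.0, seed=0):
    """|K(w0 + delta) - K(w0)| for a random perturbation with ||delta|| = radius."""
    rng = np.random.default_rng(seed)
    w0 = rng.standard_normal(m)           # trainable hidden weights (assumed init)
    v = rng.choice([-1.0, 1.0], size=m)   # fixed output weights: linear output layer
    delta = rng.standard_normal(m)
    delta *= radius / np.linalg.norm(delta)
    return abs(tangent_kernel(w0 + delta, v, x) - tangent_kernel(w0, v, x))


def linear_model_grad(w, phi):
    """Gradient of f(w; x) = w . phi(x) with respect to w: it does not depend on w."""
    return phi


def linear_kernel_change(m, radius=1.0, seed=0):
    """Kernel change for a model that is linear in w (its Hessian is zero)."""
    rng = np.random.default_rng(seed)
    phi = rng.standard_normal(m)          # hypothetical feature vector phi(x)
    w0 = rng.standard_normal(m)
    delta = rng.standard_normal(m)
    delta *= radius / np.linalg.norm(delta)
    g0 = linear_model_grad(w0, phi)
    g1 = linear_model_grad(w0 + delta, phi)
    return abs(float(g1 @ g1) - float(g0 @ g0))


narrow_change = kernel_change(m=10)
wide_change = kernel_change(m=10_000)
linear_change = linear_kernel_change(m=10)

print(f"kernel change, m = 10:     {narrow_change:.6f}")
print(f"kernel change, m = 10000:  {wide_change:.6f}")
print(f"kernel change, linear:     {linear_change:.6f}")
```

In this toy model the Hessian is diagonal with entries of order 1/sqrt(m), so its spectral norm shrinks as the width grows and the kernel change over a ball of fixed radius shrinks with it. For the linear model the Hessian is zero and the kernel does not change at all.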
Feb-4-2025, 17:02:30 GMT