Tight conditions for when the NTK approximation is valid

Nov-5-2023 – arXiv.org Artificial Intelligence

We study when the neural tangent kernel (NTK) approximation is valid for training a model with the square loss. In the lazy training setting of Chizat et al. 2019, we show that rescaling the model by a factor of $\alpha = O(T)$ suffices for the NTK approximation to be valid until training time $T$. Our bound is tight and improves on the previous bound of Chizat et al. 2019, which required a larger rescaling factor of $\alpha = O(T^2)$.
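The role of the rescaling factor can be illustrated numerically. The following is a minimal sketch, not the paper's experiment: it assumes a small one-hidden-layer tanh network with fixed outer weights, random data, a centered model $h(w) = f(w) - f(w_0)$, the lazy-training objective $\frac{1}{\alpha^2} \cdot \frac{1}{2}\|\alpha h(w) - y\|^2$, and an explicit Euler discretization of gradient flow. All function names, sizes, and step sizes are hypothetical choices made for illustration. It trains the rescaled model and its linearization at $w_0$ side by side up to time $T$ and reports the distance between their outputs.

```python
import numpy as np


def make_problem(n=5, d=3, m=20, seed=0):
    """Random toy data and initial weights (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))           # inputs
    a = rng.choice([-1.0, 1.0], size=m)   # fixed outer weights
    W0 = rng.normal(size=(m, d))          # initial hidden weights
    y = rng.normal(size=n)                # regression targets
    return X, a, W0, y


def f(W, X, a):
    """Network outputs on all inputs: (1/sqrt(m)) * sum_j a_j tanh(x . W_j)."""
    return np.tanh(X @ W.T) @ a / np.sqrt(len(a))


def jacobian(W, X, a):
    """Jacobian of f with respect to W, flattened to shape (n, m*d)."""
    S = 1.0 - np.tanh(X @ W.T) ** 2                       # (n, m)
    J = (S * a)[:, :, None] * X[:, None, :] / np.sqrt(len(a))
    return J.reshape(X.shape[0], -1)


def ntk_gap(alpha, T=2.0, dt=0.01):
    """Distance between rescaled-model and linearized-model outputs at time T."""
    X, a, W0, y = make_problem()
    f0, J0 = f(W0, X, a), jacobian(W0, X, a)
    W, V = W0.copy(), W0.copy()   # W: full model, V: linearized model
    for _ in range(int(round(T / dt))):
        # Euler step on (1/alpha^2) * 0.5 * ||alpha * h(W) - y||^2
        r = alpha * (f(W, X, a) - f0) - y
        W = W - dt * (jacobian(W, X, a).T @ r).reshape(W.shape) / alpha
        # Same step for the model linearized around W0
        r_lin = alpha * (J0 @ (V - W0).ravel()) - y
        V = V - dt * (J0.T @ r_lin).reshape(V.shape) / alpha
    out = alpha * (f(W, X, a) - f0)
    out_lin = alpha * (J0 @ (V - W0).ravel())
    return float(np.linalg.norm(out - out_lin))


if __name__ == "__main__":
    for alpha in (1.0, 10.0, 100.0):
        print(f"alpha={alpha:6.1f}  gap at T=2: {ntk_gap(alpha):.3e}")
```

Running the script prints the output gap for a few values of $\alpha$ at a fixed horizon. In this toy setting the gap shrinks as $\alpha$ grows, which is the qualitative behavior that bounds of this kind quantify; the sketch does not attempt to verify the $O(T)$ versus $O(T^2)$ rates.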

artificial intelligence, deep learning, machine learning

Country:
- North America > United States (0.28)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
