Optimal Rates of Kernel Ridge Regression under Source Condition in Large Dimensions
Zhang, Haobo, Li, Yicheng, Lu, Weihao, Lin, Qian
arXiv.org Artificial Intelligence
Motivated by studies of neural networks (e.g., the neural tangent kernel theory), we perform a study on the large-dimensional behavior of kernel ridge regression (KRR), where the sample size satisfies $n \asymp d^{\gamma}$ for some $\gamma > 0$. Given an RKHS $\mathcal{H}$ associated with an inner product kernel defined on the sphere $\mathbb{S}^{d}$, we suppose that the true function $f_{\rho}^{*} \in [\mathcal{H}]^{s}$, the interpolation space of $\mathcal{H}$ with source condition $s>0$. We first determine the exact order (both upper and lower bounds) of the generalization error of kernel ridge regression for the optimally chosen regularization parameter $\lambda$. We then further show that when $0 < s \le 1$, KRR is minimax optimal, and when $s > 1$, KRR is not minimax optimal (a.k.a. the saturation effect). Our results illustrate that the curves of the rate varying along $\gamma$ exhibit periodic plateau behavior and multiple descent behavior, and show how the curves evolve with $s>0$. Interestingly, our work provides a unified viewpoint on several recent works on kernel regression in the large-dimensional setting, which correspond to $s=0$ and $s=1$ respectively.
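To make the setting concrete, the following is a minimal sketch of KRR with an inner product kernel on the sphere under the polynomial scaling $n \asymp d^{\gamma}$. The specific kernel $\Phi(t) = e^{t}$, the target function, the noise level, and the choice of $\lambda$ are all illustrative assumptions, not the paper's construction.

```python
import numpy as np

def sphere_sample(n, d, rng):
    # Draw n points uniformly on the unit sphere (normalized Gaussians).
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def inner_product_kernel(X, Y):
    # Inner product kernel k(x, y) = Phi(<x, y>); Phi(t) = exp(t) is a
    # simple stand-in for the general inner product kernels in the paper.
    return np.exp(X @ Y.T)

def krr_predict(X_train, y_train, X_test, lam):
    # KRR estimator: f_hat(x) = k(x, X) @ (K + n*lam*I)^{-1} y.
    n = X_train.shape[0]
    K = inner_product_kernel(X_train, X_train)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y_train)
    return inner_product_kernel(X_test, X_train) @ alpha

rng = np.random.default_rng(0)
d, gamma = 20, 1.5
n = int(d ** gamma)               # polynomial scaling n = d^gamma
f_star = lambda X: X[:, 0]        # illustrative degree-1 target on the sphere

X_tr = sphere_sample(n, d, rng)
X_te = sphere_sample(500, d, rng)
y_tr = f_star(X_tr) + 0.1 * rng.standard_normal(n)

y_hat = krr_predict(X_tr, y_tr, X_te, lam=1e-3)
mse = np.mean((y_hat - f_star(X_te)) ** 2)  # empirical generalization error
```

The quantity `mse` estimates the generalization error whose exact order (as a function of $\gamma$ and $s$) is the object of the paper; sweeping `gamma` and re-running would trace out the rate curves discussed in the abstract.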
Jan-2-2024