Optimal Rates of Kernel Ridge Regression under Source Condition in Large Dimensions
Zhang, Haobo, Li, Yicheng, Lu, Weihao, Lin, Qian
arXiv.org Artificial Intelligence
Motivated by studies of neural networks (e.g., the neural tangent kernel theory), we perform a study on the large-dimensional behavior of kernel ridge regression (KRR), where the sample size satisfies $n \asymp d^{\gamma}$ for some $\gamma > 0$. Given an RKHS $\mathcal{H}$ associated with an inner product kernel defined on the sphere $\mathbb{S}^{d}$, we suppose that the true function $f_{\rho}^{*} \in [\mathcal{H}]^{s}$, the interpolation space of $\mathcal{H}$ with source condition $s>0$. We first determine the exact order (both upper and lower bounds) of the generalization error of kernel ridge regression for the optimally chosen regularization parameter $\lambda$. We then further show that when $0 < s \le 1$, KRR is minimax optimal, and when $s > 1$, KRR is not minimax optimal (a.k.a. the saturation effect). Our results illustrate that the curves of the rate varying along $\gamma$ exhibit periodic plateau behavior and multiple descent behavior, and show how the curves evolve with $s>0$. Interestingly, our work provides a unified viewpoint on several recent works on kernel regression in the large-dimensional setting, which correspond to $s=0$ and $s=1$ respectively.
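To make the setting concrete, the following is a minimal sketch of KRR with an inner product kernel on the sphere under the polynomial scaling $n \asymp d^{\gamma}$. The specific kernel $\Phi(t) = e^{t}$, the target function, the noise level, and the choice of $\lambda$ are all illustrative assumptions, not the paper's construction.

```python
import numpy as np

def sphere_sample(n, d, rng):
    # Draw n points uniformly on the unit sphere (normalized Gaussians).
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def inner_product_kernel(X, Y):
    # Inner product kernel k(x, y) = Phi(<x, y>); Phi(t) = exp(t) is a
    # simple stand-in for the general inner product kernels in the paper.
    return np.exp(X @ Y.T)

def krr_predict(X_train, y_train, X_test, lam):
    # KRR estimator: f_hat(x) = k(x, X) @ (K + n*lam*I)^{-1} y.
    n = X_train.shape[0]
    K = inner_product_kernel(X_train, X_train)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y_train)
    return inner_product_kernel(X_test, X_train) @ alpha

rng = np.random.default_rng(0)
d, gamma = 20, 1.5
n = int(d ** gamma)               # polynomial scaling n = d^gamma
f_star = lambda X: X[:, 0]        # illustrative degree-1 target on the sphere

X_tr = sphere_sample(n, d, rng)
X_te = sphere_sample(500, d, rng)
y_tr = f_star(X_tr) + 0.1 * rng.standard_normal(n)

y_hat = krr_predict(X_tr, y_tr, X_te, lam=1e-3)
mse = np.mean((y_hat - f_star(X_te)) ** 2)  # empirical generalization error
```

The quantity `mse` estimates the generalization error whose exact order (as a function of $\gamma$ and $s$) is the object of the paper; sweeping `gamma` and re-running would trace out the rate curves discussed in the abstract.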
Jan-2-2024