 inner product kernel


On the Pinsker bound of inner product kernel regression in large dimensions

Lu, Weihao, Ding, Jialin, Zhang, Haobo, Lin, Qian

arXiv.org Machine Learning

This intriguing phenomenon, where the two asymptotics are equal, was rigorously justified by the seminal works on Le Cam equivalence, which established the asymptotic equivalence between Gaussian sequence models, the white noise model, and certain nonparametric regression models (see, e.g., [3, 4, 5]). Since then, subsequent studies have established similar exact risks for a variety of nonparametric estimation problems, including density estimation, regression models with non-Gaussian noise or random designs, analysis of Besov bodies, and wavelet estimation (e.g., [6, 7, 8, 2, 9, 10, 11, 12, 13]). For a detailed review of these developments, one can refer to [14] and the references therein. Constants akin to $\beta(m, R)$, now often referred to as the Pinsker constant, play an indispensable role in studying the super-efficiency phenomenon observed in nonparametric problems, which has been the subject of extensive investigation (e.g., [15, 16, 17, 18]). Recently, the strong theoretical links between the training dynamics of wide neural networks and the corresponding neural tangent kernel have motivated substantial research into the performance of spectral algorithms, such as kernel ridge regression and kernel gradient descent, on kernel regression problems (see, e.g., [19, 20, 21, 22, 23, 24]).
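
For context, the exact-risk statements referred to above take the following generic form in the Gaussian white noise model with noise level $\varepsilon$; the notation $W(m, R)$ for a Sobolev-type ball of smoothness $m$ and radius $R$ and $\beta(m, R)$ for the limiting constant follows the abstract, and the precise normalization of the ball is an assumption here rather than a quotation from the paper:
$$\lim_{\varepsilon \to 0} \; \varepsilon^{-\frac{4m}{2m+1}} \, \inf_{\hat f} \, \sup_{f \in W(m, R)} \mathbb{E}\,\|\hat f - f\|_{L^2}^{2} \;=\; \beta(m, R),$$
i.e., the minimax risk is pinned down not only up to its rate $\varepsilon^{4m/(2m+1)}$ but up to the exact constant $\beta(m, R)$.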


The phase diagram of kernel interpolation in large dimensions

Zhang, Haobo, Lu, Weihao, Lin, Qian

arXiv.org Machine Learning

The generalization ability of kernel interpolation in large dimensions (i.e., $n \asymp d^{\gamma}$ for some $\gamma>0$) might be one of the most interesting problems in the recent renaissance of kernel regression, since it may help us understand the 'benign overfitting phenomenon' reported in the neural networks literature. Focusing on the inner product kernel on the sphere, we fully characterized the exact order of both the variance and the bias of large-dimensional kernel interpolation under various source conditions $s\geq 0$. Consequently, we obtained the $(s,\gamma)$-phase diagram of large-dimensional kernel interpolation, i.e., we determined the regions in the $(s,\gamma)$-plane where kernel interpolation is minimax optimal, sub-optimal, and inconsistent.
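
As a concrete illustration of this setup (not code from the paper), the following minimal sketch fits a ridgeless, interpolating kernel estimator with a simple inner product kernel on the sphere and measures its test error as $\gamma$ varies; the specific kernel $\exp(\langle x, z\rangle)$, the target function, the noise level, and the helper names are all assumptions made for the example.

```python
import numpy as np

def sphere_sample(n, d, rng):
    """Draw n points uniformly from the unit sphere S^{d-1}."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def inner_product_kernel(X, Z):
    """A simple inner product (dot-product) kernel k(x, z) = exp(<x, z>)."""
    return np.exp(X @ Z.T)

def interpolation_test_error(d, gamma, rng):
    """Fit the (near-)minimum-norm kernel interpolant and return its test MSE."""
    n = int(d ** gamma)                          # sample size n ~ d^gamma
    X, X_test = sphere_sample(n, d, rng), sphere_sample(2000, d, rng)
    f = lambda Z: Z[:, 0] * Z[:, 1]              # assumed smooth target on the sphere
    y = f(X) + 0.1 * rng.standard_normal(n)      # noisy observations
    K = inner_product_kernel(X, X)
    alpha = np.linalg.solve(K + 1e-10 * np.eye(n), y)   # tiny jitter for numerical stability
    y_hat = inner_product_kernel(X_test, X) @ alpha
    return np.mean((y_hat - f(X_test)) ** 2)

rng = np.random.default_rng(0)
for gamma in (1.0, 1.5, 2.0):
    print(f"gamma={gamma}: test MSE ~ {interpolation_test_error(20, gamma, rng):.4f}")
```

Sweeping $\gamma$ (and the smoothness of the target) in such a toy experiment is the empirical analogue of tracing the $(s,\gamma)$-phase diagram described above.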


Optimal Rates of Kernel Ridge Regression under Source Condition in Large Dimensions

Zhang, Haobo, Li, Yicheng, Lu, Weihao, Lin, Qian

arXiv.org Artificial Intelligence

Motivated by the studies of neural networks (e.g., the neural tangent kernel theory), we perform a study on the large-dimensional behavior of kernel ridge regression (KRR) where the sample size $n \asymp d^{\gamma}$ for some $\gamma > 0$. Given an RKHS $\mathcal{H}$ associated with an inner product kernel defined on the sphere $\mathbb{S}^{d}$, we suppose that the true function $f_{\rho}^{*} \in [\mathcal{H}]^{s}$, the interpolation space of $\mathcal{H}$ with source condition $s>0$. We first determined the exact order (both upper and lower bounds) of the generalization error of kernel ridge regression for the optimally chosen regularization parameter $\lambda$. We then further showed that when $0 < s \leq 1$, KRR is minimax optimal, and when $s > 1$, KRR is not minimax optimal (a.k.a. the saturation effect). Our results illustrate that the curves of the rate varying along $\gamma$ exhibit the periodic plateau behavior and the multiple descent behavior, and show how these curves evolve with $s>0$. Interestingly, our work provides a unified viewpoint of several recent works on kernel regression in the large-dimensional setting, which correspond to $s=0$ and $s=1$ respectively.
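
To make the role of the regularization parameter concrete, here is a minimal kernel ridge regression sketch, an illustration under assumed choices (the polynomial inner product kernel, the target function, the noise level, and the validation-based grid search are not taken from the paper), that selects $\lambda$ by validation error in the spirit of the optimally chosen $\lambda$ discussed above.

```python
import numpy as np

rng = np.random.default_rng(1)
d, gamma = 20, 1.5
n = int(d ** gamma)                              # sample size n ~ d^gamma

def sphere(m):
    x = rng.standard_normal((m, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def kernel(X, Z):
    # assumed inner product kernel on the sphere
    return (1.0 + X @ Z.T) ** 2

f_star = lambda X: np.tanh(2.0 * X[:, 0])        # assumed target function
X, X_val, X_test = sphere(n), sphere(500), sphere(2000)
y = f_star(X) + 0.1 * rng.standard_normal(n)

K, K_val, K_test = kernel(X, X), kernel(X_val, X), kernel(X_test, X)
best_err, lam_opt = np.inf, None
for lam in np.logspace(-6, 1, 20):               # grid over the regularization parameter
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    val_err = np.mean((K_val @ alpha - f_star(X_val)) ** 2)
    if val_err < best_err:
        best_err, lam_opt = val_err, lam

alpha = np.linalg.solve(K + n * lam_opt * np.eye(n), y)
excess_risk = np.mean((K_test @ alpha - f_star(X_test)) ** 2)
print("selected lambda:", lam_opt, " estimated excess risk:", excess_risk)
```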


Optimal Rate of Kernel Regression in Large Dimensions

Lu, Weihao, Zhang, Haobo, Li, Yicheng, Xu, Manyun, Lin, Qian

arXiv.org Machine Learning

We perform a study on kernel regression for large-dimensional data (where the sample size $n$ depends polynomially on the dimension $d$ of the samples, i.e., $n\asymp d^{\gamma}$ for some $\gamma >0$). We first build a general tool to characterize the upper bound and the minimax lower bound of kernel regression for large-dimensional data through the Mendelson complexity $\varepsilon_{n}^{2}$ and the metric entropy $\bar{\varepsilon}_{n}^{2}$, respectively. When the target function falls into the RKHS associated with a (general) inner product kernel defined on $\mathbb{S}^{d}$, we utilize the new tool to show that the minimax rate of the excess risk of kernel regression is $n^{-1/2}$ when $n\asymp d^{\gamma}$ for $\gamma =2, 4, 6, 8, \cdots$. We then further determine the optimal rate of the excess risk of kernel regression for all $\gamma>0$ and find that the curve of the optimal rate varying along $\gamma$ exhibits several new phenomena, including the {\it multiple descent behavior} and the {\it periodic plateau behavior}. As an application, for the neural tangent kernel (NTK), we also provide a similar explicit description of the curve of the optimal rate. As a direct corollary, these claims hold for wide neural networks as well.
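
The statement that the optimal rate is read off as an exponent of $n$ can be made concrete with a small numerical experiment: for a fixed $\gamma$, grow $d$ (hence $n \asymp d^{\gamma}$), fit kernel ridge regression, and regress $\log(\text{excess risk})$ on $\log n$. The kernel, target, noise level, and fixed $\lambda$ below are assumptions for illustration, and the crude slope estimate is only meant to convey how such a rate curve could be traced numerically, not to reproduce the paper's theory.

```python
import numpy as np

rng = np.random.default_rng(2)

def sphere(m, d):
    x = rng.standard_normal((m, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def krr_excess_risk(d, gamma, lam=1e-3, n_test=2000):
    """Kernel ridge regression with n ~ d^gamma samples; returns (n, test excess risk)."""
    n = int(d ** gamma)
    X, X_test = sphere(n, d), sphere(n_test, d)
    f = lambda Z: Z[:, 0]                        # assumed simple target on the sphere
    y = f(X) + 0.1 * rng.standard_normal(n)
    K = np.exp(X @ X.T)                          # assumed inner product kernel
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    pred = np.exp(X_test @ X.T) @ alpha
    return n, np.mean((pred - f(X_test)) ** 2)

gamma = 1.5
pairs = [krr_excess_risk(d, gamma) for d in (10, 15, 20, 25, 30)]
log_n = np.log([n for n, _ in pairs])
log_risk = np.log([r for _, r in pairs])
slope = np.polyfit(log_n, log_risk, 1)[0]        # crude empirical rate exponent
print(f"estimated excess-risk exponent at gamma={gamma}: {slope:.2f}")
```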


How rotational invariance of common kernels prevents generalization in high dimensions

Donhauser, Konstantin, Wu, Mingqi, Yang, Fanny

arXiv.org Machine Learning

Kernel ridge regression is well-known to achieve minimax optimal rates in low-dimensional settings. However, its behavior in high dimensions is much less understood. Recent work establishes consistency for kernel regression under certain assumptions on the ground truth function and the distribution of the input data. In this paper, we show that the rotational invariance property of commonly studied kernels (such as RBF, inner product kernels and fully-connected NTK of any depth) induces a bias towards low-degree polynomials in high dimensions. Our result implies a lower bound on the generalization error for a wide range of distributions and various choices of the scaling for kernels with different eigenvalue decays. This lower bound suggests that general consistency results for kernel ridge regression in high dimensions require a more refined analysis that depends on the structure of the kernel beyond its eigenvalue decay.
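
A simple way to probe the low-degree bias described above (an illustrative toy experiment under assumed choices, not the paper's construction) is to fit a rotationally invariant kernel to a target with one low-degree and one higher-degree spherical-harmonic component, both rescaled to comparable size, and check how much of each component the fitted function recovers on fresh data. The Gaussian RBF kernel, dimensions, sample size, and helper names below are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n, n_test = 30, 500, 2000

def sphere(m):
    x = rng.standard_normal((m, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Two (exactly orthogonal) spherical-harmonic components, rescaled to comparable size.
low_deg  = lambda X: np.sqrt(d) * X[:, 0]                       # degree 1
high_deg = lambda X: d ** 1.5 * X[:, 0] * X[:, 1] * X[:, 2]     # degree 3
f = lambda X: low_deg(X) + high_deg(X)

def rbf(X, Z, bw=1.0):
    # rotationally invariant Gaussian (RBF) kernel
    sq_dist = (X ** 2).sum(1)[:, None] + (Z ** 2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-np.clip(sq_dist, 0.0, None) / (2.0 * bw ** 2))

X, X_test = sphere(n), sphere(n_test)
y = f(X) + 0.1 * rng.standard_normal(n)
alpha = np.linalg.solve(rbf(X, X) + 1e-3 * np.eye(n), y)
pred = rbf(X_test, X) @ alpha

for name, comp in [("degree-1 part", low_deg(X_test)), ("degree-3 part", high_deg(X_test))]:
    coef = float(np.dot(pred, comp) / np.dot(comp, comp))   # fraction of the component recovered
    print(f"{name}: recovered coefficient ~ {coef:.2f}")
```

With $n$ only polynomial in $d$, the low-degree component is recovered far better than the higher-degree one, which is the qualitative behavior the lower bound above formalizes.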


New Probabilistic Bounds on Eigenvalues and Eigenvectors of Random Kernel Matrices

Reyhani, Nima, Hino, Hideitsu, Vigario, Ricardo

arXiv.org Machine Learning

Kernel methods are successful approaches for different machine learning problems. This success is mainly rooted in the use of feature maps and kernel matrices. Some methods rely on the eigenvalues/eigenvectors of the kernel matrix, while for other methods the spectral information can be used to estimate the excess risk. An important question is how close the sample eigenvalues/eigenvectors are to the population values. In this paper, we improve earlier results on concentration bounds for eigenvalues of general kernel matrices. We also account for, and partially address, the obstacles to sharper bounds. As a case study, we derive a concentration inequality for the sample kernel target-alignment.

1 INTRODUCTION

Kernel methods such as Spectral Clustering, Kernel Principal Component Analysis (KPCA), and Support Vector Machines are successful approaches in many practical machine learning and data analysis problems (Steinwart & Christmann, 2008). The main ingredient of these methods is the kernel matrix, which is built by evaluating the kernel function at the given sample points.
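
As an illustration of the quantities being bounded (a toy sketch under assumed data, kernel, and target choices, not the authors' experiments), one can draw repeated samples, look at the spread of the top eigenvalues of the normalized kernel matrix across draws, and compute the empirical kernel target alignment $A(K, y) = \langle K, yy^{\top}\rangle_F / (\|K\|_F \, \|yy^{\top}\|_F)$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, n_trials = 300, 10, 20

def rbf_kernel(X):
    sq_dist = (X ** 2).sum(1)[:, None] + (X ** 2).sum(1)[None, :] - 2.0 * X @ X.T
    return np.exp(-np.clip(sq_dist, 0.0, None) / 2.0)

top_eigs, alignments = [], []
for _ in range(n_trials):
    X = rng.standard_normal((n, d))              # fresh sample of n points
    y = np.sign(X[:, 0])                         # assumed binary target
    K = rbf_kernel(X) / n                        # normalized kernel matrix
    top_eigs.append(np.linalg.eigvalsh(K)[-3:][::-1])   # three largest sample eigenvalues
    # kernel target alignment  A(K, y) = <K, y y^T>_F / (||K||_F * ||y y^T||_F)
    alignments.append((y @ K @ y) / (np.linalg.norm(K, "fro") * (y @ y)))

top_eigs = np.array(top_eigs)
print("mean of top-3 eigenvalues over draws:", top_eigs.mean(axis=0))
print("std  of top-3 eigenvalues over draws:", top_eigs.std(axis=0))   # concentration across draws
print("mean sample kernel target-alignment :", float(np.mean(alignments)))
```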