 eigenvalue decay


Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks

Surbhi Goel, Adam Klivans

Neural Information Processing Systems

We consider the problem of learning function classes computed by neural networks with various activations (e.g. ReLU or Sigmoid), a task believed to be computationally intractable in the worst case. A major open problem is to understand the minimal assumptions under which these classes admit provably efficient algorithms. In this work we show that a natural distributional assumption corresponding to eigenvalue decay of the Gram matrix yields polynomial-time algorithms in the non-realizable setting for expressive classes of networks (e.g. feed-forward networks of ReLUs).
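The assumption in this abstract is quantitative: the eigenvalues of the kernel Gram matrix over samples from the input distribution should decay, e.g. polynomially as lambda_i <= C * i^(-p). As a minimal sketch of what such a condition looks like empirically, and not the paper's algorithm, the snippet below computes the spectrum of an RBF Gram matrix and tests it against a polynomial envelope; the kernel choice, bandwidth, the constants C and p, and the function names are all illustrative assumptions.

```python
# Sketch (not the paper's algorithm): empirically inspect eigenvalue decay
# of a kernel Gram matrix, the distributional quantity the assumption is
# stated in terms of. The RBF kernel, bandwidth, and the envelope
# parameters C and p are illustrative choices, not taken from the paper.
import numpy as np

def gram_eigenvalues(X, bandwidth=1.0):
    """Eigenvalues, in descending order, of the RBF Gram matrix
    K_ij = exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return np.sort(np.linalg.eigvalsh(K))[::-1]

def satisfies_poly_decay(eigvals, C=1.0, p=2.0):
    """Check the illustrative decay condition lambda_i <= C * i^(-p)
    (1-indexed), after normalizing so that lambda_1 = 1."""
    idx = np.arange(1, len(eigvals) + 1, dtype=float)
    return bool(np.all(eigvals / eigvals[0] <= C * idx ** (-p)))

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))  # 200 samples standing in for the input distribution
print(satisfies_poly_decay(gram_eigenvalues(X)))  # True iff the envelope holds here
```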


d0f5edad9ac19abed9e235c0fe0aa59f-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewer for providing constructive feedback and suggestions. By now, the number of papers (and books!) using these two parametrizations is substantial. In this view, the specific regime in which our rates are better is not really important; instead, we just consider it an interesting setting that researchers have ignored for a long time. An example of a "strong" assumption is the case of zero Bayes error w.r.t. the square loss; whether an assumption is weak or strong is completely a problem-dependent judgment. Our final comments on the bias of the community towards "weak assumptions" were meant to invite exactly this kind of exchange, so we are happy that the reviewer engaged with us in this discussion! Moreover, we do plan to extend the results we presented to smooth classification losses, such as the squared hinge loss.
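For readers unfamiliar with the loss named in the last sentence: the squared hinge loss is l(y, f(x)) = max(0, 1 - y * f(x))^2 for labels y in {-1, +1}; unlike the plain hinge loss it is differentiable everywhere, which is what makes it a "smooth" classification loss. A minimal illustration (the function name is ours):

```python
# Squared hinge loss: max(0, 1 - y * score)^2 for labels y in {-1, +1}.
# Its derivative is continuous at margin 0, unlike the plain hinge loss.
import numpy as np

def squared_hinge(y, score):
    margin = 1.0 - y * score
    return np.maximum(0.0, margin) ** 2

print(squared_hinge(np.array([1, -1]), np.array([0.3, 0.3])))  # [0.49 1.69]
```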




72e6d3238361fe70f22fb0ac624a7072-AuthorFeedback.pdf

Neural Information Processing Systems

We thank all reviewers for their helpful feedback. Below we address the questions and comments individually. We will correct typos in the main text and bibliography, and refer to Figure 1 in the introduction. We apologize for the confusion. The VAMP framework does not capture our "aligned" or "misaligned" cases.


Optimal Learning Rates for Regularized Conditional Mean Embedding

Zhu Li

Neural Information Processing Systems

Given random variables X and Y, the conditional expectation operator applied to a function f is defined by [Pf](x) := E[f(Y) | X = x]. Existing analyses require an explicit relation between the smoothness of the target CME and the size of the RKHS.
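As a hedged sketch of how the operator defined above is typically estimated, not necessarily the exact estimator analyzed in this paper: the regularized (kernel ridge regression) CME estimate approximates [Pf](x) by f(y)^T (K + n*lambda*I)^{-1} k(x), where K is the Gram matrix over the inputs x_i and k(x) is the vector of kernel evaluations at the query point. The RBF kernel, bandwidth, regularization constant, and function names below are illustrative assumptions.

```python
# Sketch of the standard regularized CME estimator for E[f(Y) | X = x]:
#   [Pf](x) ~ f(y)^T (K + n * lam * I)^{-1} k(x).
# Kernel and hyperparameters are illustrative, not from the paper.
import numpy as np

def rbf(A, B, s=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d / (2 * s**2))

def cme_estimate(X, Y, f, x_query, lam=1e-2):
    """Estimate E[f(Y) | X = x] at each query point via kernel ridge regression."""
    n = len(X)
    K = rbf(X, X)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), f(Y))  # (K + n lam I)^{-1} f(y)
    return rbf(x_query, X) @ alpha

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 1))
Y = np.sin(X) + 0.1 * rng.standard_normal((500, 1))  # so E[Y | X = x] = sin(x)
est = cme_estimate(X, Y, lambda y: y[:, 0], np.array([[1.0]]))
print(est, np.sin(1.0))  # estimate vs. ground truth at x = 1
```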


