
Giga-scale Kernel Matrix-Vector Multiplication on GPU

Neural Information Processing Systems

Kernel matrix-vector multiplication (KMVM) is one of the most important operations needed in scientific computing, with core applications in diffeomorphic registration, geometric learning [11], [31], numerical analysis [28], fluid dynamics [6], and machine learning [27].
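KMVM itself is simple to state; the cost is the issue. Below is a minimal NumPy sketch of the exact, quadratic-cost operation that fast KMVM methods approximate; the Gaussian kernel and lengthscale are illustrative assumptions, not this paper's setup.

```python
import numpy as np

def gaussian_kernel_matvec(X, Y, b, lengthscale=1.0):
    """Exact KMVM: compute K @ b where K[i, j] = exp(-||x_i - y_j||^2 / (2 * lengthscale^2)).

    Forms the full n x m kernel matrix, so cost is O(n * m) in both time and
    memory -- the quadratic bottleneck that fast KMVM methods aim to avoid.
    """
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # (n, m) squared distances
    K = np.exp(-sq_dists / (2.0 * lengthscale ** 2))
    return K @ b

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))   # source points
b = rng.normal(size=1000)        # vector to multiply against the kernel matrix
out = gaussian_kernel_matvec(X, X, b)
print(out.shape)  # (1000,)
```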


Spectral-Transport Stability and Benign Overfitting in Interpolating Learning

Fredriksson-Imanov, Gustav Olaf Yunus Laitinen-Lundström

arXiv.org Machine Learning

We develop a theoretical framework for generalization in the interpolating regime of statistical learning. The central question is why highly overparameterized estimators can attain zero empirical risk while still achieving nontrivial predictive accuracy, and how to characterize the boundary between benign and destructive overfitting. We introduce a spectral-transport stability framework in which excess risk is controlled jointly by the spectral geometry of the data distribution, the sensitivity of the learning rule under single-sample replacement, and the alignment structure of label noise. This leads to a scale-dependent Fredriksson index that combines effective dimension, transport stability, and noise alignment into a single complexity parameter for interpolating estimators. We prove finite-sample risk bounds, establish a sharp benign-overfitting criterion through the vanishing of the index along admissible spectral scales, and derive explicit phase-transition rates under polynomial spectral decay. For a model-specific specialization, we obtain an explicit theorem for polynomial-spectrum linear interpolation, together with a proof of the resulting rate. The framework also clarifies implicit regularization by showing how optimization dynamics can select interpolating solutions of minimal spectral-transport energy. These results connect algorithmic stability, double descent, benign overfitting, operator-theoretic learning theory, and implicit bias within a unified structural account of modern interpolation.
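The Fredriksson index and the transport-stability machinery are paper-specific and not reproduced here, but the object the explicit theorem specializes to, minimum-norm linear interpolation under polynomial spectral decay, is easy to simulate. A minimal sketch, with the dimensions, decay exponent, signal, and noise level chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 2000                       # overparameterized regime: d >> n
alpha = 1.2                            # assumed polynomial spectral decay exponent
eigs = np.arange(1, d + 1, dtype=float) ** (-alpha)

# Gaussian features with covariance diag(eigs); sparse true signal plus label noise.
X = rng.normal(size=(n, d)) * np.sqrt(eigs)
theta_star = np.zeros(d)
theta_star[:5] = 1.0
y = X @ theta_star + 0.1 * rng.normal(size=n)

# Minimum-norm interpolator: theta = X^T (X X^T)^{-1} y, which fits the data exactly.
theta_hat = X.T @ np.linalg.solve(X @ X.T, y)
print("train residual:", np.abs(X @ theta_hat - y).max())            # ~0: interpolation
print("excess risk:", np.sum(eigs * (theta_hat - theta_star) ** 2))  # population error
```

Sweeping alpha and n in this toy setup gives a quick empirical view of the kind of phase transition the abstract refers to.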


Kriging via variably scaled kernels

Audone, Gianluca, Marchetti, Francesco, Perracchione, Emma, Rossini, Milvia

arXiv.org Machine Learning

Classical Gaussian processes and Kriging models are commonly based on stationary kernels, whereby correlations between observations depend exclusively on the relative distance between scattered data. While this assumption ensures analytical tractability, it limits the ability of Gaussian processes to represent heterogeneous correlation structures. In this work, we investigate variably scaled kernels as an effective tool for constructing non-stationary Gaussian processes by explicitly modifying the correlation structure of the data. Through a scaling function, variably scaled kernels alter the correlations between data and enable the modeling of targets exhibiting abrupt changes or discontinuities. We analyse the resulting predictive uncertainty via the variably scaled kernel power function and clarify the relationship between constructions based on variably scaled kernels and classical non-stationary kernels. Numerical experiments demonstrate that Gaussian processes based on variably scaled kernels yield improved reconstruction accuracy and provide uncertainty estimates that reflect the underlying structure of the data.
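A variably scaled kernel pairs a standard kernel with a scaling function ψ, evaluating it on augmented points (x, ψ(x)); a discontinuous ψ pushes points on opposite sides of a jump apart, injecting non-stationarity. A minimal sketch of VSK-based Kriging prediction, where the Gaussian base kernel, jump location, and nugget are illustrative assumptions:

```python
import numpy as np

def vsk_gram(X, Y, psi, lengthscale=1.0):
    """Variably scaled kernel Gram matrix: embed each point x as (x, psi(x))
    and evaluate a standard stationary kernel (Gaussian here, as an assumed
    choice) in the augmented space."""
    Xa = np.hstack([X, psi(X)])       # augmented coordinates, shape (n, d+1)
    Ya = np.hstack([Y, psi(Y)])
    sq = ((Xa[:, None, :] - Ya[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * lengthscale ** 2))

# Scaling function tracking an assumed jump at x = 0.5.
psi = lambda X: np.where(X[:, :1] < 0.5, 0.0, 2.0)

rng = np.random.default_rng(2)
X = rng.uniform(size=(40, 1))
y = np.where(X[:, 0] < 0.5, 0.0, 1.0) + 0.01 * rng.normal(size=40)

# Kriging / GP posterior mean, with a small nugget for numerical stability.
K = vsk_gram(X, X, psi) + 1e-8 * np.eye(len(X))
Xs = np.linspace(0, 1, 5)[:, None]
pred = vsk_gram(Xs, X, psi) @ np.linalg.solve(K, y)
print(pred)
```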


Natural Value Approximators: Learning when to Trust Past Estimates

Neural Information Processing Systems

Neural networks have a smooth initial inductive bias, such that small changes in input do not lead to large changes in output. However, in reinforcement learning domains with sparse rewards, value functions have non-smooth structure with a characteristic asymmetric discontinuity whenever rewards arrive. We propose a mechanism that learns an interpolation between a direct value estimate and a projected value estimate computed from the encountered reward and the previous estimate. This reduces the need to learn about discontinuities, and thus improves the value function approximation. Furthermore, as the interpolation is learned and state-dependent, our method can deal with heterogeneous observability. We demonstrate that this one change leads to significant improvements on multiple Atari games, when applied to the state-of-the-art A3C algorithm.
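As a sketch of the mechanism, assuming the one-step recursion from the NVA paper: the projected estimate inverts the Bellman relation v(s_{t-1}) ≈ r_t + γ v(s_t), so near a reward-induced discontinuity no new function approximation is needed. The learned, state-dependent β network is replaced here by fixed per-step weights.

```python
import numpy as np

def natural_value_estimates(direct_v, rewards, betas, gamma=0.99):
    """Blend a direct value estimate with a projection of the previous
    blended estimate:
        v_tilde_t = beta_t * v_t + (1 - beta_t) * (v_tilde_{t-1} - r_t) / gamma
    In the paper beta is a learned, state-dependent output; here it is given.
    """
    v_tilde = np.empty_like(direct_v)
    v_tilde[0] = direct_v[0]                     # no history at the first step
    for t in range(1, len(direct_v)):
        projected = (v_tilde[t - 1] - rewards[t]) / gamma
        v_tilde[t] = betas[t] * direct_v[t] + (1.0 - betas[t]) * projected
    return v_tilde

# Toy rollout: a sparse reward at t = 3 causes the characteristic value jump.
direct_v = np.array([0.9, 0.95, 1.0, 0.0, 0.0])
rewards  = np.array([0.0, 0.0, 0.0, 1.0, 0.0])
betas    = np.full(5, 0.1)                       # mostly trust the projection
print(natural_value_estimates(direct_v, rewards, betas))
```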



MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation

Neural Information Processing Systems

We can see that this is enough time for two different painted arrows to pass under the car. If one zooms in, one can inspect the relative positions of the arrow and the Mercedes hood ornament in the real versus predicted frames.
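The snippet above is commentary on a qualitative figure; the unifying trick named in the title is the random masking of past and/or future conditioning frames during training, which lets one model serve prediction, generation, and interpolation. A toy sketch of that mask sampling, with the masking probability chosen arbitrarily:

```python
import numpy as np

def sample_conditioning_mask(rng, p_mask=0.5):
    """Independently decide whether past and future conditioning frames are
    masked (dropped). The four outcomes correspond to the tasks MCVD unifies:
      past kept,   future kept   -> interpolation
      past kept,   future masked -> prediction
      past masked, future kept   -> past reconstruction
      both masked                -> unconditional generation
    """
    mask_past = rng.random() < p_mask
    mask_future = rng.random() < p_mask
    return mask_past, mask_future

rng = np.random.default_rng(3)
for _ in range(4):
    print(sample_conditioning_mask(rng))
```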



On the Similarity between the Laplace and Neural Tangent Kernels

Neural Information Processing Systems

Finally, we provide experiments on real data comparing NTK and the Laplace kernel, along with a larger class of γ-exponential kernels. We show that these perform almost identically.
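The γ-exponential family interpolates between the Laplace kernel (γ = 1) and the Gaussian (γ = 2). A small sketch of how such a comparison can be run with kernel ridge regression; the bandwidth and regularization are assumptions, not the paper's experimental settings:

```python
import numpy as np

def gamma_exponential_gram(X, Y, gamma=1.0, lengthscale=1.0):
    """k(x, y) = exp(-(||x - y|| / lengthscale)^gamma).
    gamma = 1 recovers the Laplace kernel, gamma = 2 the Gaussian;
    the comparison against NTK sweeps this exponent."""
    dists = np.sqrt(((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1))
    return np.exp(-(dists / lengthscale) ** gamma)

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 5))
y = np.sign(X[:, 0])

# Kernel ridge regression with the Laplace member of the family.
K = gamma_exponential_gram(X, X, gamma=1.0)
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(X)), y)
print((np.sign(K @ alpha) == y).mean())  # training accuracy
```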


A New Neural Kernel Regime: The Inductive Bias of Multi-Task Learning

Neural Information Processing Systems

Remarkably, the solutions learned for each individual task resemble those obtained by solving a kernel regression problem, revealing a novel connection between neural networks and kernel methods.
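For reference, this is the kernel regression problem such per-task solutions are said to resemble; a minimal sketch with an assumed Gaussian kernel and ridge regularization, where letting the regularizer vanish gives the minimum-RKHS-norm interpolant:

```python
import numpy as np

def kernel_ridge_fit_predict(X_train, y_train, X_test, lam=1e-3, lengthscale=1.0):
    """Solve (K + lam * I) alpha = y, then predict with k(X_test, X_train) @ alpha."""
    def gram(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2.0 * lengthscale ** 2))

    alpha = np.linalg.solve(gram(X_train, X_train) + lam * np.eye(len(X_train)),
                            y_train)
    return gram(X_test, X_train) @ alpha

rng = np.random.default_rng(5)
X = rng.uniform(-1, 1, size=(50, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1]
print(kernel_ridge_fit_predict(X, y, X[:5]))  # predictions at the first 5 points
```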