AITopics | interpolation

The population dynamics of molecules, cells, and organisms are governed by a number of unknown forces. In the last decade, population dynamics have predominantly been modeled with Wasserstein gradient flows. However, since gradient flows minimize free energy, they fail to capture important dynamical properties, such as periodicity. In this work, we propose a change in perspective by considering dynamics that minimize a population-level action under a damped Wasserstein Lagrangian. By deriving the corresponding Hamiltonian equations of motion, we formalize Wasserstein Lagrangian Mechanics, a structured class of second-order dynamics that encompasses classical mechanics, quantum mechanics, and gradient flows. We then propose WLM as the first algorithm that learns these second-order dynamics from observed marginals, without specifying the Lagrangian. By directly learning the population mechanics, WLM can both forecast and interpolate unseen marginals, and outperforms existing gradient flow and flow matching methods across a wide range of dynamics, including vortex dynamics, embryonic development, and flocking.

artificial intelligence, machine learning, mechanics, (16 more...)

arXiv.org Machine Learning

2605.0855

Country:

North America > Canada (0.67)
North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

49ad23d1ec9fa4bd8d77d02681df5cfa-Supplemental.pdf

Neural Information Processing SystemsMay-1-2026, 02:25:11 GMT

artificial intelligence, imagenet 128 128, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Null Measurability at the Symmetrization Interface in VC Learning

Gupta, Dhruv

arXiv.org Machine LearningApr-29-2026

Recent work revisiting measurability in the fundamental theorem of statistical learning imposes Borel measurability of ghost-gap suprema. We show that, at the one-sided ghost-gap interface actually used by the standard symmetrization proof, this requirement is stronger than necessary. For any Borel-parameterized concept class on a Polish domain, the bad event "there exists a hypothesis whose ghost empirical error exceeds its training empirical error by at least ε/2" is analytic. By Choquet capacitability, it is therefore measurable in the completion of every finite Borel measure. We then construct a concept class whose bad event is null-measurable but not Borel, giving a strict separation from the Borel supremum condition. Finally, we prove closure under patching, fixed and countable interpolation, and fiber-product amalgamation, showing that the weaker regularity level is stable under natural concept-class constructors. In the realizable setting, where targets belong to the class and are measurable, these results weaken the measurability hypothesis needed by the symmetrization route from finite VC dimension to PAC learnability. The main results and the descriptive-set-theoretic infrastructure used by them are formalized in Lean 4.

artificial intelligence, bad event, machine learning, (17 more...)

arXiv.org Machine Learning

2604.25028

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)

Add feedback

Learning Curves and Benign Overfitting of Spectral Algorithms in Large Dimensions

Lu, Weihao, Lin, Qian, Xia, Yingcun, Huang, Dongming

arXiv.org Machine LearningApr-28-2026

Existing large-dimensional theory for spectral algorithms resolves either the optimally tuned point or the interpolation limit, but leaves the under-regularized regime unexplored. We study the learning curve and benign overfitting of spectral algorithms in the largedimensional setting where the sample size and dimension are of comparable order, i.e., n dγ for some γ > 0. We first consider inner-product kernels on the sphere Sd 1 and establish a sharp asymptotic characterization of the excess risk across the full regularization path under various source conditions s 0, where smeasures the relative smoothness of the regression function. Our results reveal that the learning curve is not simply U-shaped but instead consists of three distinct regimes: over-regularized, under-regularized, and interpolation regimes. This characterization allows us to fully capture the benign overfitting phenomenon, demonstrating that benign overfitting arises consistently across both the under-regularized and interpolation regimes whenever sis positive but no larger than a critical threshold. We further show that, in the sufficiently regularized regime, the kernel learning curve is recovered by an associated sequence model. Finally, we extend the learning-curve analysis to large-dimensional KRR for a class of kernels on general domains in Rd whose low-degree eigenspaces satisfy spectral-scaling and hyper-contractivity conditions. Keywords: Spectral algorithms, learning curves, high dimension, benign overfitting. 1 Introduction Nonparametric regression studies the estimation of an unknown function f: Rd R from ni.i.d.

artificial intelligence, learning curve, machine learning, (18 more...)

arXiv.org Machine Learning

2604.23212

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Add feedback

Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution

Neural Information Processing SystemsApr-27-2026, 03:01:31 GMT

The proposed SPS comes with strong convergence guarantees and competitive performance; however, it has two main drawbacks when it is used in non-over-parameterized regimes: (i) It requires a priori knowledge of the optimal mini-batch losses, which are not available when the interpolation condition is not satisfied (e.g., regularized objectives), and (ii) it guarantees convergence only to a neighborhood of the solution. In this work, we study the dynamics and the convergence properties of SGD equipped with new variants of the stochastic Polyak stepsize and provide solutions to both drawbacks of the original SPS. We first show that a simple modification of the original SPS that uses lower bounds instead of the optimal function values can directly solve issue (i). On the other hand, solving issue (ii) turns out to be more challenging and leads us to valuable insights into the method's behavior. We show that if interpolation is not satisfied, the correlation between SPS and stochastic gradients introduces a bias, which effectively distorts the expectation of the gradient signal near minimizers, leading to non-convergence - even if the stepsize is scaled down during training. To fix this issue, we propose DecSPS, a novel modification of SPS, which guarantees convergence to the exact minimizer - without a priori knowledge of the problem parameters. For strongly-convex optimization problems, DecSPS is the first stochastic adaptive optimization method that converges to the exact solution without restrictive assumptions like bounded iterates/gradients.

artificial intelligence, decsps, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Add feedback

2d025936bae21d2c2d4cc74779aa77c7-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-26-2026, 10:51:03 GMT

artificial intelligence, machine learning, protein, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
Asia > China (0.14)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

5812f92450ccaf17275500841c70924a-Paper.pdf

Neural Information Processing SystemsApr-26-2026, 00:44:01 GMT

communication, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Add feedback

497476fe61816251905e8baafdf54c23-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 17:47:15 GMT

architecture, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

scaleKernelMatrix

Neural Information Processing SystemsApr-25-2026, 14:22:12 GMT

Kernel matrix-vector multiplication (KMVM) is one of the most important operations needed in scientific computing with core applications indiffeomorphic registration, geometric learning [11], [31],numerical analysis [28],fluid dynamics [6],and machine learning [27].

artificial intelligence, interaction, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback