AITopics | formula

We connect stochastic resetting from non-equilibrium statistical physics with ridge regularization in statistical learning. For linear gradient flow, resetting to the origin at rate $r$ produces stationary mean $(X^\top X+rI)^{-1}X^\top y$, exactly the ridge estimator with penalty $λ=r$. This uses the known Laplace-transform relationship between ridge regression and exponential-time averaging of gradient flow, with the exponential time now interpreted as the stationary age associated with Poisson resetting. We then extend this identity to general renewal reset laws: the exponential reset time distribution is the unique renewal law whose stationary mean reproduces scalar ridge in every eigendirection as an exact filter identity for every positive curvature, while non-exponential renewal laws generate alternative spectral filters. At the fluctuation level, we study a separate additive Ornstein-Uhlenbeck extension with constant diffusion, interpreted as a stylized SGD approximation. In this setting, the equality holds only at the level of the mean, since the reset process has a nonzero stationary covariance from accumulated OU noise and reset-timing variance, whereas deterministic ridge is a fixed estimator with the same center. Stylized experiments compare the deterministic renewal-induced filters directly and illustrate when filters induced by non-exponential reset-time laws can differ predictively from ridge. The results for the stationary mean and the induced spectral filters are established for continuous-time gradient flow with isotropic resetting on quadratic objectives; the covariance and risk formulas additionally assume additive noise with state-independent covariance.
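The mean-level identity in this abstract can be checked numerically in a single eigendirection. The sketch below is illustrative and is not taken from the paper. It assumes the loss is $\tfrac12\|X\theta-y\|^2$ (no $1/n$ factor) and that gradient flow starts at the origin, so that in an eigendirection of $X^\top X$ with eigenvalue $s>0$ the flow applies the filter $(1-e^{-st})/s$ at time $t$. Averaging that filter over an exponential age with rate $r$ should then give the ridge filter $1/(s+r)$. The function names and the trapezoid-rule settings are hypothetical choices made for this example.

```python
import math


def gradient_flow_filter(s, t):
    """Scalar gradient-flow filter at time t for curvature s > 0, started at the origin."""
    return (1.0 - math.exp(-s * t)) / s


def exponential_age_average(s, r, n_steps=100_000, horizon=40.0):
    """Trapezoid-rule estimate of E[filter(s, T)] where T ~ Exponential(rate=r).

    The integral over [0, inf) is truncated at horizon / r, where the
    exponential weight exp(-horizon) is negligible.
    """
    t_max = horizon / r
    h = t_max / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        t = k * h
        g = r * math.exp(-r * t) * gradient_flow_filter(s, t)
        # Endpoints get half weight in the trapezoid rule.
        total += 0.5 * g if k in (0, n_steps) else g
    return h * total


def ridge_filter(s, lam):
    """Ridge spectral filter 1 / (s + lambda) for curvature s."""
    return 1.0 / (s + lam)


if __name__ == "__main__":
    r = 0.5  # reset rate, playing the role of the ridge penalty
    for s in (0.1, 1.0, 10.0):
        averaged = exponential_age_average(s, r)
        print(f"s={s}: averaged flow={averaged:.8f}, ridge={ridge_filter(s, r):.8f}")
```

Replacing the exponential weight `r * math.exp(-r * t)` with the stationary age density of another renewal law would, per the abstract, yield a different spectral filter rather than ridge.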

artificial intelligence, machine learning, noise, (19 more...)

arXiv.org Machine Learning

2605.30059

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)


On the Sample Complexity of Robust Binary Hypothesis Testing

Vallinayagam, Shankar, Pensia, Ankit, Jog, Varun

arXiv.org Machine Learning | May-26-2026

We study the sample complexity of robust binary hypothesis testing under three standard contamination models: $\varepsilon$-additive (Huber), $\varepsilon$-subtractive, and $\varepsilon$-total variation (TV), denoted by $n^*_{\mathrm{Hub}}(\varepsilon)$, $n^*_{\mathrm{Sub}}(\varepsilon)$, and $n^*_{\mathrm{TV}}(\varepsilon)$, respectively. For subtractive contamination, we show that least favourable distributions exist and provide explicit formulas for the same, bringing this model in line with the classical Huber and TV models. Next we show that in all three models, sample complexity may be highly unstable in the contamination parameter $\varepsilon$, increasing by polynomial factors even for $o(\varepsilon)$ perturbations. Similarly, there may be polynomial factor gaps between the sample complexities when $\varepsilon$ is known exactly versus when it is known up to $o(\varepsilon)$ error. Despite the instability of the sample complexity in all models, we show that the sample complexities across models are comparable up to constant-factor rescaling of $\varepsilon$. Specifically, for any fixed $δ_0>0$, the following hold for all distributions $p$ and $q$: (i) $n^*_{\mathrm{Hub}}(\varepsilon) \lesssim n^*_{\mathrm{TV}}(\varepsilon) \lesssim n^*_{\mathrm{Hub}}(2\varepsilon)$, (ii) $n^*_{\mathrm{Sub}}(\varepsilon) \lesssim n^*_{\mathrm{TV}}(\varepsilon) \lesssim n^*_{\mathrm{Sub}}((2+δ_0)\varepsilon)$, and (iii) $n^*_{\mathrm{Sub}}(\varepsilon) \lesssim n^*_{\mathrm{Hub}}(\varepsilon) \lesssim n^*_{\mathrm{Sub}}((1+δ_0)\varepsilon)$, and the scaling constants are tight. Finally, we extend our results to adaptive versions of the contamination models.

artificial intelligence, contamination, scientific discovery, (16 more...)

arXiv.org Machine Learning

2605.24741

Country: North America > United States (0.45)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.62)


It's the Great Fear of Our Time. I'm Mathematically Sure It Won't Happen.

Slate | May-25-2026, 09:45:00 GMT

The individual pieces create a kind of illusion. When a horse trots, is there a moment when its four feet are in the air simultaneously? In the 1870s, Leland Stanford, the railroad magnate and benefactor of the university that bears his name, funded an effort to find out. The answer shocked many equestrian experts and artists: The horse's feet leave the ground together, but not when outstretched as commonly depicted in paintings and carousels; the feet do so when they reach inward, toward the horse's belly. Surprisingly, this discovery about a horse's gait sheds light on a much more modern debate--whether A.I. is on a path to consciousness.

artificial intelligence, machine learning, natural language, (15 more...)

Slate

Country: North America > United States (0.29)

Industry:

Marketing (1.00)
Media (0.71)

Technology:

Information Technology > Communications > Social Media (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.37)


Calibeating for general proper losses: A Bregman divergence approach

Fichtl, Maximilian, Guzmán, Cristóbal, Mehta, Nishant A.

arXiv.org Machine Learning | May-19-2026

This work introduces a general framework for calibeating based on regret minimization. As compared to Foster and Hart's seminal calibeating work which had specialized treatments of Brier score (squared loss) and log loss, we consider a large family of proper losses that includes $α$-Tsallis losses (for $α\in [1, 2]$) and Lipschitz losses. Our results for Tsallis losses also hold for an unscaled version of Tsallis loss that recovers log loss. Our analysis is oriented around the Bregman divergence view of a proper loss. Technically, our results for the family of Tsallis losses that we consider are U-calibration results, simultaneously obtaining logarithmic regret for all losses in this family while having a weaker dependence on the dimension compared to previous results. Of potential independent interest, we also show a new regret equality for the regret of Be The Regularized Leader. This regret equality holds for general proper losses and itself is based on two results related to online updating formulas for the generalized variance, the latter being a previously introduced generalization of variance based on Bregman divergences.

artificial intelligence, machine learning, proper loss, (16 more...)

arXiv.org Machine Learning

2605.17269

Genre: Research Report (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


A Geometry-Aware Residual Correction of Hagan's SABR Implied Volatility Formula

Reghai, Adil, Tarsissi, Lama, Biau, Gérard, Lipton, Alex

arXiv.org Machine Learning | May-8-2026

This paper proposes a hybrid methodology to improve the approximation of SABR (Stochastic Alpha Beta Rho) implied volatility by combining analytical structure with machine learning. The approach augments the neural-network input representation with geometric features derived from the stochastic differential equations of the SABR model. Unlike approaches that fully replace analytical formulas with black-box models, the proposed framework preserves the analytical backbone of the model. The hybridization operates along two complementary dimensions. First, geometry-aware variables reflecting intrinsic properties of the SABR dynamics are used as structured inputs to the network. Second, the neural network is trained to learn the residual error relative to Hagan's closed-form approximation rather than implied volatility directly. The resulting model acts as a structured residual correction to the analytical formula, retaining interpretability while capturing higher-order effects that are not included in the asymptotic expansion. Numerical experiments conducted over realistic parameter domains, as well as stressed environments, show that the method improves accuracy and robustness compared with both analytical approximations and standard neural-network approaches. Because the correction remains lightweight and structurally consistent with the underlying model, the framework is well suited for real-time pricing and calibration in practical trading environments.

artificial intelligence, machine learning, volatility, (18 more...)

arXiv.org Machine Learning

2605.06604

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


DeepMath - Deep Sequence Models for Premise Selection

Geoffrey Irving, Christian Szegedy, Alexander A. Alemi, Niklas Een, Francois Chollet, Josef Urban

Neural Information Processing Systems | Apr-30-2026, 23:08:19 GMT

We study the effectiveness of neural sequence models for premise selection in automated theorem proving, one of the main bottlenecks in the formalization of mathematics. We propose a two-stage approach that yields good results for premise selection on the Mizar corpus while avoiding the hand-engineered features of existing state-of-the-art models. To our knowledge, this is the first time deep learning has been applied to theorem proving on a large scale.

conjecture, logic & formal reasoning, machine learning, (19 more...)

Neural Information Processing Systems

Country: Europe (0.68)

Genre:

Instructional Material (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)


fe1ab2f77a9a0f224839cc9f1034a908-Paper-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 10:24:16 GMT


fce176458ff542940fa3ed16e6f9c852-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 09:55:34 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Sports (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


f5ccb3ab757131a93586ef61ec701533-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 08:09:14 GMT

In this section, we compare the symmetric solutions found in erf [2] and ReLU networks [5] to our one-neuron solution ($n = 1$). The main difference is that both earlier studies constrain the search space to the symmetric subspace, whereas we first prove in Theorem 5.1 that the non-trivial critical points are contained in this subspace for a broad class of activation functions, including erf and ReLU. Solving the low-dimensional loss, we recover the same solution for ReLU and erf as in [2, 5] for unit-orthonormal teachers.

artificial intelligence, critical point, machine learning, (19 more...)

Neural Information Processing Systems

Technology: