AITopics

2605.1234

Country: Asia (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.67)
Education > Educational Setting > Online (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)

Saha, Anan, Ganguly, Arnab

Learning stochastic multiscale models through normalizing flows

arXiv.org Machine Learning May-12-2026

Many systems in physics, engineering, and biology exhibit multiscale stochastic dynamics, where low-dimensional slow variables evolve under the influence of high-dimensional fast processes. In practice, observations are often limited to a single trajectory of the slow component, while the fast dynamics remain unobserved, making statistical learning challenging. Approaches based on partial differential equations (PDE), such as Fokker-Planck formulations, aim to characterize the evolution of probability densities, typically requiring dense space-time data or grid-based solvers. In contrast, we adopt a trajectory-based perspective and develop a data-driven framework for learning effective stochastic dynamics from a single observed path. We model the dynamics by coupled multiscale stochastic differential equations (SDEs) and first obtain a principled model reduction through stochastic averaging. Unlike generic model reduction techniques such as PCA, this respects the dynamical structure of the original system and explicitly incorporates the interaction between slow and fast scales. A central challenge, however, is that the reduced model depends on the invariant distribution of the fast process, which is a solution to an intractable and often unknown PDE. We introduce a novel learning framework that parameterizes the invariant distribution using normalizing flows, enabling expressive density modeling in the latent fast-variable space. The flow is trained end-to-end by optimizing a penalized likelihood objective induced by the reduced stochastic dynamics. Furthermore, we develop a Bayesian variational inference procedure for uncertainty quantification, employing a second normalizing flow to approximate the posterior distribution over model parameters. This yields a scalable approach to capturing epistemic uncertainty in multiscale systems.
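The stochastic averaging step described above can be illustrated on a toy slow-fast system. This is a minimal sketch and not the paper's method: the toy system, the function names `slow_drift` and `averaged_drift`, and the choice of an Ornstein-Uhlenbeck fast process with known invariant law N(0, 1) are all assumptions made for illustration. In the paper's setting the invariant distribution is unknown and is modeled with a normalizing flow; here it is simply sampled directly.

```python
import random


def slow_drift(x, y):
    """Hypothetical slow drift b(x, y) = y^2 - x of a toy system (not from the paper)."""
    return y * y - x


def averaged_drift(x, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the averaged drift b_bar(x) = E_{y ~ pi}[b(x, y)].

    Assumption: pi is N(0, 1), the invariant law of the toy fast process
    dY = -(1/eps) Y dt + sqrt(2/eps) dW. A learned density model would
    replace this direct sampling step.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = rng.gauss(0.0, 1.0)  # draw from the assumed invariant law
        total += slow_drift(x, y)
    return total / n_samples
```

For this toy choice E[Y^2] = 1, so the averaged drift is approximately 1 - x and the reduced slow dynamics would be dX = (1 - X) dt.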

artificial intelligence, bayesian inference, machine learning

2605.09718

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Kratsios, Anastasis, Cousins, Gregory, Borde, Haitz Sáez de Ocáriz, Kim, Bum Jun, Brugiapaglia, Simone

Every Feedforward Neural Network Definable in an o-Minimal Structure Has Finite Sample Complexity

arXiv.org Machine Learning May-11-2026

We show that, in a precise sense, a broad class of feedforward neural networks learn (have finite sample complexity) in the PAC model: every fixed finite feedforward architecture whose layers are definable in an o-minimal structure has finite sample complexity in the agnostic PAC setting, even with unbounded parameters. This covers standard fixed-size MLPs, CNNs, GNNs, and transformers with fixed sequence length, together with the operations and layers typically used in such architectures, including linear projections, residual connections, attention mechanisms, pooling layers, normalization layers, and admissible positional encodings. Hence, distribution-free learnability for modern non-recurrent architectures is not an exceptional property of particular activations or architecture-specific VC arguments, but a consequence of tame feedforward computation. Our results reposition finite-sample PAC learnability as a baseline rather than a differentiator: they shift the focus of architectural comparison toward inductive biases, symmetries and geometric priors, scalability, and optimization behaviour.

artificial intelligence, def, machine learning

2605.07097

Country:

North America > Canada (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Asia > Japan > Honshū (0.28)
North America > United States (0.28)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kratsios, Anastasis, Neuman, A. Martina, Petersen, Philipp

Adaptivity Under Realizability Constraints: Comparing In-Context and Agentic Learning

arXiv.org Machine Learning May-7-2026

We compare in-context learning with fixed queries and agentic learning with adaptive queries for uniform approximation of task families. We consider two settings: an unrestricted regime, where querying and approximation are arbitrary functions, and a realizable regime, where we require these operations to be implemented by ReLU neural networks. In both settings, adaptivity never hinders approximation performance. However, this advantage can change when one passes from the unrestricted regime to the realizable regime. We identify four distinct approximation scenarios, each witnessed by an explicit task family: (a) no advantage of adaptivity; (b) an advantage in the unrestricted regime that persists under ReLU realizability; (c) an advantage that arises only under realizability; and (d) an advantage that disappears under realizability. This demonstrates that representational constraints interact profoundly with the effect of adaptivity.

learner, machine learning, natural language

2605.04995

Country:

Europe > Austria (0.28)
North America > Canada (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Machine Learning May-6-2026

Adaptive Confidence Intervals in Efron's Gaussian Two-Groups Model

Wang, Qiaosen, Chai, Shuwen, Gao, Chao

Robust uncertainty quantification is increasingly important in modern data analysis and is often formalized under Huber's model, which allows an $\varepsilon$-fraction of arbitrary corruptions. In many experimental sciences, however, the measurement protocol is well controlled, and contamination is more plausibly introduced upstream. Motivated by this noise-oblivious nature of adversaries, we study confidence intervals for the null location parameter $θ$ in Efron's Gaussian two-groups model, where an unknown fraction $\varepsilon$ of observations have arbitrarily shifted means, but all samples share the same law of additive Gaussian measurement noise with variance $σ^2$. We characterize the minimax-optimal length among confidence intervals with a prescribed coverage level uniformly over the unknown contamination proportion and all noise-oblivious adversaries. Although prior work has shown that the minimax point estimation rate of $θ$ does not deteriorate when $\varepsilon$ becomes unknown, our results reveal that, with a given $σ^2$, the minimax-optimal length of confidence intervals that are adaptive to unknown $\varepsilon$ is of order $σ(n^{-1/4}+\varepsilon^{1/2}/\max\{1, \log(en \varepsilon^2)\}^{1/2})$, which is polynomially worse than the optimal length when $\varepsilon$ is known. When the variance $σ^2$ is also unknown, we show a further degradation: no adaptive confidence interval can be shorter than $Ω(σn^{-1/8})$. Algorithmically, we introduce a Fourier-based certification procedure built on Carathéodory's positive-semidefiniteness constraints. By scanning candidate points and accepting those whose residual characteristic function is certifiably consistent with a Gaussian location mixture, our algorithm attains the minimax lower bound in the known-variance setting and is computable in polynomial time.
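To get a feel for the two rates quoted in the abstract, they can be evaluated numerically. The sketch below only computes the stated orders with all constants dropped; the function names are hypothetical, and treating the logarithmic factor as 1 when $\varepsilon = 0$ is an assumption made here so that the expression is defined.

```python
import math


def known_variance_ci_rate(n, eps, sigma=1.0):
    """Order of the adaptive CI length with known variance (constants omitted):
    sigma * (n^(-1/4) + eps^(1/2) / max{1, log(e * n * eps^2)}^(1/2)).
    """
    arg = math.e * n * eps ** 2
    # Assumption: for eps = 0 the log is -inf, so the max falls back to 1.
    log_term = math.log(arg) if arg > 0 else float("-inf")
    denom = math.sqrt(max(1.0, log_term))
    return sigma * (n ** -0.25 + math.sqrt(eps) / denom)


def unknown_variance_lower_rate(n, sigma=1.0):
    """Order of the lower bound with unknown variance: sigma * n^(-1/8)."""
    return sigma * n ** -0.125
```

With `n = 10_000` and `sigma = 1`, the first term alone is about 0.1, and a contamination level of `eps = 0.01` adds a second term of about the same size.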

artificial intelligence, characteristic function, confidence interval

2604.26992

Country: North America > United States (0.45)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.45)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Neural Information Processing Systems May-1-2026, 01:42:19 GMT

Integral Probability Metrics PAC-Bayes Bounds

We present a PAC-Bayes-style generalization bound which enables the replacement of the KL-divergence with a variety of Integral Probability Metrics (IPM). We provide instances of this bound with the IPM being the total variation metric and the Wasserstein distance. A notable feature of the obtained bounds is that they naturally interpolate between classical uniform convergence bounds in the worst case (when the prior and posterior are far away from each other), and improved bounds in favorable cases (when the posterior and prior are close). This illustrates the possibility of reinforcing classical generalization bounds with algorithm- and data-dependent components, thus making them more suitable to analyze algorithms that use a large hypothesis space.
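For context, the two standard objects the abstract refers to can be written out. The block below is background only and does not reproduce the paper's bound: it gives the general definition of an IPM over a function class $\mathcal{F}$, and one commonly cited KL-based PAC-Bayes bound for losses in $[0,1]$ (notation such as $\pi$ for the prior, $\rho$ for the posterior, and $\hat{L}_n$ for the empirical risk is chosen here for illustration).

```latex
% Integral Probability Metric indexed by a function class F
d_{\mathcal{F}}(P, Q) \;=\; \sup_{f \in \mathcal{F}}
  \bigl|\, \mathbb{E}_{X \sim P}[f(X)] - \mathbb{E}_{X \sim Q}[f(X)] \,\bigr|

% Classical KL-based PAC-Bayes bound (one common form, n >= 8):
% with probability at least 1 - \delta over an i.i.d. sample of size n,
% simultaneously for all posteriors \rho,
\mathbb{E}_{h \sim \rho}\bigl[L(h)\bigr]
  \;\le\; \mathbb{E}_{h \sim \rho}\bigl[\hat{L}_n(h)\bigr]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

Taking $\mathcal{F}$ to be the 1-Lipschitz functions gives the Wasserstein-1 distance, and taking bounded functions gives the total variation metric up to a normalization convention.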

artificial intelligence, inequality, machine learning

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing Systems Apr-30-2026, 13:56:21 GMT

e17fe6fe9990fffb637b42c98c005515-Paper-Conference.pdf

data mining, machine learning, natural language

Country: North America > United States (1.00)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.92)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (0.92)
Government > Regional Government > North America Government > United States Government (0.45)
Education > Educational Setting (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Neural Information Processing Systems Apr-29-2026, 10:08:28 GMT

adc98a266f45005c403b8311ca7e8bd7-Supplemental-Conference.pdf

artificial intelligence, machine learning, probability

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.50)

In addition to the work on noisy convex optimization, the current paper is also thematically related to works in learning theory and complexity where the goal is to reconstruct simple classes of functions under outlier noise. This includes work on reconstruction of low-degree polynomials [4, 14, 15]. In particular, [15] gave an efficient algorithm whose error tolerance matches the information-theoretic limits. More recently, [9] achieved similar algorithmic guarantees for functions that are sparse in the Fourier space. While similar in spirit, the model in these works differs from that of the current paper in one crucial way: while we only bound the volume of the outlier locations, they additionally assume that the outlier locations are uniformly distributed in the domain. At a more technical level, the results in [4, 14, 15, 9] crucially rely on techniques originating from coding theory such as the Goldreich-Levin theorem [13] and the Berlekamp-Welch algorithm [6].

artificial intelligence, machine learning, probability

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.34)