AITopics

2512.05267

Country:

Europe (0.46)
Asia > Middle East (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Nelsen, Nicholas H., Yang, Yunan

Operator learning meets inverse problems: A probabilistic perspective

arXiv.org Artificial Intelligence, Dec-8-2025

Operator learning offers a robust framework for approximating mappings between infinite-dimensional function spaces. It has also become a powerful tool for solving inverse problems in the computational sciences. This chapter surveys methodological and theoretical developments at the intersection of operator learning and inverse problems. It begins by summarizing the probabilistic and deterministic approaches to inverse problems, and pays special attention to emerging measure-centric formulations that treat observed data or unknown parameters as probability distributions. The discussion then turns to operator learning by covering essential components such as data generation, loss functions, and widely used architectures for representing function-to-function maps. The core of the chapter centers on the end-to-end inverse operator learning paradigm, which aims to directly map observed data to the solution of the inverse problem without requiring explicit knowledge of the forward map. It highlights the unique challenge that noise plays in this data-driven inversion setting, presents structure-aware architectures for both point predictions and posterior estimates, and surveys relevant theory for linear and nonlinear inverse problems. The chapter also discusses the estimation of priors and regularizers, where operator learning is used more selectively within classical inversion algorithms.

artificial intelligence, inverse problem, machine learning, (21 more...)

2508.20207

Country: North America > United States > Texas (0.27)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Scientific Computing (1.00)
Information Technology > Mathematics of Computing (1.00)
Information Technology > Data Science (1.00)
(6 more...)

Howard, Samuel, Nüsken, Nikolas, Pidstrigach, Jakiw

Control Consistency Losses for Diffusion Bridges

arXiv.org Machine Learning, Dec-5-2025

Simulating the conditioned dynamics of diffusion processes, given their initial and terminal states, is an important but challenging problem in the sciences. The difficulty is particularly pronounced for rare events, for which the unconditioned dynamics rarely reach the terminal state. In this work, we leverage a self-consistency property of the conditioned dynamics to learn the diffusion bridge in an iterative online manner, and demonstrate promising empirical results in a range of settings.

bridge, self-consistency property, trajectory, (16 more...)

2512.0507

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

arXiv.org Machine Learning, Dec-5-2025

Learning Causality for Longitudinal Data

Bouchattaoui, Mouad EL

This thesis develops methods for causal inference and causal representation learning (CRL) in high-dimensional, time-varying data. The first contribution introduces the Causal Dynamic Variational Autoencoder (CDVAE), a model for estimating Individual Treatment Effects (ITEs) by capturing unobserved heterogeneity in treatment response driven by latent risk factors that affect only outcomes. CDVAE comes with theoretical guarantees on valid latent adjustment and generalization bounds for ITE error. Experiments on synthetic and real datasets show that CDVAE outperforms baselines, and that state-of-the-art models greatly improve when augmented with its latent substitutes, approaching oracle performance without access to true adjustment variables. The second contribution proposes an efficient framework for long-term counterfactual regression based on RNNs enhanced with Contrastive Predictive Coding (CPC) and InfoMax. It captures long-range dependencies under time-varying confounding while avoiding the computational cost of transformers, achieving state-of-the-art results and introducing CPC into causal inference. The third contribution advances CRL by addressing how latent causes manifest in observed variables. We introduce a model-agnostic interpretability layer based on the geometry of the decoder Jacobian. A sparse self-expression prior induces modular, possibly overlapping groups of observed features aligned with shared latent influences. We provide recovery guarantees in both disjoint and overlapping settings and show that meaningful latent-to-observed structure can be recovered without anchor features or single-parent assumptions. Scalable Jacobian-based regularization techniques are also developed.

artificial intelligence organization, lower-dimensional representation, unit treatment value assumption, (15 more...)

2512.0498

Country:

North America > United States > California > San Francisco County > San Francisco (0.13)
Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
(2 more...)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Epidemiology (1.00)
(3 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
(8 more...)

Amortized Inference of Multi-Modal Posteriors using Likelihood-Weighted Normalizing Flows

Baruah, Rajneil

Across diverse domains--from complex systems and finance to high-energy physics and astrophysics--scientific inquiry often relies on deriving theoretical parameters from observational data [1]. At the core of this challenge lies the inverse problem: inferring the posterior distribution of theoretical parameters given a set of observables [2]. Traditional approaches for posterior estimation rely on sampling algorithms such as Markov Chain Monte Carlo (MCMC) [3, 4] and Nested Sampling (NS) [5]. In astrophysics and cosmology, implementations like emcee [6] and dynesty [7] have become standard tools. While these frameworks are statistically robust, they suffer significantly from the curse of dimensionality. In real-world scenarios, where the parameter space is high-dimensional and the likelihood function relies on computationally expensive simulators (e.g., in particle physics phenomenology [8]), convergence can take weeks or even months. Recent advances in machine learning have introduced Normalizing Flows (NFs) as a powerful alternative for probabilistic modelling [9, 10]. By learning a bijective mapping between a simple base distribution (e.g., a Gaussian) and the complex target distribution, NFs allow for exact density estimation and efficient sampling [11] from the target distribution. Modern architectures, such as RealNVP [12] and Neural Spline Flows [13], offer enough expressivity to model highly complex distributions.
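The change-of-variables idea behind normalizing flows can be illustrated with a minimal sketch. It assumes the simplest possible bijection, a one-dimensional affine map with hand-picked `scale` and `shift`; this is not the likelihood-weighted method of the paper, and all function names here are hypothetical. The log-density of the flow is the base log-density at the inverted point minus the log of the absolute Jacobian, and sampling pushes base draws through the forward map.

```python
import math
import random


def base_log_prob(z):
    """Log-density of the standard normal base distribution."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def forward(z, scale, shift):
    """Bijective map from a base sample z to a target sample x (scale != 0)."""
    return scale * z + shift


def inverse(x, scale, shift):
    """Inverse of the affine map, from target space back to base space."""
    return (x - shift) / scale


def flow_log_prob(x, scale, shift):
    """Exact density via change of variables: log p_z(f^-1(x)) - log|df/dz|."""
    z = inverse(x, scale, shift)
    return base_log_prob(z) - math.log(abs(scale))


def flow_sample(n, scale, shift, seed=0):
    """Draw n samples by pushing base samples through the forward map."""
    rng = random.Random(seed)
    return [forward(rng.gauss(0.0, 1.0), scale, shift) for _ in range(n)]


def gaussian_log_prob(x, mean, std):
    """Closed-form log-density of N(mean, std^2), used only as a reference."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))


# Illustrative parameters (assumed, not taken from the paper).
scale, shift = 2.0, 1.0
grid = [-3.0, -1.0, 1.0, 3.0, 5.0]
max_abs_error = max(
    abs(flow_log_prob(x, scale, shift) - gaussian_log_prob(x, shift, scale))
    for x in grid
)
samples = flow_sample(10_000, scale, shift)
sample_mean = sum(samples) / len(samples)
```

Architectures such as RealNVP or Neural Spline Flows replace the affine map with stacked learned bijections, but the density evaluation follows the same formula with the Jacobian term summed over layers.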

artificial intelligence, base distribution, machine learning, (16 more...)

2512.04954

Country: Asia > India (0.47)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Score Matching for Estimating Finite Point Processes

Cao, Haoqun, Zhang, Yixuan, Zhou, Feng

Score matching estimators have garnered significant attention in recent years because they eliminate the need to compute normalizing constants, thereby mitigating the computational challenges associated with maximum likelihood estimation (MLE). While several studies have proposed score matching estimators for point processes, this work highlights the limitations of these existing methods, which stem primarily from the lack of a mathematically rigorous analysis of how score matching behaves on finite point processes -- special random configurations on bounded spaces where many of the usual assumptions and properties of score matching no longer hold. To this end, we develop a formal framework for score matching on finite point processes via Janossy measures and, within this framework, introduce an (autoregressive) weighted score-matching estimator, whose statistical properties we analyze in classical parametric settings. For general nonparametric (e.g., deep) point process models, we show that score matching alone does not uniquely identify the ground-truth distribution due to subtle normalization issues, and we propose a simple survival-classification augmentation that yields a complete, integration-free training objective for any intensity-based point process model in the spatio-temporal case. Experiments on synthetic and real-world temporal and spatio-temporal datasets demonstrate that our method accurately recovers intensities and achieves performance comparable to MLE with better efficiency.

artificial intelligence, machine learning, point process, (15 more...)

2512.04617

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Bridging Probabilistic Inference and Behavior Trees: An Interactive Framework for Adaptive Multi-Robot Cooperation

Wang, Chaoran, Sun, Jingyuan, Zhang, Yanhui, Wu, Changju

This paper proposes an Interactive Inference Behavior Tree (IIBT) framework that integrates behavior trees (BTs) with active inference under the free energy principle for distributed multi-robot decision-making. The proposed IIBT node extends conventional BTs with probabilistic reasoning, enabling online joint planning and execution across multiple robots. It remains fully compatible with standard BT architectures, allowing seamless integration into existing multi-robot control systems. Within this framework, multi-robot cooperation is formulated as a free-energy minimization process, where each robot dynamically updates its preference matrix based on perceptual inputs and peer intentions, thereby achieving adaptive coordination in partially observable and dynamic environments. The proposed approach is validated through both simulation and real-world experiments, including a multi-robot maze navigation task and a collaborative manipulation task, with comparisons against traditional BTs (https://youtu.be/KX_oT3IDTf4). Experimental results demonstrate that the IIBT framework reduces BT node complexity by over 70%, while maintaining robust, interpretable, and adaptive cooperative behavior under environmental uncertainty.

artificial intelligence, bayesian inference, robot, (18 more...)

2512.04404

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Practical Global and Local Bounds in Gaussian Process Regression via Chaining

Liu, Junyi, Kok, Stanley

Gaussian process regression (GPR) is a popular nonparametric Bayesian method that provides predictive uncertainty estimates and is widely used in safety-critical applications. While prior research has introduced various uncertainty bounds, most existing approaches require access to specific input features, and rely on posterior mean and variance estimates or the tuning of hyperparameters. These limitations hinder robustness and fail to capture the model's global behavior in expectation. To address these limitations, we propose a chaining-based framework for estimating upper and lower bounds on the expected extreme values over unseen data, without requiring access to specific input features. We provide kernel-specific refinements for commonly used kernels such as RBF and Matérn, in which our bounds are tighter than generic constructions. We further improve numerical tightness by avoiding analytical relaxations. In addition to global estimation, we also develop a novel method for local uncertainty quantification at specified inputs. This approach leverages chaining geometry through partition diameters, adapting to local structures without relying on posterior variance scaling. Our experimental results validate the theoretical findings and demonstrate that our method outperforms existing approaches on both synthetic and real-world datasets.

artificial intelligence, exp, machine learning, (16 more...)

2511.09144

Country: North America > United States (0.68)

Genre: Research Report > Experimental Study (0.69)

Technology:

Information Technology > Software (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Gladin, Egor, Kroshnin, Alexey, Zhu, Jia-Jie, Dvurechensky, Pavel

Improved Stochastic Optimization of LogSumExp

The LogSumExp function, also known as the free energy, plays a central role in many important optimization problems, including entropy-regularized optimal transport and distributionally robust optimization (DRO). It is also the dual to the Kullback-Leibler (KL) divergence, which is widely used in machine learning. In practice, when the number of exponential terms inside the logarithm is large or infinite, optimization becomes challenging since computing the gradient requires differentiating every term. Previous approaches that replace the full sum with a small batch introduce significant bias. We propose a novel approximation to LogSumExp that can be efficiently optimized using stochastic gradient methods. This approximation is rooted in a sound modification of the KL divergence in the dual, resulting in a new $f$-divergence called the safe KL divergence. The approximation error is controlled by a tunable parameter and can be made arbitrarily small. Like the LogSumExp, our approximation preserves convexity. Moreover, when applied to an $L$-smooth function bounded from below, the smoothness constant of the resulting objective scales linearly with $L$. Experiments in DRO and continuous optimal transport demonstrate the advantages of our approach over state-of-the-art baselines and the effective treatment of numerical issues associated with the standard LogSumExp and KL.
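The bias of small-batch estimates mentioned above can be seen in a short experiment. The sketch below is an illustration under stated assumptions, not the safe KL approximation proposed in the paper: it uses a normalized log-mean-exp variant, synthetic Gaussian inputs, and arbitrary batch and trial counts. Because the logarithm is concave, Jensen's inequality implies that the expected minibatch estimate falls below the full-sum value, even though the inner average of exponentials is unbiased.

```python
import math
import random


def log_mean_exp(values):
    """Stable log((1/n) * sum(exp(v))) using the max-shift trick."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


rng = random.Random(0)

# Synthetic inputs (assumed for illustration only).
values = [rng.gauss(0.0, 2.0) for _ in range(2000)]
full = log_mean_exp(values)

# Replace the full sum with small batches drawn without replacement.
batch_size = 8
trials = 5000
estimates = [log_mean_exp(rng.sample(values, batch_size)) for _ in range(trials)]
mean_estimate = sum(estimates) / trials

# Negative in expectation by Jensen's inequality: E[log X] <= log E[X].
bias = mean_estimate - full
```

The max-shift inside `log_mean_exp` addresses only overflow; it does nothing about the statistical bias, which is the issue the abstract attributes to batch-based approaches.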

approximation, artificial intelligence, machine learning, (18 more...)

2509.24894

Country: Europe > Germany (0.14)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning, Dec-4-2025

Sketch Tomography: Hybridizing Classical Shadow and Matrix Product State

Tang, Xun, Chen, Haoxuan, Khoo, Yuehaw, Ying, Lexing

We introduce Sketch Tomography, an efficient procedure for quantum state tomography based on the classical shadow protocol used for quantum observable estimations. The procedure applies to the case where the ground truth quantum state is a matrix product state (MPS). The density matrix of the ground truth state admits a tensor train ansatz as a result of the MPS assumption, and we estimate the tensor components of the ansatz through a series of observable estimations, thus outputting an approximation of the density matrix. The procedure is provably convergent with a sample complexity that scales quadratically in the system size. We conduct extensive numerical experiments to show that the procedure outputs an accurate approximation to the quantum state. For observable estimation tasks involving moderately large subsystems, we show that our procedure gives rise to a more accurate estimation than the classical shadow protocol. We also show that sketch tomography is more accurate in observable estimation than quantum states trained from the maximum likelihood estimation formulation.

equation, tensor component, tomography, (11 more...)

2512.03333

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre:

Overview (0.67)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)