AITopics

2602.00781

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (1.00)
Banking & Finance > Trading (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Bo, Shi, Ghassami, AmirEmad

Causal Imitation Learning Under Measurement Error and Distribution Shift

We study offline imitation learning (IL) when part of the decision-relevant state is observed only through noisy measurements and the distribution may change between training and deployment. Such settings induce spurious state-action correlations, so standard behavioral cloning (BC) -- whether conditioning on raw measurements or ignoring them -- can converge to systematically biased policies under distribution shift. We propose a general framework for IL under measurement error, inspired by explicitly modeling the causal relationships among the variables, yielding a target that retains a causal interpretation and is robust to distribution shift. Building on ideas from proximal causal inference, we introduce \texttt{CausIL}, which treats noisy state observations as proxy variables, and we provide identification conditions under which the target policy is recoverable from demonstrations without rewards or interactive expert queries. We develop estimators for both discrete and continuous state spaces; for continuous settings, we use an adversarial procedure over RKHS function classes to learn the required parameters. We evaluate \texttt{CausIL} on semi-simulated longitudinal data from the PhysioNet/Computing in Cardiology Challenge 2019 cohort and demonstrate improved robustness to distribution shift compared to BC baselines.

arg max, artificial intelligence, machine learning, (15 more...)

2601.22206

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Simulation-based Bayesian inference with ameliorative learned summary statistics -- Part I

Befekadu, Getachew K.

This paper, which is Part 1 of a two-part paper series, considers a simulation-based inference with learned summary statistics, in which such a learned summary statistic serves as an empirical-likelihood with ameliorative effects in the Bayesian setting, when the exact likelihood function associated with the observation data and the simulation model is difficult to obtain in a closed form or computationally intractable. In particular, a transformation technique which leverages the Cressie-Read discrepancy criterion under moment restrictions is used for summarizing the learned statistics between the observation data and the simulation outputs, while preserving the statistical power of the inference. Here, such a transformation of data-to-learned summary statistics also allows the simulation outputs to be conditioned on the observation data, so that the inference task can be performed over certain sample sets of the observation data that are considered as an empirical relevance or believed to be particular importance. Moreover, the simulation-based inference framework discussed in this paper can be extended further, and thus handling weakly dependent observation data. Finally, we remark that such an inference framework is suitable for implementation in distributed computing, i.e., computational tasks involving both the data-to-learned summary statistics and the Bayesian inferencing problem can be posed as a unified distributed inference problem that will exploit distributed optimization and MCMC algorithms for supporting large datasets associated with complex simulation models.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

2601.22441

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Yallup, David, Kroupa, Namu, Handley, Will

Nested Slice Sampling: Vectorized Nested Sampling for GPU-Accelerated Inference

Model comparison and calibrated uncertainty quantification often require integrating over parameters, but scalable inference can be challenging for complex, multimodal targets. Nested Sampling is a robust alternative to standard MCMC, yet its typically sequential structure and hard constraints make efficient accelerator implementations difficult. This paper introduces Nested Slice Sampling (NSS), a GPU-friendly, vectorized formulation of Nested Sampling that uses Hit-and-Run Slice Sampling for constrained updates. A tuning analysis yields a simple near-optimal rule for setting the slice width, improving high-dimensional behavior and making per-step compute more predictable for parallel execution. Experiments on challenging synthetic targets, high dimensional Bayesian inference, and Gaussian process hyperparameter marginalization show that NSS maintains accurate evidence estimates and high-quality posterior samples, and is particularly robust on difficult multimodal problems where current state-of-the-art methods such as tempered SMC baselines can struggle. An open-source implementation is released to facilitate adoption and reproducibility.

artificial intelligence, machine learning, sampling, (19 more...)

2601.23252

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Turkmenistan > Ahal Region > Anau (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Balasubramanian, Krishnakumar, Podkopaev, Aleksandr, Kasiviswanathan, Shiva Prasad

Dependence-Aware Label Aggregation for LLM-as-a-Judge via Ising Models

Large-scale AI evaluation increasingly relies on aggregating binary judgments from $K$ annotators, including LLMs used as judges. Most classical methods, e.g., Dawid-Skene or (weighted) majority voting, assume annotators are conditionally independent given the true label $Y\in\{0,1\}$, an assumption often violated by LLM judges due to shared data, architectures, prompts, and failure modes. Ignoring such dependencies can yield miscalibrated posteriors and even confidently incorrect predictions. We study label aggregation through a hierarchy of dependence-aware models based on Ising graphical models and latent factors. For class-dependent Ising models, the Bayes log-odds is generally quadratic in votes; for class-independent couplings, it reduces to a linear weighted vote with correlation-adjusted parameters. We present finite-$K$ examples showing that methods based on conditional independence can flip the Bayes label despite matching per-annotator marginals. We prove separation results demonstrating that these methods remain strictly suboptimal as the number of judges grows, incurring nonvanishing excess risk under latent factors. Finally, we evaluate the proposed method on three real-world datasets, demonstrating improved performance over the classical baselines.

ising model, large language model, machine learning, (21 more...)

2601.22336

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Yolo County > Davis (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.65)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Sun, Shiyi, Nicholls, Geoff K., Lee, Jeong Eun

Amortized Simulation-Based Inference in Generalized Bayes via Neural Posterior Estimation

Generalized Bayesian Inference (GBI) tempers a loss with a temperature $β>0$ to mitigate overconfidence and improve robustness under model misspecification, but existing GBI methods typically rely on costly MCMC or SDE-based samplers and must be re-run for each new dataset and each $β$ value. We give the first fully amortized variational approximation to the tempered posterior family $p_β(θ\mid x) \propto π(θ)\,p(x \mid θ)^β$ by training a single $(x,β)$-conditioned neural posterior estimator $q_ϕ(θ\mid x,β)$ that enables sampling in a single forward pass, without simulator calls or inference-time MCMC. We introduce two complementary training routes: (i) synthesize off-manifold samples $(θ,x) \sim π(θ)\,p(x \mid θ)^β$ and (ii) reweight a fixed base dataset $π(θ)\,p(x \mid θ)$ using self-normalized importance sampling (SNIS). We show that the SNIS-weighted objective provides a consistent forward-KL fit to the tempered posterior with finite weight variance. Across four standard simulation-based inference (SBI) benchmarks, including the chaotic Lorenz-96 system, our $β$-amortized estimator achieves competitive posterior approximations in standard two-sample metrics, matching non-amortized MCMC-based power-posterior samplers over a wide range of temperatures.

artificial intelligence, bayesian inference, machine learning, (14 more...)

2601.22367

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Corrected Samplers for Discrete Flow Models

Wan, Zhengyan, Ouyang, Yidong, Xie, Liyan, Fang, Fang, Zha, Hongyuan, Cheng, Guang

Discrete flow models (DFMs) have been proposed to learn the data distribution on a finite state space, offering a flexible framework as an alternative to discrete diffusion models. A line of recent work has studied samplers for discrete diffusion models, such as tau-leaping and Euler solver. However, these samplers require a large number of iterations to control discretization error, since the transition rates are frozen in time and evaluated at the initial state within each time interval. Moreover, theoretical results for these samplers often require boundedness conditions of the transition rate or they focus on a specific type of source distributions. To address those limitations, we establish non-asymptotic discretization error bounds for those samplers without any restriction on transition rates and source distributions, under the framework of discrete flow models. Furthermore, by analyzing a one-step lower bound of the Euler sampler, we propose two corrected samplers: \textit{time-corrected sampler} and \textit{location-corrected sampler}, which can reduce the discretization error of tau-leaping and Euler solver with almost no additional computational cost. We rigorously show that the location-corrected sampler has a lower iteration complexity than existing parallel samplers. We validate the effectiveness of the proposed method by demonstrating improved generation quality and reduced inference time on both simulation and text-to-image generation tasks. Code can be found in https://github.com/WanZhengyan/Corrected-Samplers-for-Discrete-Flow-Models.

artificial intelligence, machine learning, sampler, (18 more...)

2601.22519

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Shahbaba, Babak, Moslemi, Zahra

Neural-Inspired Posterior Approximation (NIPA)

Humans learn efficiently from their environment by engaging multiple interacting neural systems that support distinct yet complementary forms of control, including model-based (goal-directed) planning, model-free (habitual) responding, and episodic memory-based learning. Model-based mechanisms compute prospective action values using an internal model of the environment, supporting flexible but computationally costly planning; model-free mechanisms cache value estimates and build heuristics that enable fast, efficient habitual responding; and memory-based mechanisms allow rapid adaptation from individual experience. In this work, we aim to elucidate the computational principles underlying this biological efficiency and translate them into a sampling algorithm for scalable Bayesian inference through effective exploration of the posterior distribution. More specifically, our proposed algorithm comprises three components: a model-based module that uses the target distribution for guided but computationally slow sampling; a model-free module that uses previous samples to learn patterns in the parameter space, enabling fast, reflexive sampling without directly evaluating the expensive target distribution; and an episodic-control module that supports rapid sampling by recalling specific past events (i.e., samples). We show that this approach advances Bayesian methods and facilitates their application to large-scale statistical machine learning problems. In particular, we apply our proposed framework to Bayesian deep learning, with an emphasis on proper and principled uncertainty quantification.

artificial intelligence, machine learning, proceedings, (14 more...)

2601.22539

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
(6 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Mueller, Markus, Gruber, Kathrin, Fok, Dennis

Cascaded Flow Matching for Heterogeneous Tabular Data with Mixed-Type Features

Advances in generative modeling have recently been adapted to tabular data containing discrete and continuous features. However, generating mixed-type features that combine discrete states with an otherwise continuous distribution in a single feature remains challenging. We advance the state-of-the-art in diffusion models for tabular data with a cascaded approach. We first generate a low-resolution version of a tabular data row, that is, the collection of the purely categorical features and a coarse categorical representation of numerical features. Next, this information is leveraged in the high-resolution flow matching model via a novel guided conditional probability path and data-dependent coupling. The low-resolution representation of numerical features explicitly accounts for discrete outcomes, such as missing or inflated values, and therewith enables a more faithful generation of mixed-type features. We formally prove that this cascade tightens the transport cost bound. The results indicate that our model generates significantly more realistic samples and captures distributional details more accurately, for example, the detection score increases by 40%.

artificial intelligence, deep learning, machine learning, (14 more...)

2601.22816

Country:

Asia > China > Beijing > Beijing (0.05)
Europe > Netherlands > South Holland > Rotterdam (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Baker, Elizabeth L., Denker, Alexander, Frellsen, Jes

Supervised Guidance Training for Infinite-Dimensional Diffusion Models

arXiv.org Machine LearningJan-30-2026

Score-based diffusion models have recently been extended to infinite-dimensional function spaces, with uses such as inverse problems arising from partial differential equations. In the Bayesian formulation of inverse problems, the aim is to sample from a posterior distribution over functions obtained by conditioning a prior on noisy observations. While diffusion models provide expressive priors in function space, the theory of conditioning them to sample from the posterior remains open. We address this, assuming that either the prior lies in the Cameron-Martin space, or is absolutely continuous with respect to a Gaussian measure. We prove that the models can be conditioned using an infinite-dimensional extension of Doob's $h$-transform, and that the conditional score decomposes into an unconditional score and a guidance term. As the guidance term is intractable, we propose a simulation-free score matching objective (called Supervised Guidance Training) enabling efficient and stable posterior sampling. We illustrate the theory with numerical examples on Bayesian inverse problems in function spaces. In summary, our work offers the first function-space method for fine-tuning trained diffusion models to accurately sample from a posterior.

artificial intelligence, diffusion model, machine learning, (18 more...)

2601.20756

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Denmark (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)