AITopics

2601.08527

Country: Europe > Germany (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

arXiv.org Machine Learning Jan-14-2026

Structural Dimension Reduction in Bayesian Networks

Heng, Pei, Sun, Yi, Guo, Jianhua

This work introduces a novel technique, named structural dimension reduction, to collapse a Bayesian network onto a minimum and localized one while ensuring that probabilistic inferences between the original and reduced networks remain consistent. To this end, we propose a new combinatorial structure in directed acyclic graphs called the directed convex hull, which has turned out to be equivalent to their minimum localized Bayesian networks. An efficient polynomial-time algorithm is devised to identify them by determining the unique directed convex hulls containing the variables of interest from the original networks. Experiments demonstrate that the proposed technique has high dimension reduction capability in real networks, and the efficiency of probabilistic inference based on directed convex hulls can be significantly improved compared with traditional methods such as variable elimination and belief propagation algorithms. The code of this study is open at https://github.com/Balance-H/Algorithms and the proofs of the results in the main body are postponed to the appendix.
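As a rough illustration of what localizing a Bayesian network can look like, the sketch below restricts a DAG to the variables of interest and their ancestors. This is the standard ancestral-set (barren-node) pruning, which preserves the marginal over the kept variables; it is not the directed convex hull algorithm of the paper, whose details the abstract does not give. The function name `ancestral_subgraph` and the toy network are hypothetical.

```python
from collections import deque

def ancestral_subgraph(parents, targets):
    """Return the sub-DAG induced by `targets` and all of their ancestors.

    parents: dict mapping each node to a list of its parent nodes.
    targets: iterable of variables of interest (query and evidence).
    """
    keep = set()
    queue = deque(targets)
    while queue:
        node = queue.popleft()
        if node in keep:
            continue
        keep.add(node)
        # Walk upward: every parent of a kept node must also be kept.
        queue.extend(parents.get(node, []))
    return {node: list(parents.get(node, [])) for node in keep}

# Hypothetical toy network: A -> B -> C, A -> D -> E
toy = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"], "E": ["D"]}
reduced = ancestral_subgraph(toy, ["B"])
print(sorted(reduced))
```

Here the descendants C, D and E are dropped because they cannot influence the marginal of B once summed out. A minimum localized network in the paper's sense may be smaller than this ancestral set.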

artificial intelligence, bayesian inference, machine learning, (17 more...)

2601.08236

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Khribch, EL Mahdi, Alquier, Pierre

Variational Approximations for Robust Bayesian Inference via Rho-Posteriors

The $ρ$-posterior framework provides universal Bayesian estimation with explicit contamination rates and optimal convergence guarantees, but has remained computationally difficult due to an optimization over reference distributions that precludes tractable posterior computation. We develop a PAC-Bayesian framework that recovers these theoretical guarantees through temperature-dependent Gibbs posteriors, deriving finite-sample oracle inequalities with explicit rates and introducing tractable variational approximations that inherit the robustness properties of exact $ρ$-posteriors. Numerical experiments demonstrate that this approach achieves theoretical contamination rates while remaining computationally feasible, providing the first practical implementation of $ρ$-posterior inference with rigorous finite-sample guarantees.
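For orientation, the generic textbook form of a temperature-dependent Gibbs posterior and its variational characterization are written out below. The notation (prior $\pi$, empirical loss $R_n$, inverse temperature $\lambda$) is assumed here and is not taken from the paper; the specific loss that yields $ρ$-posterior guarantees is not stated in the abstract.

```latex
\hat{\pi}_{\lambda}(\mathrm{d}\theta)
  = \frac{\exp\{-\lambda R_n(\theta)\}\,\pi(\mathrm{d}\theta)}
         {\int \exp\{-\lambda R_n(\vartheta)\}\,\pi(\mathrm{d}\vartheta)},
\qquad
\hat{\pi}_{\lambda}
  = \arg\min_{q}\Big\{ \lambda\,\mathbb{E}_{\theta\sim q}\big[R_n(\theta)\big]
      + \mathrm{KL}(q \,\|\, \pi) \Big\}.
```

A variational approximation in this setting typically restricts the minimization over $q$ to a tractable family, such as Gaussians.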

artificial intelligence, inequality, machine learning, (18 more...)

2601.07325

Country: North America > United States (0.45)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Hu, Yinan, Tabak, Esteban

Constrained Density Estimation via Optimal Transport

The classical optimal transport (OT) problem seeks the map that moves mass from a source to a target measure while minimizing a prescribed cost function. The objective can be formalized in either Monge's [12] or Kantorovich's formulation [10], a convex relaxation of the former that considers transport plans instead of deterministic maps. These foundational formulations have wide-ranging applications, including to economics [7] and machine learning [14]. In many practical scenarios, the source measure is known or readily inferable from empirical data but the target measure is not explicitly specified. Instead, it is only constrained by practical requirements or expert knowledge. For example, when applying Monge's formulation to transportation problems, the placement of the mass in the target region may be constrained to lie entirely beyond a certain boundary or within a particular region, rather than by the specification of a precise location for each fraction of the total mass. Similarly, in economic applications, supply and demand may be subject to constraints such as maximal amounts available or minimal amounts required, rather than dictated through precise marginal distributions.
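The two classical formulations mentioned above can be written as follows. The symbols (source measure $\mu$, target measure $\nu$, cost $c$, map $T$, plan $\pi$) are standard but assumed here, since the excerpt does not fix notation.

```latex
% Monge: optimize over deterministic maps T pushing \mu forward to \nu
\inf_{T \,:\, T_{\#}\mu = \nu} \int c\big(x, T(x)\big)\,\mathrm{d}\mu(x)

% Kantorovich: optimize over couplings \pi with marginals \mu and \nu
\inf_{\pi \in \Pi(\mu,\nu)} \int c(x, y)\,\mathrm{d}\pi(x, y)
```

In the constrained setting the excerpt describes, the fixed target $\nu$ would be replaced by a set of admissible targets; how the paper encodes that set is not stated in this excerpt.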

artificial intelligence, constraint, machine learning, (16 more...)

2601.0683

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.82)

Diffusion Models with Heavy-Tailed Targets: Score Estimation and Sampling Guarantees

Yu, Yifeng, Yu, Lu

Score-based diffusion models have become a powerful framework for generative modeling, with score estimation as a central statistical bottleneck. Existing guarantees for score estimation largely focus on light-tailed targets or rely on restrictive assumptions such as compact support, which are often violated by heavy-tailed data in practice. In this work, we study conventional (Gaussian) score-based diffusion models when the target distribution is heavy-tailed and belongs to a Sobolev class with smoothness parameter $β>0$. We consider both exponential and polynomial tail decay, indexed by a tail parameter $γ$. Using kernel density estimation, we derive sharp minimax rates for score estimation, revealing a qualitative dichotomy: under exponential tails, the rate matches the light-tailed case up to polylogarithmic factors, whereas under polynomial tails the rate depends explicitly on $γ$. We further provide sampling guarantees for the associated continuous reverse dynamics. In total variation, the generated distribution converges at the minimax optimal rate $n^{-β/(2β+d)}$ under exponential tails (up to logarithmic factors), and at a $γ$-dependent rate under polynomial tails. Whether the latter sampling rate is minimax optimal remains an open question. These results characterize the statistical limits of score estimation and the resulting sampling accuracy for heavy-tailed targets, extending diffusion theory beyond the light-tailed setting.
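To make "score estimation via kernel density estimation" concrete, the sketch below computes the gradient of the log of a Gaussian KDE, which has a closed form as a softmax-weighted average of displacements to the data points. This is a plain plug-in estimate under an assumed fixed bandwidth, not the paper's estimator or its tuning; the function name `kde_score` is hypothetical.

```python
import numpy as np

def kde_score(x, data, bandwidth):
    """Gradient of log p_hat at x, where p_hat is a Gaussian KDE over `data`.

    x: array of shape (m, d) or (d,); data: array of shape (n, d).
    Returns an array of shape (m, d).
    """
    x = np.atleast_2d(x)
    diff = data[None, :, :] - x[:, None, :]                  # (m, n, d): x_i - x
    logw = -np.sum(diff**2, axis=-1) / (2.0 * bandwidth**2)  # log kernel weights
    logw -= logw.max(axis=1, keepdims=True)                  # stabilize the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    # score(x) = sum_i w_i(x) * (x_i - x) / h^2
    return np.einsum("mn,mnd->md", w, diff) / bandwidth**2

# Sanity check: one data point at 0 with h = 1 gives a standard Gaussian,
# whose score at x is -x.
single = np.zeros((1, 1))
print(kde_score(np.array([[2.0]]), single, 1.0))
```

With heavy-tailed samples, few data points lie near a far-out query, so the weights concentrate on a handful of neighbours; this is one intuition for why tail behaviour enters the rates.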

artificial intelligence, arxiv preprint arxiv, machine learning, (17 more...)

2601.06715

Genre: Research Report (0.81)

Industry: Energy (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

Inference-Time Alignment for Diffusion Models via Doob's Matching

Chang, Jinyuan, Duan, Chenguang, Jiao, Yuling, Xu, Yi, Yang, Jerry Zhijian

Inference-time alignment for diffusion models aims to adapt a pre-trained diffusion model toward a target distribution without retraining the base score network, thereby preserving the generative capacity of the base model while enforcing desired properties at the inference time. A central mechanism for achieving such alignment is guidance, which modifies the sampling dynamics through an additional drift term. In this work, we introduce Doob's matching, a novel framework for guidance estimation grounded in Doob's $h$-transform. Our approach formulates guidance as the gradient of logarithm of an underlying Doob's $h$-function and employs gradient-penalized regression to simultaneously estimate both the $h$-function and its gradient, resulting in a consistent estimator of the guidance. Theoretically, we establish non-asymptotic convergence rates for the estimated guidance. Moreover, we analyze the resulting controllable diffusion processes and prove non-asymptotic convergence guarantees for the generated distributions in the 2-Wasserstein distance.
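The guidance-as-extra-drift idea can be stated in generic form with Doob's $h$-transform. The notation below (drift $b$, diffusion coefficient $\sigma$, terminal reweighting $w$, horizon $T$) is assumed for illustration and may differ from the paper's, which applies the construction to the sampling dynamics of a pre-trained diffusion model.

```latex
% Base dynamics
\mathrm{d}X_t = b(t, X_t)\,\mathrm{d}t + \sigma(t)\,\mathrm{d}W_t

% h-function for a terminal reweighting w
h(t, x) = \mathbb{E}\big[\, w(X_T) \,\big|\, X_t = x \,\big]

% Guided dynamics: the added drift is the gradient of log h
\mathrm{d}X_t = \Big[ b(t, X_t) + \sigma(t)\sigma(t)^{\top} \nabla_x \log h(t, X_t) \Big]\mathrm{d}t
              + \sigma(t)\,\mathrm{d}W_t
```

Estimating both $h$ and $\nabla_x h$ is what allows the ratio $\nabla_x h / h = \nabla_x \log h$ to be formed.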

artificial intelligence, diffusion model, machine learning, (16 more...)

2601.06514

Country:

Europe (0.67)
Asia > China (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Biggs, Felix, Willis, Samuel

LLM Flow Processes for Text-Conditioned Regression

Meta-learning methods for regression like Neural (Diffusion) Processes achieve impressive results, but with these models it can be difficult to incorporate expert prior knowledge and information contained in metadata. Large Language Models (LLMs) are trained on giant corpora including varied real-world regression datasets alongside their descriptions and metadata, leading to impressive performance on a range of downstream tasks. Recent work has extended this to regression tasks and is able to leverage such prior knowledge and metadata, achieving surprisingly good performance, but this still rarely matches dedicated meta-learning methods. Here we introduce a general method for sampling from a product-of-experts of a diffusion or flow matching model and an `expert' with binned probability density; we apply this to combine neural diffusion processes with LLM token probabilities for regression (which may incorporate textual knowledge), exceeding the empirical performance of either alone.

large language model, machine learning, natural language, (19 more...)

2601.06147

Country:

Europe (1.00)
North America > United States (0.47)
North America > Canada (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Ge, Shufei, Wang, Shijia, Elliott, Lloyd

Poisson Hyperplane Processes with Rectified Linear Units

arXiv.org Machine Learning Jan-12-2026

Neural networks have shown state-of-the-art performances in various classification and regression tasks. Rectified linear units (ReLU) are often used as activation functions for the hidden layers in a neural network model. In this article, we establish the connection between the Poisson hyperplane processes (PHP) and two-layer ReLU neural networks. We show that the PHP with a Gaussian prior is an alternative probabilistic representation to a two-layer ReLU neural network. In addition, we show that a two-layer neural network constructed by PHP is scalable to large-scale problems via the decomposition propositions. Finally, we propose an annealed sequential Monte Carlo algorithm for Bayesian inference. Our numerical experiments demonstrate that our proposed method outperforms the classic two-layer ReLU neural network. The implementation of our proposed model is available at https://github.com/ShufeiGe/Pois_Relu.git.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2601.05586

Country: North America (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Thorat, Rohan Vitthal, Nayek, Rajdip

Machine learning assisted state prediction of misspecified linear dynamical system via modal reduction

arXiv.org Machine Learning Jan-12-2026

Rohan Vittal Thorat, Rajdip Nayek (Department of Applied Mechanics, Indian Institute of Technology Delhi, New Delhi, 110016, India)

Abstract: Accurate prediction of structural dynamics is imperative for preserving digital twin fidelity throughout operational lifetimes. Parametric models with fixed nominal parameters often omit critical physical effects due to simplifications in geometry, material behavior, damping, or boundary conditions, resulting in model form errors (MFEs) that impair predictive accuracy. This work introduces a comprehensive framework for MFE estimation and correction in high-dimensional finite element (FE) based structural dynamical systems. The Gaussian Process Latent Force Model (GPLFM) represents discrepancies non-parametrically in the reduced modal domain, allowing a flexible data-driven characterization of unmodeled dynamics. A linear Bayesian filtering approach jointly estimates system states and discrepancies, incorporating epistemic and aleatoric uncertainties. To ensure computational tractability, the FE system is projected onto a reduced modal basis, and a mesh-invariant neural network maps modal states to discrepancy estimates, permitting model rectification across different FE discretizations without retraining. Validation is undertaken across five MFE scenarios (incorrect beam theory, damping misspecification, misspecified boundary condition, unmodeled material nonlinearity, and local damage), demonstrating the surrogate model's substantial reduction of displacement and rotation prediction errors under unseen excitations. The proposed methodology offers a potential means to uphold digital twin accuracy amid inherent modeling uncertainties.

Keywords: Model bias, Gaussian Process, Latent Force Model, Bayesian filtering, Modal reduction, Digital twin

1. Introduction

The reliable simulation of structural dynamical systems is central to engineering analysis, design, and decision-making. In practice, high-fidelity models are often impractical due to limited information, computational constraints, or simplifying assumptions in geometry, boundary conditions, damping mechanisms, and material constitutive laws. These idealizations lead to model form errors (MFEs), systematic discrepancies between the predicted and actual system responses, which, if unaccounted for, can significantly degrade predictive accuracy. This challenge is especially critical in the context of digital twins, where model predictions directly inform monitoring and decision-making. Digital twins of structural systems integrate computational models with real-time or historical measurement data to enable continuous prediction, monitoring, and decision-making [1, 2].

artificial intelligence, latent force, machine learning, (19 more...)

2601.05297

Country:

Europe (0.46)
North America > United States (0.28)
Asia > India > NCT > New Delhi (0.24)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

Chen, Jiaheng, Sanz-Alonso, Daniel

Convergence Rates for Learning Pseudo-Differential Operators

arXiv.org Machine Learning Jan-9-2026

This paper establishes convergence rates for learning elliptic pseudo-differential operators, a fundamental operator class in partial differential equations and mathematical physics. In a wavelet-Galerkin framework, we formulate learning over this class as a structured infinite-dimensional regression problem with multiscale sparsity. Building on this structure, we propose a sparse, data- and computation-efficient estimator, which leverages a novel matrix compression scheme tailored to the learning task and a nested-support strategy to balance approximation and estimation errors. In addition to obtaining convergence rates for the estimator, we show that the learned operator induces an efficient and stable Galerkin solver whose numerical error matches its statistical accuracy. Our results therefore contribute to bringing together operator learning, data-driven solvers, and wavelet methods in scientific computing.

artificial intelligence, machine learning, operator, (19 more...)

2601.04473

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
(2 more...)