AITopics

2510.04042

Country:

Europe > United Kingdom (0.14)
North America > United States > Arizona (0.04)
Europe > Switzerland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry: Energy (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Chen, Yujie, Chakraborty, Antik, Bhadra, Anindya

Exact and Approximate MCMC for Doubly-intractable Probabilistic Graphical Models Leveraging the Underlying Independence Model

arXiv.org Machine Learning, Oct-7-2025

Bayesian inference for doubly-intractable probabilistic graphical models typically involves variations of the exchange algorithm or approximate Markov chain Monte Carlo (MCMC) samplers. However, existing methods for both classes of algorithms require either perfect samplers or sequential samplers for complex models, which are often either not available, or suffer from poor mixing, especially in high dimensions. We develop a method that does not require perfect or sequential sampling, and can be applied to both classes of methods: exact and approximate MCMC. The key to our approach is to utilize the tractable independence model underlying an intractable probabilistic graphical model for the purpose of constructing a finite sample unbiased Monte Carlo (and not MCMC) estimate of the Metropolis--Hastings ratio. This innovation turns out to be crucial for scalability in high dimensions. The method is demonstrated on the Ising model. Gradient-based alternatives to construct a proposal, such as Langevin and Hamiltonian Monte Carlo approaches, also arise as a natural corollary to our general procedure, and are demonstrated as well.

estimator, sampler, unbiased estimator, (17 more...)

2510.03587

Country:

Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
North America > United States > Indiana (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Media > Film (0.69)
Leisure & Entertainment (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Wang, Yangyang, Fabusuyi, Tayo

MICROTRIPS: MICRO-geography TRavel Intelligence and Pattern Synthesis

This study presents a novel small-area estimation framework to enhance urban transportation planning through detailed characterization of travel behavior. Our approach improves on the four-step travel model by employing publicly available microdata files and machine learning methods to predict travel behavior for a representative, synthetic population at small geographic areas. This approach enables high-resolution estimation of trip generation, trip distribution, mode choice, and route assignment. Validation using ACS/PUMS work-commute datasets demonstrates that our framework achieves higher accuracy compared to conventional approaches. The resulting granular insights enable the tailoring of interventions to address localized situations and support a range of policy applications and targeted interventions, including the optimal placement of micro-fulfillment centers, effective curb-space management, and the design of more inclusive transportation solutions particularly for vulnerable communities.

artificial intelligence, machine learning, travel behavior, (19 more...)

2510.0508

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.95)
Transportation > Infrastructure & Services (0.94)
Transportation > Ground > Road (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Russo, Alessio, Welch, Ryan, Pacchiano, Aldo

In-Context Learning for Pure Exploration

We study the problem of active sequential hypothesis testing, also known as pure exploration: given a new task, the learner adaptively collects data from the environment to efficiently determine an underlying correct hypothesis. A classical instance of this problem is the task of identifying the best arm in a multi-armed bandit problem (a.k.a. BAI, Best-Arm Identification), where actions index hypotheses. Another important case is generalized search, a problem of determining the correct label through a sequence of strategically selected queries that indirectly reveal information about the label. In this work, we introduce In-Context Pure Exploration (ICPE), which meta-trains Transformers to map observation histories to query actions and a predicted hypothesis, yielding a model that transfers in-context. At inference time, ICPE actively gathers evidence on new tasks and infers the true hypothesis without parameter updates. Across deterministic, stochastic, and structured benchmarks, including BAI and generalized search, ICPE is competitive with adaptive baselines while requiring no explicit modeling of information structure. Our results support Transformers as practical architectures for general sequential testing.

in-context learning, large language model, machine learning, (19 more...)

2506.01876

Country:

North America > United States (0.67)
Europe (0.45)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Industry:

Education (0.67)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
(6 more...)

A Trustworthy Industrial Fault Diagnosis Architecture Integrating Probabilistic Models and Large Language Models

Wu, Yue

Addressing the core problem of insufficient trustworthiness in industrial fault diagnosis, stemming from the limitations of existing methods -- both traditional and deep-learning-based -- in terms of interpretability, generalization, and uncertainty quantification, this paper proposes a trustworthy industrial fault diagnosis architecture, the Hierarchical Cognitive Arbitration Architecture (HCAA), which integrates probabilistic models with Large Language Models (LLMs). The architecture conducts a preliminary analysis via a diagnostic engine based on a Bayesian network and features an LLM-driven cognitive arbitration module with multimodal input capabilities. This module performs expert-level arbitration on the initial diagnosis by analyzing structured features and diagnostic charts, holding the priority to make the final decision upon detecting conflicts. To ensure the reliability of the system's output, the architecture integrates a confidence calibration module based on Temperature Scaling and a risk assessment module, which objectively quantify system trustworthiness using metrics like Expected Calibration Error (ECE). Experimental results on a dataset containing multiple fault types demonstrate that the proposed framework improves diagnostic accuracy by over 28 percentage points compared to baseline models, while the post-calibration ECE is reduced by more than 75%. Case studies confirm that the HCAA effectively corrects misjudgments from traditional models caused by complex feature patterns or knowledge gaps, providing a novel and practical engineering solution for building high-trust, explainable AI diagnostic systems for industrial applications.

Keywords: Industrial Fault Diagnosis; Large Language Model (LLM); Hierarchical Cognitive Arbitration; Probabilistic Model; Confidence Calibration; Trustworthy AI

1. Introduction

With the deep development of Industry 4.0 and smart manufacturing concepts, modern industrial systems are evolving towards high levels of automation and intelligence. In this process, the reliability and safety of equipment have become key factors determining production efficiency and operational costs. Prognostics and Health Management (PHM), as a core technology, plays an indispensable role in improving equipment reliability, reducing unplanned downtime, and optimizing maintenance costs by monitoring equipment status in real time, diagnosing potential faults, and predicting remaining useful life [1], [2].
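The two calibration ingredients named in the abstract, Temperature Scaling and Expected Calibration Error, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the equal-width binning, and the default of 10 bins are assumptions chosen for illustration, and fitting the temperature on held-out data is omitted.

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature (hypothetical helper).

    temperature > 1 flattens the distribution, temperature < 1 sharpens it.
    """
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins (an assumed binning scheme).

    confidences: predicted probability of the chosen class, in [0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each sample to a right-closed bin (lo, hi]; 0.0 falls in bin 0.
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # |accuracy - mean confidence| in the bin, weighted by bin frequency.
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return float(ece)
```

In this sketch a model that is 90% confident on four predictions but right on only three has a gap of 0.15 in its single occupied bin, so its ECE is 0.15; dividing logits by a temperature above 1 lowers the top-class confidence and can shrink that gap.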

large language model, machine learning, natural language, (20 more...)

2510.03815

Genre: Research Report (1.00)

Industry:

Law (0.99)
Health & Medicine > Diagnostic Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
(3 more...)

Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling

Dang, Meihua, Han, Jiaqi, Xu, Minkai, Xu, Kai, Srivastava, Akash, Ermon, Stefano

Discrete diffusion models have recently emerged as strong alternatives to autoregressive language models, matching their performance through large-scale training. However, inference-time control remains relatively underexplored. In this work, we study how to steer generation toward desired rewards without retraining the models. Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement. We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity under reward optimization. PG-DLM constructs a Markov chain over full denoising trajectories and applies a conditional sequential Monte Carlo kernel to resample them. We derive theoretical guarantees for convergence, including asymptotic consistency and variance bounds. Within this framework, we further analyze trade-offs across four key axes for inference-time scaling under fixed budgets: iterations, samples, denoising steps, and reward estimation. Our analysis shows scaling iterations achieves the best reward-perplexity trade-off. Empirically, PG-DLM consistently outperforms prior methods using MDLM and LLaDA-8B as base models across a wide range of compute budgets for reward-guided generation tasks including toxicity and sentiment control as well as linguistic acceptability.

diffusion model, machine learning, natural language, (15 more...)

2507.0839

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.60)

Ascolani, Filippo, Zanella, Giacomo

A fast non-reversible sampler for Bayesian finite mixture models

arXiv.org Machine Learning, Oct-6-2025

Finite mixtures are a cornerstone of Bayesian modelling, and it is well-known that sampling from the resulting posterior distribution can be a hard task. In particular, popular reversible Markov chain Monte Carlo schemes are often slow to converge when the number of observations $n$ is large. In this paper we introduce a novel and simple non-reversible sampling scheme for Bayesian finite mixture models, which is shown to drastically outperform classical samplers in many scenarios of interest, especially during the convergence phase and when components in the mixture have non-negligible overlap. At the theoretical level, we show that the performance of the proposed non-reversible scheme cannot be worse than the standard one, in terms of asymptotic variance, by more than a factor of four; and we provide a scaling limit analysis suggesting that the non-reversible sampler can reduce the convergence time from $O(n^2)$ to $O(n)$. We also discuss why the statistical features of mixture models make them an ideal case for the use of non-reversible discrete samplers.

iteration, mixture model, sampler, (14 more...)

2510.03226

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Mildner, Terje, Giampouras, Paris, Damoulas, Theodoros

Rates of Convergence of Generalised Variational Inference Posteriors under Prior Misspecification

arXiv.org Machine Learning, Oct-6-2025

We prove rates of convergence and robustness to prior misspecification within a Generalised Variational Inference (GVI) framework with bounded divergences. This addresses a significant open challenge for GVI and Federated GVI that employ a different divergence to the Kullback--Leibler under prior misspecification, operate within a subset of possible probability measures, and result in intractable posteriors. Our theoretical contributions cover severe prior misspecification while relying on our ability to restrict the space of possible GVI posterior measures, and infer properties based on this space. In particular, we are able to establish sufficient conditions for existence and uniqueness of GVI posteriors on arbitrary Polish spaces, prove that the GVI posterior measure concentrates on a neighbourhood of loss minimisers, and extend this to rates of convergence regardless of the prior measure.

divergence, gvi posterior, posterior, (13 more...)

2510.03109

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > West Midlands > Coventry (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Dang, Thanh, Liang, Jiaming

Oracle-based Uniform Sampling from Convex Bodies

arXiv.org Machine Learning, Oct-6-2025

We propose new Markov chain Monte Carlo algorithms to sample a uniform distribution on a convex body $K$. Our algorithms are based on the Alternating Sampling Framework/proximal sampler, which uses Gibbs sampling on an augmented distribution and assumes access to the so-called restricted Gaussian oracle (RGO). The key contribution of this work is the efficient implementation of RGO for uniform sampling on $K$ via rejection sampling and access to either a projection oracle or a separation oracle on $K$. In both oracle cases, we establish non-asymptotic complexities to obtain unbiased samples where the accuracy is measured in Rényi divergence or $χ^2$-divergence.

algorithm, oracle, rejection, (16 more...)

2510.02983

Country: North America > United States > New York > Monroe County > Rochester (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

arXiv.org Artificial Intelligence, Oct-6-2025

Injecting Measurement Information Yields a Fast and Noise-Robust Diffusion-Based Inverse Problem Solver

Patsenker, Jonathan, Li, Henry, Ko, Myeongseob, Jia, Ruoxi, Kluger, Yuval

Diffusion models have been firmly established as principled zero-shot solvers for linear and nonlinear inverse problems, owing to their powerful image prior and iterative sampling algorithm. These approaches often rely on Tweedie's formula, which relates the diffusion variate $\mathbf{x}_t$ to the posterior mean $\mathbb{E} [\mathbf{x}_0 | \mathbf{x}_t]$, in order to guide the diffusion trajectory with an estimate of the final denoised sample $\mathbf{x}_0$. However, this does not consider information from the measurement $\mathbf{y}$, which must then be integrated downstream. In this work, we propose to estimate the conditional posterior mean $\mathbb{E} [\mathbf{x}_0 | \mathbf{x}_t, \mathbf{y}]$, which can be formulated as the solution to a lightweight, single-parameter maximum likelihood estimation problem. The resulting prediction can be integrated into any standard sampler, resulting in a fast and memory-efficient inverse solver. Our optimizer is amenable to a noise-aware likelihood-based stopping criterion that is robust to measurement noise in $\mathbf{y}$. We demonstrate comparable or improved performance against a wide selection of contemporary inverse solvers across multiple datasets and tasks.
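For reference, Tweedie's formula can be written out under one common convention. The variance-preserving forward process $\mathbf{x}_t = \sqrt{\bar\alpha_t}\,\mathbf{x}_0 + \sqrt{1-\bar\alpha_t}\,\boldsymbol{\epsilon}$ with $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ is an assumption made here for illustration; the abstract does not fix a noise schedule, and $\bar\alpha_t$ and $p_t$ (the marginal density of $\mathbf{x}_t$) are notation introduced for this sketch.

$$
\mathbb{E}[\mathbf{x}_0 \mid \mathbf{x}_t] = \frac{\mathbf{x}_t + (1-\bar\alpha_t)\,\nabla_{\mathbf{x}_t} \log p_t(\mathbf{x}_t)}{\sqrt{\bar\alpha_t}}
$$

The right-hand side depends only on $\mathbf{x}_t$ and the unconditional score, which is why the measurement $\mathbf{y}$ does not enter this estimate.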

artificial intelligence, inverse problem, machine learning, (17 more...)

2508.02964

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)