Uncertainty
MICROTRIPS: MICRO-geography TRavel Intelligence and Pattern Synthesis
Wang, Yangyang, Fabusuyi, Tayo
This study presents a novel small-area estimation framework to enhance urban transportation planning through detailed characterization of travel behavior. Our approach improves on the four-step travel model by employing publicly available microdata files and machine learning methods to predict travel behavior for a representative, synthetic population in small geographic areas. This approach enables high-resolution estimation of trip generation, trip distribution, mode choice, and route assignment. Validation using ACS/PUMS work-commute datasets demonstrates that our framework achieves higher accuracy than conventional approaches. The resulting granular insights enable interventions tailored to local conditions and support a range of policy applications, including the optimal placement of micro-fulfillment centers, effective curb-space management, and the design of more inclusive transportation solutions, particularly for vulnerable communities.
Diffusion^2: Turning 3D Environments into Radio Frequency Heatmaps
Park, Kyoungjun, Yang, Yifan, Ge, Changhan, Qiu, Lili, Jiang, Shiqi
Modeling radio frequency (RF) signal propagation is essential for understanding the environment, as RF signals offer valuable insights beyond the capabilities of RGB cameras, which are limited by the visible-light spectrum, lens coverage, and occlusions. It is also useful for supporting wireless diagnosis, deployment, and optimization. However, accurately predicting RF signals in complex environments remains a challenge due to interactions with obstacles, such as absorption and reflection. We introduce Diffusion^2, a diffusion-based approach that uses 3D point clouds to model the propagation of RF signals across a wide range of frequencies, from Wi-Fi to millimeter waves. To effectively capture RF-related features from 3D data, we present the RF-3D Encoder, which encapsulates the complexities of 3D geometry along with signal-specific details. These features undergo multi-scale embedding to simulate the actual RF signal dissemination process. Our evaluation, based on synthetic and real-world measurements, demonstrates that Diffusion^2 accurately estimates the behavior of RF signals in various frequency bands and environmental conditions, with an error margin of just 1.9 dB while running 27x faster than existing methods, marking a significant advancement in the field. Refer to https://rfvision-project.github.io/ for more information.
In-Context Learning for Pure Exploration
Russo, Alessio, Welch, Ryan, Pacchiano, Aldo
We study the problem of active sequential hypothesis testing, also known as pure exploration: given a new task, the learner adaptively collects data from the environment to efficiently determine an underlying correct hypothesis. A classical instance of this problem is the task of identifying the best arm in a multi-armed bandit problem (a.k.a. BAI, Best-Arm Identification), where actions index hypotheses. Another important case is generalized search, the problem of determining the correct label through a sequence of strategically selected queries that indirectly reveal information about the label. In this work, we introduce In-Context Pure Exploration (ICPE), which meta-trains Transformers to map observation histories to query actions and a predicted hypothesis, yielding a model that transfers in-context. At inference time, ICPE actively gathers evidence on new tasks and infers the true hypothesis without parameter updates. Across deterministic, stochastic, and structured benchmarks, including BAI and generalized search, ICPE is competitive with adaptive baselines while requiring no explicit modeling of information structure. Our results support Transformers as practical architectures for general sequential testing.
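To make the generalized search setting concrete, the following is a minimal sketch of a classical hand-designed adaptive baseline: greedy generalized binary search over a finite hypothesis set with noiseless yes/no queries. It is not ICPE and is not taken from the paper; the function name `generalized_binary_search`, the query dictionary, and the `oracle` callback are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple


def generalized_binary_search(
    hypotheses: Sequence[Hashable],
    queries: Dict[str, Callable[[Hashable], bool]],
    oracle: Callable[[str], bool],
) -> Tuple[List[Hashable], List[str]]:
    """Greedily ask the query that splits the surviving hypotheses most evenly.

    hypotheses: candidate labels (assumed to contain the true one).
    queries:    maps a query name to the answer each hypothesis would give.
    oracle:     returns the (noiseless) answer of the true hypothesis to a query.
    Returns the surviving hypotheses and the names of the queries asked.
    """
    candidates = list(hypotheses)
    remaining = dict(queries)
    asked: List[str] = []

    while len(candidates) > 1 and remaining:
        # Imbalance is 0 for a perfect half/half split and len(candidates)
        # when every surviving hypothesis gives the same answer.
        def imbalance(name: str) -> int:
            yes = sum(1 for h in candidates if remaining[name](h))
            return abs(2 * yes - len(candidates))

        best = min(remaining, key=imbalance)
        if imbalance(best) == len(candidates):
            break  # no remaining query can distinguish the candidates

        test = remaining.pop(best)
        answer = oracle(best)
        # Keep only hypotheses consistent with the observed answer.
        candidates = [h for h in candidates if test(h) == answer]
        asked.append(best)

    return candidates, asked
```

With eight hypotheses and one query per bit of the label, this baseline identifies the label in three queries. The contrast with the abstract is that the splitting rule here is written by hand and relies on knowing how each query partitions the hypotheses, which is the kind of explicit information-structure modeling that ICPE is said to avoid.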
Improved probabilistic regression using diffusion models
Kneissl, Carlo, Bülte, Christopher, Scholl, Philipp, Kutyniok, Gitta
Probabilistic regression models the entire predictive distribution of a response variable, offering richer insights than classical point estimates and directly allowing for uncertainty quantification. While diffusion-based generative models have shown remarkable success in generating complex, high-dimensional data, their usage in general regression tasks often lacks uncertainty-related evaluation and remains limited to domain-specific applications. We propose a novel diffusion-based framework for probabilistic regression that learns predictive distributions in a nonparametric way. More specifically, we propose to model the full distribution of the diffusion noise, enabling adaptation to diverse tasks and enhanced uncertainty quantification. We investigate different noise parameterizations, analyze their trade-offs, and evaluate our framework across a broad range of regression tasks, covering low- and high-dimensional settings. For several experiments, our approach shows superior performance against existing baselines, while delivering calibrated uncertainty estimates, demonstrating its versatility as a tool for probabilistic prediction.
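As an illustration of what uncertainty-related evaluation of a sample-based predictive distribution can look like, here is a minimal sketch of the continuous ranked probability score (CRPS) estimated from predictive samples. The abstract does not state which metrics the authors use, so the choice of CRPS is an assumption, and the function name `crps_from_samples` is hypothetical.

```python
from typing import Sequence


def crps_from_samples(samples: Sequence[float], observation: float) -> float:
    """Sample-based CRPS estimate for a scalar target (lower is better).

    Uses the identity CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|, with both
    expectations replaced by averages over the predictive samples.
    """
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one predictive sample")
    # Average distance between the samples and the observed value.
    term_obs = sum(abs(x - observation) for x in samples) / m
    # Average pairwise distance between samples (spread of the forecast).
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (m * m)
    return term_obs - 0.5 * term_spread
```

The score rewards forecasts that are both sharp and centered on the outcome: a point mass exactly at the observation scores 0, while samples that are far from the observation or needlessly dispersed score higher. The double loop is quadratic in the number of samples, which is acceptable for a sketch but not for large sample sets.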
Stochastic Approximation Methods for Distortion Risk Measure Optimization
Jiang, Jinyang, Heidergott, Bernd, Hu, Jiaqiao, Peng, Yijie
Distortion Risk Measures (DRMs) capture risk preferences in decision-making and serve as general criteria for managing uncertainty. This paper proposes gradient descent algorithms for DRM optimization based on two dual representations: the Distortion-Measure (DM) form and Quantile-Function (QF) form. The DM-form employs a three-timescale algorithm to track quantiles, compute their gradients, and update decision variables, utilizing the Generalized Likelihood Ratio and kernel-based density estimation. The QF-form provides a simpler two-timescale approach that avoids the need for complex quantile gradient estimation. A hybrid form integrates both approaches, applying the DM-form for robust performance around distortion function jumps and the QF-form for efficiency in smooth regions. Proofs of strong convergence and convergence rates for the proposed algorithms are provided. In particular, the DM-form achieves an optimal rate of $O(k^{-4/7})$, while the QF-form attains a faster rate of $O(k^{-2/3})$. Numerical experiments confirm their effectiveness and demonstrate substantial improvements over baselines in robust portfolio selection tasks. The method's scalability is further illustrated through integration into deep reinforcement learning. Specifically, a DRM-based Proximal Policy Optimization algorithm is developed and applied to multi-echelon dynamic inventory management, showcasing its practical applicability.
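For readers unfamiliar with DRMs, the sketch below evaluates a distortion risk measure on a finite sample of losses. It assumes one common convention, $\rho_g(X) = \int_0^\infty g(P(X > x))\,dx$ for a nonnegative loss $X$ and a nondecreasing distortion $g$ with $g(0)=0$ and $g(1)=1$; the abstract does not spell out its convention, so this is an assumption. The code is a plain plug-in estimator of the objective, not the DM-form or QF-form stochastic approximation algorithms, and all function names are hypothetical.

```python
from typing import Callable, Sequence


def empirical_drm(losses: Sequence[float],
                  distortion: Callable[[float], float]) -> float:
    """Plug-in distortion risk measure of an empirical loss distribution.

    With sorted losses x_(1) <= ... <= x_(n), the estimate is
    sum_i x_(i) * [g((n - i + 1) / n) - g((n - i) / n)].
    """
    n = len(losses)
    if n == 0:
        raise ValueError("need at least one loss sample")
    total = 0.0
    for i, x in enumerate(sorted(losses), start=1):
        # Weight that the distorted tail probability puts on the i-th smallest loss.
        weight = distortion((n - i + 1) / n) - distortion((n - i) / n)
        total += x * weight
    return total


def identity_distortion(u: float) -> float:
    """g(u) = u recovers the ordinary sample mean."""
    return u


def cvar_distortion(alpha: float) -> Callable[[float], float]:
    """g(u) = min(u / (1 - alpha), 1) recovers CVaR at level alpha."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return lambda u: min(u / (1.0 - alpha), 1.0)
```

For the losses 1, 2, 3, 4, the identity distortion returns the mean 2.5, and the CVaR distortion at level 0.5 returns 3.5, the average of the worst half. The kink of the CVaR distortion at $u = 1 - \alpha$ is an example of a non-smooth point in the distortion function, the kind of region where the abstract says the hybrid form falls back on the DM-form.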
Score-based Greedy Search for Structure Identification of Partially Observed Linear Causal Models
Dong, Xinshuai, Ng, Ignavier, Dai, Haoyue, Sun, Jiaqi, Song, Xiangchen, Spirtes, Peter, Zhang, Kun
Identifying the structure of a partially observed causal system is essential to various scientific fields. Recent advances have focused on constraint-based causal discovery to solve this problem, yet in practice these methods often face challenges related to multiple testing and error propagation. These issues could be mitigated by a score-based method, which has drawn considerable attention to the question of whether a score-based greedy search method can handle the partially observed scenario. In this work, we propose the first score-based greedy search method for the identification of structure involving latent variables with identifiability guarantees. Specifically, we propose the Generalized N Factor Model and establish global consistency: the true structure, including latent variables, can be identified up to the Markov equivalence class by using the score. We then design Latent variable Greedy Equivalence Search (LGES), a greedy search algorithm for this class of models with well-defined operators, which searches very efficiently over the graph space to find the optimal structure. Our experiments on both synthetic and real-life data validate the effectiveness of our method (code will be publicly available).
A Trustworthy Industrial Fault Diagnosis Architecture Integrating Probabilistic Models and Large Language Models
Abstract: Addressing the core problem of insufficient trustworthiness in industrial fault diagnosis, stemming from the limitations of existing methods (both traditional and deep-learning-based) in terms of interpretability, generalization, and uncertainty quantification, this paper proposes a trustworthy industrial fault diagnosis architecture, the Hierarchical Cognitive Arbitration Architecture (HCAA), which integrates probabilistic models with Large Language Models (LLMs). The architecture conducts a preliminary analysis via a diagnostic engine based on a Bayesian network and features an LLM-driven cognitive arbitration module with multimodal input capabilities. This module performs expert-level arbitration on the initial diagnosis by analyzing structured features and diagnostic charts, holding the priority to make the final decision upon detecting conflicts. To ensure the reliability of the system's output, the architecture integrates a confidence calibration module based on Temperature Scaling and a risk assessment module, which objectively quantify system trustworthiness using metrics like Expected Calibration Error (ECE). Experimental results on a dataset containing multiple fault types demonstrate that the proposed framework improves diagnostic accuracy by over 28 percentage points compared to baseline models, while the post-calibration ECE is reduced by more than 75%. Case studies confirm that the HCAA effectively corrects misjudgments from traditional models caused by complex feature patterns or knowledge gaps, providing a novel and practical engineering solution for building high-trust, explainable AI diagnostic systems for industrial applications.
Keywords: Industrial Fault Diagnosis; Large Language Model (LLM); Hierarchical Cognitive Arbitration; Probabilistic Model; Confidence Calibration; Trustworthy AI
1. Introduction
With the deep development of Industry 4.0 and smart manufacturing concepts, modern industrial systems are evolving towards high levels of automation and intelligence. In this process, the reliability and safety of equipment have become key factors determining production efficiency and operational costs. Prognostics and Health Management (PHM), as a core technology, plays an indispensable role in improving equipment reliability, reducing unplanned downtime, and optimizing maintenance costs by monitoring equipment status in real-time, diagnosing potential faults, and predicting remaining useful life [1], [2].
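The two calibration ingredients named in the abstract above, Temperature Scaling and Expected Calibration Error, can be illustrated with a minimal sketch. It is a generic textbook version, not the HCAA implementation: the equal-width binning, the default of 10 bins, and the function names are assumptions, and the temperature is taken as given rather than fitted (it is typically tuned on held-out validation data).

```python
import math
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], temperature: float) -> List[float]:
    """Softmax of logits / T; T > 1 softens the distribution, T < 1 sharpens it."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_calibration_error(confidences: Sequence[float],
                               correct: Sequence[bool],
                               n_bins: int = 10) -> float:
    """ECE with equal-width confidence bins.

    Each bin contributes |accuracy - mean confidence|, weighted by the
    fraction of predictions that fall into it.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")
    bins: List[List[int]] = [[] for _ in range(n_bins)]
    for i, c in enumerate(confidences):
        # A confidence of exactly 1.0 goes into the last bin.
        bins[min(int(c * n_bins), n_bins - 1)].append(i)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(confidences[i] for i in members) / len(members)
        accuracy = sum(1 for i in members if correct[i]) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece
```

For example, four predictions made with confidence 0.9, of which three are correct, give an ECE of about 0.15, the gap between the stated confidence and the observed accuracy of 0.75. Dividing logits by a temperature above 1 lowers the top-class confidence without changing the predicted class, so an overconfident model of this kind can be brought closer to its actual accuracy.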
Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling
Dang, Meihua, Han, Jiaqi, Xu, Minkai, Xu, Kai, Srivastava, Akash, Ermon, Stefano
Discrete diffusion models have recently emerged as strong alternatives to autoregressive language models, matching their performance through large-scale training. However, inference-time control remains relatively underexplored. In this work, we study how to steer generation toward desired rewards without retraining the models. Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement. We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity under reward optimization. PG-DLM constructs a Markov chain over full denoising trajectories and applies a conditional sequential Monte Carlo kernel to resample them. We derive theoretical guarantees for convergence, including asymptotic consistency and variance bounds. Within this framework, we further analyze trade-offs across four key axes for inference-time scaling under fixed budgets: iterations, samples, denoising steps, and reward estimation. Our analysis shows scaling iterations achieves the best reward-perplexity trade-off. Empirically, PG-DLM consistently outperforms prior methods using MDLM and LLaDA-8B as base models across a wide range of compute budgets for reward-guided generation tasks including toxicity and sentiment control as well as linguistic acceptability.
A fast non-reversible sampler for Bayesian finite mixture models
Ascolani, Filippo, Zanella, Giacomo
Finite mixtures are a cornerstone of Bayesian modelling, and it is well known that sampling from the resulting posterior distribution can be a hard task. In particular, popular reversible Markov chain Monte Carlo schemes are often slow to converge when the number of observations $n$ is large. In this paper we introduce a novel and simple non-reversible sampling scheme for Bayesian finite mixture models, which is shown to drastically outperform classical samplers in many scenarios of interest, especially during the convergence phase and when components in the mixture have non-negligible overlap. At the theoretical level, we show that the performance of the proposed non-reversible scheme cannot be worse than the standard one, in terms of asymptotic variance, by more than a factor of four; and we provide a scaling limit analysis suggesting that the non-reversible sampler can reduce the convergence time from $O(n^2)$ to $O(n)$. We also discuss why the statistical features of mixture models make them an ideal case for the use of non-reversible discrete samplers.
Rates of Convergence of Generalised Variational Inference Posteriors under Prior Misspecification
Mildner, Terje, Giampouras, Paris, Damoulas, Theodoros
We prove rates of convergence and robustness to prior misspecification within a Generalised Variational Inference (GVI) framework with bounded divergences. This addresses a significant open challenge for GVI and Federated GVI that employ a different divergence to the Kullback--Leibler under prior misspecification, operate within a subset of possible probability measures, and result in intractable posteriors. Our theoretical contributions cover severe prior misspecification while relying on our ability to restrict the space of possible GVI posterior measures, and infer properties based on this space. In particular, we are able to establish sufficient conditions for existence and uniqueness of GVI posteriors on arbitrary Polish spaces, prove that the GVI posterior measure concentrates on a neighbourhood of loss minimisers, and extend this to rates of convergence regardless of the prior measure.