Bayesian Learning
Inverse Reinforcement Learning using Revealed Preferences and Passive Stochastic Optimization
This monograph, spanning three chapters, explores Inverse Reinforcement Learning (IRL). The first two chapters view IRL through the lens of revealed preferences from microeconomics, while the third chapter studies adaptive IRL via Langevin dynamics stochastic gradient algorithms. Chapter 1 uses classical revealed preference theory (Afriat's theorem and extensions) to identify constrained utility maximizers based on observed agent actions. This allows for the reconstruction of set-valued estimates of an agent's utility. We illustrate this procedure by identifying the presence of a cognitive radar and reconstructing its utility function. The chapter also addresses the construction of a statistical detector for utility maximization behavior when agent actions are corrupted by noise. Chapter 2 studies Bayesian IRL. It investigates how an analyst can determine if an observed agent is a rationally inattentive Bayesian utility maximizer (i.e., simultaneously optimizing its utility and observation likelihood). The chapter discusses inverse stopping-time problems, focusing on reconstructing the continuation and stopping costs of a Bayesian agent operating over a random horizon. We then apply this IRL methodology to identify the presence of a Bayes-optimal sequential detector. Additionally, Chapter 2 provides a concise overview of discrete choice models, inverse Bayesian filtering, and inverse stochastic gradient algorithms for adaptive IRL. Finally, Chapter 3 introduces an adaptive IRL approach utilizing passive Langevin dynamics. This method aims to track time-varying utility functions given noisy and misspecified gradients. In essence, the adaptive IRL algorithms presented in Chapter 3 can be conceptualized as inverse stochastic gradient algorithms, as they learn the utility function in real time while a stochastic gradient algorithm is in operation.
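As a rough illustration of the revealed-preference test behind Afriat's theorem, the sketch below checks the Generalized Axiom of Revealed Preference (GARP), which Afriat's theorem shows is equivalent to the data being consistent with constrained utility maximization. The names `prices`, `bundles`, and `satisfies_garp` are illustrative assumptions using the standard microeconomic setup; they are not taken from the monograph, and the noisy-data detector it describes is not covered here.

```python
import numpy as np

def satisfies_garp(prices, bundles, tol=1e-9):
    """Return True if the observations (p_t, x_t) satisfy GARP.

    prices[t] is the constraint vector p_t, bundles[t] the observed choice x_t.
    Both names are assumptions made for this sketch.
    """
    prices = np.asarray(prices, dtype=float)
    bundles = np.asarray(bundles, dtype=float)
    cost = prices @ bundles.T          # cost[i, j] = p_i . x_j
    own = np.diag(cost)                # own[i] = p_i . x_i
    # x_i is directly revealed preferred to x_j if x_j was affordable at p_i.
    revealed = own[:, None] >= cost - tol
    # Transitive closure of the relation (Warshall's algorithm).
    for k in range(len(own)):
        revealed |= revealed[:, [k]] & revealed[[k], :]
    # strict[j, i]: x_j was chosen although x_i was strictly cheaper at p_j.
    strict = own[:, None] > cost + tol
    # GARP fails if x_i is revealed preferred to x_j while x_j is strictly
    # directly revealed preferred to x_i.
    return not np.any(revealed & strict.T)

consistent = satisfies_garp([[1, 2], [2, 1]], [[2, 1], [1, 2]])
inconsistent = satisfies_garp([[1, 2], [2, 1]], [[1, 2], [2, 1]])
```

In the second dataset each bundle is chosen while the other is strictly cheaper, so no utility function can rationalize both choices.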
Churn-Aware Recommendation Planning under Aggregated Preference Feedback
We study a sequential decision-making problem motivated by recent regulatory and technological shifts that limit access to individual user data in recommender systems (RSs), leaving only population-level preference information. This privacy-aware setting poses fundamental challenges in planning under uncertainty: Effective personalization requires exploration to infer user preferences, yet unsatisfactory recommendations risk immediate user churn. To address this, we introduce the Rec-APC model, in which an anonymous user is drawn from a known prior over latent user types (e.g., personas or clusters), and the decision-maker sequentially selects items to recommend. Feedback is binary -- positive responses refine the posterior via Bayesian updates, while negative responses result in the termination of the session. We prove that optimal policies converge to pure exploitation in finite time and propose a branch-and-bound algorithm to efficiently compute them. Experiments on synthetic and MovieLens data confirm rapid convergence and demonstrate that our method outperforms the POMDP solver SARSOP, particularly when the number of user types is large or comparable to the number of content categories. Our results highlight the applicability of this approach and inspire new ways to improve decision-making under the constraints imposed by aggregated preference data.
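The feedback mechanism described above can be sketched as a Bayesian update over latent user types. This is a minimal sketch under stated assumptions: `like_matrix[t, i]` is a hypothetical table giving the probability that a user of type `t` responds positively to item `i`, and the function names are illustrative. It shows only the belief update after a positive response, not the branch-and-bound planner from the paper.

```python
import numpy as np

def like_probability(belief, like_matrix, item):
    """Probability of a positive response to `item` under the current belief."""
    return float(belief @ like_matrix[:, item])

def posterior_after_like(belief, like_matrix, item):
    """Bayes update of the belief over user types after a positive response."""
    unnormalised = belief * like_matrix[:, item]
    return unnormalised / unnormalised.sum()

# Two user types, two items (all numbers are made up for illustration).
prior = np.array([0.5, 0.5])
like_matrix = np.array([[0.9, 0.2],
                        [0.3, 0.8]])

p_like = like_probability(prior, like_matrix, item=0)
posterior = posterior_after_like(prior, like_matrix, item=0)
```

Here the chance of a positive response to item 0 is 0.6, and after observing one the belief shifts to (0.75, 0.25). A negative response needs no update in this setting because the session ends.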
Bayesian Multiobject Tracking With Neural-Enhanced Motion and Measurement Models
Wei, Shaoxiu, Liang, Mingchao, Meyer, Florian
Multiobject tracking (MOT) is an important task in applications including autonomous driving, ocean sciences, and aerospace surveillance. Traditional MOT methods are model-based and combine sequential Bayesian estimation with data association and an object birth model. More recent methods are fully data-driven and rely on the training of neural networks. Both approaches offer distinct advantages in specific settings. In particular, model-based methods are generally applicable across a wide range of scenarios, whereas data-driven MOT achieves superior performance in scenarios where abundant labeled data for training is available. A natural question is whether a general framework can integrate the two approaches. This paper introduces a hybrid method that utilizes neural networks to enhance specific aspects of the statistical model in Bayesian MOT that have been identified as overly simplistic. By doing so, the performance of the prediction and update steps of Bayesian MOT is improved. To ensure tractable computation, our framework uses belief propagation to avoid high-dimensional operations combined with sequential Monte Carlo methods to perform low-dimensional operations efficiently. The resulting method combines the flexibility and robustness of model-based approaches with the capability of neural networks to learn complex information from data. We evaluate the performance of the proposed method based on the nuScenes autonomous driving dataset and demonstrate that it has state-of-the-art performance.
OBSER: Object-Based Sub-Environment Recognition for Zero-Shot Environmental Inference
Choi, Won-Seok, Han, Dong-Sig, Choi, Suhyung, Yang, Hyeonseo, Zhang, Byoung-Tak
We present the Object-Based Sub-Environment Recognition (OBSER) framework, a novel Bayesian framework that infers three fundamental relationships between sub-environments and their constituent objects. In the OBSER framework, metric and self-supervised learning models estimate the object distributions of sub-environments on the latent space to compute these measures. Both theoretically and empirically, we validate the proposed framework by introducing the (ϵ, δ) statistically separable (EDS) function which indicates the alignment of the representation. Our framework reliably performs inference in open-world and photorealistic environments and outperforms scene-based methods in chained retrieval tasks. The OBSER framework enables zero-shot recognition of environments to achieve autonomous environment understanding.
Return of the Latent Space COWBOYS: Re-thinking the use of VAEs for Bayesian Optimisation of Structured Spaces
Moss, Henry B., Ober, Sebastian W., Diethe, Tom
Bayesian optimisation in the latent space of a Variational AutoEncoder (VAE) is a powerful framework for optimisation tasks over complex structured domains, such as the space of scientifically interesting molecules. However, existing approaches tightly couple the surrogate and generative models, which can lead to suboptimal performance when the latent space is not tailored to specific tasks, which in turn has led to the proposal of increasingly sophisticated algorithms. In this work, we explore a new direction, instead proposing a decoupled approach that trains a generative model and a Gaussian Process (GP) surrogate separately, then combines them via a simple yet principled Bayesian update rule. This separation allows each component to focus on its strengths -- structure generation from the VAE and predictive modelling by the GP. We show that our decoupled approach improves our ability to identify high-potential candidates in molecular optimisation problems under constrained evaluation budgets.
Normalizing Flow to Augmented Posterior: Conditional Density Estimation with Interpretable Dimension Reduction for High Dimensional Data
Zeng, Cheng, Michailidis, George, Iyatomi, Hitoshi, Duan, Leo L
The conditional density characterizes the distribution of a response variable $y$ given a predictor $x$, and plays a key role in many statistical tasks, including classification and outlier detection. Although there has been abundant work on the problem of Conditional Density Estimation (CDE) for a low-dimensional response in the presence of a high-dimensional predictor, little work has been done for a high-dimensional response such as images. The promising performance of normalizing flow (NF) neural networks in unconditional density estimation acts as a motivating starting point. In this work, we extend NF neural networks to the case where an external $x$ is present. Specifically, we use the NF to parameterize a one-to-one transform between a high-dimensional $y$ and a latent $z$ that comprises two components $[z_P,z_N]$. The $z_P$ component is a low-dimensional subvector obtained from the posterior distribution of an elementary predictive model for $x$, such as logistic/linear regression. The $z_N$ component is a high-dimensional independent Gaussian vector, which explains the variations in $y$ that are unrelated or only weakly related to $x$. Unlike existing CDE methods, the proposed approach, coined Augmented Posterior CDE (AP-CDE), only requires a simple modification of the common normalizing flow framework, while significantly improving the interpretation of the latent component, since $z_P$ represents a supervised dimension reduction. In image analytics applications, AP-CDE shows good separation of $x$-related variations due to factors such as lighting condition and subject ID, from the other random variations. Further, the experiments show that an unconditional NF neural network, based on an unsupervised model of $z$, such as a Gaussian mixture, fails to generate interpretable results.
Generative Regression with IQ-BART
O'Hagan, Sean, Ročková, Veronika
Implicit Quantile BART (IQ-BART) posits a non-parametric Bayesian model on the conditional quantile function, acting as a model over a conditional model for $Y$ given $X$. One of the key ingredients is augmenting the observed data $\{(Y_i,X_i)\}_{i=1}^n$ with uniformly sampled values $τ_i$ for $1\leq i\leq n$ which serve as training data for quantile function estimation. Using the fact that the location parameter $μ$ in a $τ$-tilted asymmetric Laplace distribution corresponds to the $τ^{th}$ quantile, we build a check-loss likelihood targeting $μ$ as the parameter of interest. We equip the check-loss likelihood parametrized by $μ=f(X,τ)$ with a BART prior on $f(\cdot)$, allowing the conditional quantile function to vary both in $X$ and $τ$. The posterior distribution over $μ(τ,X)$ can then be distilled for estimation of the entire quantile function as well as for assessing uncertainty through the variation of posterior draws. Simulation-based predictive inference is immediately available through inverse transform sampling using the learned quantile function. The sum-of-trees structure over the conditional quantile function enables flexible distribution-free regression with theoretical guarantees. As a byproduct, we investigate the posterior mean quantile estimator as an alternative to the routine sample (posterior mode) quantile estimator. We demonstrate the power of IQ-BART on time series forecasting datasets where IQ-BART can capture multimodality in predictive distributions that might be otherwise missed using traditional parametric approaches.
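Two ingredients named above, the check loss and inverse transform sampling from a quantile function, can be sketched as follows. This is not the IQ-BART model: `toy_quantile_fn` is a hypothetical closed-form stand-in for a learned $f(X,τ)$ (the quantile function of an exponential distribution with mean `x`), and all function names are assumptions made for illustration.

```python
import numpy as np

def check_loss(residual, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    residual = np.asarray(residual, dtype=float)
    return residual * (tau - (residual < 0))

def toy_quantile_fn(x, tau):
    """Stand-in for a learned quantile function: exponential with mean x."""
    return -x * np.log1p(-tau)

def sample_predictive(quantile_fn, x, n_draws, rng):
    """Inverse transform sampling: push uniform tau through the quantile function."""
    taus = rng.uniform(size=n_draws)
    return quantile_fn(x, taus)

rng = np.random.default_rng(0)
draws = sample_predictive(toy_quantile_fn, x=2.0, n_draws=20_000, rng=rng)
```

With a fitted model, `toy_quantile_fn` would be replaced by posterior draws of the quantile function, so that repeating the sampling across draws also reflects posterior uncertainty.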
Model selection for stochastic dynamics: a parsimonious and principled approach
This thesis focuses on the discovery of stochastic differential equations (SDEs) and stochastic partial differential equations (SPDEs) from noisy and discrete time series. A major challenge is selecting the simplest possible correct model from vast libraries of candidate models, where standard information criteria (AIC, BIC) are often limited. We introduce PASTIS (Parsimonious Stochastic Inference), a new information criterion derived from extreme value theory. Its penalty term, $n_\mathcal{B} \ln(n_0/p)$, explicitly incorporates the size of the initial library of candidate parameters ($n_0$), the number of parameters in the considered model ($n_\mathcal{B}$), and a significance threshold ($p$). This significance threshold represents the probability of selecting a model containing more parameters than necessary when comparing many models. Benchmarks on various systems (Lorenz, Ornstein-Uhlenbeck, Lotka-Volterra for SDEs; Gray-Scott for SPDEs) demonstrate that PASTIS outperforms AIC, BIC, cross-validation (CV), and SINDy (a competing method) in terms of exact model identification and predictive capability. Furthermore, real-world data can be subject to large sampling intervals ($Δt$) or measurement noise ($σ$), which can impair model learning and selection capabilities. To address this, we have developed robust variants of PASTIS, PASTIS-$Δt$ and PASTIS-$σ$, thus extending the applicability of the approach to imperfect experimental data. PASTIS thus provides a statistically grounded, validated, and practical methodological framework for discovering simple models for processes with stochastic dynamics.
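To make the penalty term concrete, the sketch below evaluates $n_\mathcal{B} \ln(n_0/p)$ for an example library. It assumes, as is usual for information criteria, that the penalty is subtracted from a model's maximized log-likelihood; the function names and the numbers are illustrative and not taken from the thesis.

```python
import math

def pastis_penalty(n_params, library_size, p):
    """Penalty n_B * ln(n_0 / p) from the abstract.

    n_params is n_B, library_size is n_0, p is the significance threshold.
    """
    return n_params * math.log(library_size / p)

def penalised_score(log_likelihood, n_params, library_size, p):
    """Assumed form of the criterion: log-likelihood minus the penalty."""
    return log_likelihood - pastis_penalty(n_params, library_size, p)

# Example: 3 retained parameters out of a library of 100, with p = 0.001.
penalty = pastis_penalty(n_params=3, library_size=100, p=0.001)
```

Here each parameter costs $\ln(10^5) \approx 11.5$ log-likelihood units, compared with 1 unit per parameter under AIC, which illustrates how the library size $n_0$ enters the cost of adding a term.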
Transfer Learning in Infinite Width Feature Learning Networks
Lauditi, Clarissa, Bordelon, Blake, Pehlevan, Cengiz
We develop a theory of transfer learning in infinitely wide neural networks where both the pretraining (source) and downstream (target) task can operate in a feature learning regime. We analyze both the Bayesian framework, where learning is described by a posterior distribution over the weights, and gradient flow training of randomly initialized networks trained with weight decay. Both settings track how representations evolve in both source and target tasks. The summary statistics of these theories are adapted feature kernels which, after transfer learning, depend on data and labels from both source and target tasks. Reuse of features during transfer learning is controlled by an elastic weight coupling which controls the reliance of the network on features learned during training on the source task. We apply our theory to linear and polynomial regression tasks as well as real datasets. Our theory and experiments reveal interesting interplays between elastic weight coupling, feature learning strength, dataset size, and source and target task alignment on the utility of transfer learning.
AL-SPCE -- Reliability analysis for nondeterministic models using stochastic polynomial chaos expansions and active learning
Pires, A., Moustapha, M., Marelli, S., Sudret, B.
Reliability analysis typically relies on deterministic simulators, which yield repeatable outputs for identical inputs. However, many real-world systems display intrinsic randomness, requiring stochastic simulators whose outputs are random variables. This inherent variability must be accounted for in reliability analysis. While Monte Carlo methods can handle this, their high computational cost is often prohibitive. To address this, stochastic emulators have emerged as efficient surrogate models capable of capturing the random response of simulators at reduced cost. Although promising, current methods still require large training sets to produce accurate reliability estimates, which limits their practicality for expensive simulations. This work introduces an active learning framework to further reduce the computational burden of reliability analysis using stochastic emulators. We focus on stochastic polynomial chaos expansions (SPCE) and propose a novel learning function that targets regions of high predictive uncertainty relevant to failure probability estimation. To quantify this uncertainty, we exploit the asymptotic normality of the maximum likelihood estimator. The resulting method, named active learning stochastic polynomial chaos expansions (AL-SPCE), is applied to three test cases. Results demonstrate that AL-SPCE maintains high accuracy in reliability estimates while significantly improving efficiency compared to conventional surrogate-based methods and direct Monte Carlo simulation. This confirms the potential of active learning in enhancing the practicality of stochastic reliability analysis for complex, computationally expensive models.