Bayesian Learning
Conditional Independence Estimates for the Generalized Nonparanormal
Shah, Ujas, Lladser, Manuel, Morrison, Rebecca
For general non-Gaussian distributions, the covariance and precision matrices do not encode the independence structure of the variables, as they do for the multivariate Gaussian. This paper builds on previous work to show that for a class of non-Gaussian distributions -- those derived from diagonal transformations of a Gaussian -- information about the conditional independence structure can still be inferred from the precision matrix, provided the data meet certain criteria, analogous to the Gaussian case. We call such transformations of the Gaussian as the generalized nonparanormal. The functions that define these transformations are, in a broad sense, arbitrary. We also provide a simple and computationally efficient algorithm that leverages this theory to recover conditional independence structure from the generalized nonparanormal data. The effectiveness of the proposed algorithm is demonstrated via synthetic experiments and applications to real-world data.
Calibrated and uncertain? Evaluating uncertainty estimates in binary classification models
Grefsrud, Aurora, Blaser, Nello, Buanes, Trygve
Rigorous statistical methods, including parameter estimation with accompanying uncertainties, underpin the validity of scientific discovery, especially in the natural sciences. With increasingly complex data models such as deep learning techniques, uncertainty quantification has become exceedingly difficult and a plethora of techniques have been proposed. In this case study, we use the unifying framework of approximate Bayesian inference combined with empirical tests on carefully created synthetic classification datasets to investigate qualitative properties of six different probabilistic machine learning algorithms for class probability and uncertainty estimation: (i) a neural network ensemble, (ii) neural network ensemble with conflictual loss, (iii) evidential deep learning, (iv) a single neural network with Monte Carlo Dropout, (v) Gaussian process classification and (vi) a Dirichlet process mixture model. We check if the algorithms produce uncertainty estimates which reflect commonly desired properties, such as being well calibrated and exhibiting an increase in uncertainty for out-of-distribution data points. Our results indicate that all algorithms are well calibrated, but none of the deep learning based algorithms provide uncertainties that consistently reflect lack of experimental evidence for out-of-distribution data points. We hope our study may serve as a clarifying example for researchers developing new methods of uncertainty estimation for scientific data-driven modeling.
Bayesian Models for Joint Selection of Features and Auto-Regressive Lags: Theory and Applications in Environmental and Financial Forecasting
Manna, Alokesh, Ghosh, Sujit K.
We develop a Bayesian framework for variable selection in linear regression with autocorrelated errors, accommodating lagged covariates and autoregressive structures. This setting occurs in time series applications where responses depend on contemporaneous or past explanatory variables and persistent stochastic shocks, including financial modeling, hydrological forecasting, and meteorological applications requiring temporal dependency capture. Our methodology uses hierarchical Bayesian models with spike-and-slab priors to simultaneously select relevant covariates and lagged error terms. We propose an efficient two-stage MCMC algorithm separating sampling of variable inclusion indicators and model parameters to address high-dimensional computational challenges. Theoretical analysis establishes posterior selection consistency under mild conditions, even when candidate predictors grow exponentially with sample size, common in modern time series with many potential lagged variables. Through simulations and real applications (groundwater depth prediction, S&P 500 log returns modeling), we demonstrate substantial gains in variable selection accuracy and predictive performance. Compared to existing methods, our framework achieves lower MSPE, improved true model component identification, and greater robustness with autocorrelated noise, underscoring practical utility for model interpretation and forecasting in autoregressive settings.
Nonparametric learning of stochastic differential equations from sparse and noisy data
Ganguly, Arnab, Mitra, Riten, Zhou, Jinpu
The paper proposes a systematic framework for building data-driven stochastic differential equation (SDE) models from sparse, noisy observations. Unlike traditional parametric approaches, which assume a known functional form for the drift, our goal here is to learn the entire drift function directly from data without strong structural assumptions, making it especially relevant in scientific disciplines where system dynamics are partially understood or highly complex. We cast the estimation problem as minimization of the penalized negative log-likelihood functional over a reproducing kernel Hilbert space (RKHS). In the sparse observation regime, the presence of unobserved trajectory segments makes the SDE likelihood intractable. To address this, we develop an Expectation-Maximization (EM) algorithm that employs a novel Sequential Monte Carlo (SMC) method to approximate the filtering distribution and generate Monte Carlo estimates of the E-step objective. The M-step then reduces to a penalized empirical risk minimization problem in the RKHS, whose minimizer is given by a finite linear combination of kernel functions via a generalized representer theorem. To control model complexity across EM iterations, we also develop a hybrid Bayesian variant of the algorithm that uses shrinkage priors to identify significant coefficients in the kernel expansion. We establish important theoretical convergence results for both the exact and approximate EM sequences. The resulting EM-SMC-RKHS procedure enables accurate estimation of the drift function of stochastic dynamical systems in low-data regimes and is broadly applicable across domains requiring continuous-time modeling under observational constraints. We demonstrate the effectiveness of our method through a series of numerical experiments.
Learning with Confidence
We characterize a notion of confidence that arises in learning or updating beliefs: the amount of trust one has in incoming information and its impact on the belief state. This learner's confidence can be used alongside (and is easily mistaken for) probability or likelihood, but it is fundamentally a different concept -- one that captures many familiar concepts in the literature, including learning rates and number of training epochs, Shafer's weight of evidence, and Kalman gain. We formally axiomatize what it means to learn with confidence, give two canonical ways of measuring confidence on a continuum, and prove that confidence can always be represented in this way. Under additional assumptions, we derive more compact representations of confidence-based learning in terms of vector fields and loss functions. These representations induce an extended language of compound "parallel" observations. We characterize Bayes Rule as the special case of an optimizing learner whose loss representation is a linear expectation.
Risk-Based Prognostics and Health Management
Introduction As engineering fields mature, new technologies are emerging that are beginning to serve as the foundation of many societal improvements. For example, modern medical diagnostic equipment provides valuable information that gives medical professionals a better understanding of a patient's needs and ultimately improves quality of life [1]. Improvements to vehicle designs make transportation in cars or aircraft safer and more environmentally friendly [2]. Military equipment continues to be developed that better supports and protects personnel in the field [3]. Manufacturing practices and robotic equipment improve work safety conditions and reduce a product's price point, making amenities available to a wider range of consumers [4]. One approach to maximizing system availability is to incorporate some means of health assessment into the system itself. Doing so is often referred to as "integrated system health management" (ISHM) or "prognostics and health management" (PHM), which has been applied successfully to many complex systems [5]. By integrating health assessment into the very functioning of a system, more information can be obtained that provides a better understanding of the system as a whole, thus allowing system owners to become proactive in how they deal with system degradation. ISHM and PHM promise to focus on system conditions, thus supporting initiatives in what has become known as condition-based maintenance (CBM). This, in turn, enables maintenance events to be initiated based on specific system conditions rather than waiting until a failure occurs [6]. One of the key ingredients of ISHM/PHM is diagnostics, which corresponds to the process of determining the health state of the system based on sets of observations (or tests). Such tests are designed specifically to track system behavior and determine whether or not a failure has occurred. In many cases it is impossible to identify a single fault that explains the observations with certainty. Instead, candidate sets of faults are often indicated, and when using applicable models, probabilities or confidence values are associated with the faults to provide additional information. One historic approach to using test observations for diagnosis is to apply a decision tree - sometimes referred to as a fault tree1 [7].
Fusing Rewards and Preferences in Reinforcement Learning
Khorasani, Sadegh, Salehkaleybar, Saber, Kiyavash, Negar, Grossglauser, Matthias
We present Dual-Feedback Actor (DFA), a reinforcement learning algorithm that fuses both individual rewards and pairwise preferences (if available) into a single update rule. DFA uses the policy's log-probabilities directly to model the preference probability, avoiding a separate reward-modeling step. Preferences can be provided by human-annotators (at state-level or trajectory-level) or be synthesized online from Q-values stored in an off-policy replay buffer. Under a Bradley-Terry model, we prove that minimizing DFA's preference loss recovers the entropy-regularized Soft Actor-Critic (SAC) policy. Our simulation results show that DFA trained on generated preferences matches or exceeds SAC on six control environments and demonstrates a more stable training process. With only a semi-synthetic preference dataset under Bradley-Terry model, our algorithm outperforms reward-modeling reinforcement learning from human feedback (RLHF) baselines in a stochastic GridWorld and approaches the performance of an oracle with true rewards.
Group Fairness Meets the Black Box: Enabling Fair Algorithms on Closed LLMs via Post-Processing
Xian, Ruicheng, Wan, Yuxuan, Zhao, Han
Instruction fine-tuned large language models (LLMs) enable a simple zero-shot or few-shot prompting paradigm, also known as in-context learning, for building prediction models. This convenience, combined with continued advances in LLM capability, has the potential to drive their adoption across a broad range of domains, including high-stakes applications where group fairness -- preventing disparate impacts across demographic groups -- is essential. The majority of existing approaches to enforcing group fairness on LLM-based classifiers rely on traditional fair algorithms applied via model fine-tuning or head-tuning on final-layer embeddings, but they are no longer applicable to closed-weight LLMs under the in-context learning setting, which include some of the most capable commercial models today, such as GPT-4, Gemini, and Claude. In this paper, we propose a framework for deriving fair classifiers from closed-weight LLMs via prompting: the LLM is treated as a feature extractor, and features are elicited from its probabilistic predictions (e.g., token log probabilities) using prompts strategically designed for the specified fairness criterion to obtain sufficient statistics for fair classification; a fair algorithm is then applied to these features to train a lightweight fair classifier in a post-hoc manner. Experiments on five datasets, including three tabular ones, demonstrate strong accuracy-fairness tradeoffs for the classifiers derived by our framework from both open-weight and closed-weight LLMs; in particular, our framework is data-efficient and outperforms fair classifiers trained on LLM embeddings (i.e., head-tuning) or from scratch on raw tabular features.