Bayesian Inference
Bayesian Quantum Orthogonal Neural Networks for Anomaly Detection
Mathur, Natansh, Coyle, Brian, Jain, Nishant, Raj, Snehal, Tandon, Akshat, Krauser, Jasper Simon, Stoessel, Rainer
Identification of defects or anomalies in 3D objects is a crucial task to ensure correct functionality. In this work, we combine Bayesian learning with recent developments in quantum and quantum-inspired machine learning, specifically orthogonal neural networks, to tackle this anomaly detection problem for an industrially relevant use case. Bayesian learning enables uncertainty quantification of predictions, while orthogonality in weight matrices enables smooth training. We develop orthogonal (quantum) versions of 3D convolutional neural networks and show that these models can successfully detect anomalies in 3D objects. To test the feasibility of incorporating quantum computers into a quantum-enhanced anomaly detection pipeline, we perform hardware experiments with our models on IBM's 127-qubit Brisbane device, testing the effect of noise and limited measurement shots.
A Dictionary of Closed-Form Kernel Mean Embeddings
Briol, François-Xavier, Gessner, Alexandra, Karvonen, Toni, Mahsereci, Maren
Kernel mean embeddings -- integrals of a kernel with respect to a probability distribution -- are essential in Bayesian quadrature, but also widely used in other computational tools for numerical integration or for statistical inference based on the maximum mean discrepancy. These methods often require, or are enhanced by, the availability of a closed-form expression for the kernel mean embedding. However, deriving such expressions can be challenging, limiting the applicability of kernel-based techniques when practitioners do not have access to a closed-form embedding. This paper addresses this limitation by providing a comprehensive dictionary of known kernel mean embeddings, along with practical tools for deriving new embeddings from known ones. We also provide a Python library that includes minimal implementations of the embeddings.
Factor Analysis with Correlated Topic Model for Multi-Modal Data
Łazęcka, Małgorzata, Szczurek, Ewa
Integrating various data modalities brings valuable insights into underlying phenomena. Multimodal factor analysis (FA) uncovers shared axes of variation underlying different simple data modalities, where each sample is represented by a vector of features. However, FA is not suited for structured data modalities, such as text or single cell sequencing data, where multiple data points are measured per each sample and exhibit a clustering structure. To overcome this challenge, we introduce FACTM, a novel, multi-view and multi-structure Bayesian model that combines FA with correlated topic modeling and is optimized using variational inference. Additionally, we introduce a method for rotating latent factors to enhance interpretability with respect to binary features. On text and video benchmarks as well as real-world music and COVID-19 datasets, we demonstrate that FACTM outperforms other methods in identifying clusters in structured data, and integrating them with simple modalities via the inference of shared, interpretable factors.
A Unified MDL-based Binning and Tensor Factorization Framework for PDF Estimation
Musab, Mustafa, Chege, Joseph K., Yeredor, Arie, Haardt, Martin
Reliable density estimation is fundamental for numerous applications in statistics and machine learning. In many practical scenarios, data are best modeled as mixtures of component densities that capture complex and multimodal patterns. However, conventional density estimators based on uniform histograms often fail to capture local variations, especially when the underlying distribution is highly nonuniform. Furthermore, the inherent discontinuity of histograms poses challenges for tasks requiring smooth derivatives, such as gradient-based optimization, clustering, and nonparametric discriminant analysis. In this work, we present a novel non-parametric approach for multivariate probability density function (PDF) estimation that utilizes minimum description length (MDL)-based binning with quantile cuts. Our approach builds upon tensor factorization techniques, leveraging the canonical polyadic decomposition (CPD) of a joint probability tensor. We demonstrate the effectiveness of our method on synthetic data and a challenging real dry bean classification dataset.
Numerical Generalized Randomized Hamiltonian Monte Carlo for piecewise smooth target densities
Tran, Jimmy Huy, Kleppe, Tore Selland
Traditional gradient-based sampling methods, like standard Hamiltonian Monte Carlo, require that the desired target distribution is continuous and differentiable. This limits the types of models one can define, although the presented models capture the reality in the observations better. In this project, Generalized Randomized Hamiltonian Monte Carlo (GRHMC) processes for sampling continuous densities with discontinuous gradient and piecewise smooth targets are proposed. The methods combine the advantages of Hamiltonian Monte Carlo methods with the nature of continuous time processes in the form of piecewise deterministic Markov processes to sample from such distributions. It is argued that the techniques lead to GRHMC processes that admit the desired target distribution as the invariant distribution in both scenarios. Simulation experiments verifying this fact and several relevant real-life models are presented, including a new parameterization of the spike and slab prior for regularized linear regression that returns sparse coefficient estimates and a regime switching volatility model.
Doubly Adaptive Social Learning
Carpentiero, Marco, Bordignon, Virginia, Matta, Vincenzo, Sayed, Ali H.
In social learning, a network of agents assigns probability scores (beliefs) to some hypotheses of interest, which rule the generation of local streaming data observed by each agent. Belief formation takes place by means of an iterative two-step procedure where: i) the agents update locally their beliefs by using some likelihood model; and ii) the updated beliefs are combined with the beliefs of the neighboring agents, using a pooling rule. This procedure can fail to perform well in the presence of dynamic drifts, leading the agents to incorrect decision making. Here, we focus on the fully online setting where both the true hypothesis and the likelihood models can change over time. This goal is achieved by exploiting two adaptation stages: i) a stochastic gradient descent update to learn and track the drifts in the decision model; ii) and an adaptive belief update to track the true hypothesis changing over time. These stages are controlled by two adaptation parameters that govern the evolution of the error probability for each agent. We show that all agents learn consistently for sufficiently small adaptation parameters, in the sense that they ultimately place all their belief mass on the true hypothesis. Index T erms Social learning, belief formation, decision making, distributed optimization, online leaerning, opinion diffusion over graphs. Marco Carpentiero and Vincenzo Matta are with the Department of Information and Electrical Engineering and Applied Mathematics (DIEM), University of Salerno, via Giovanni Paolo II, I-84084, Fisciano (SA), Italy, and Vincenzo Matta is also with the National Inter-University Consortium for Telecommunications (CNIT), Italy (e-mails: { mcarpentiero, vmatta }@unisa.it). Matta was partially supported by the European Union under the Italian National Recovery and Resilience Plan (NRRP) of NextGenerationEU, partnership on "Telecommunications of the Future" (PE00000001 - program "REST ART"). This work was produced while Virginia Bordignon was a post-doc with the Ecole Polytechnique F ed erale de Lausanne EPFL, School of Engineering, CH-1015 Lausanne, Switzerland (e-mail: virginia.bordignon@alumni.epfl.ch).
A Robust Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics
Ye, Qihao, Tian, Xiaochuan, Zhu, Yuhua
This paper develops a model-based framework for continuous-time policy evaluation (CTPE) in reinforcement learning, incorporating both Brownian and L evy noise to model stochastic dynamics influenced by rare and extreme events. Our approach formulates the policy evaluation problem as solving a partial integro-differential equation (PIDE) for the value function with unknown coefficients. A key challenge in this setting is accurately recovering the unknown coefficients in the stochastic dynamics, particularly when driven by L evy processes with heavy tail effects. To address this, we propose a robust numerical approach that effectively handles both unbiased and censored trajectory datasets. This method combines maximum likelihood estimation with an iterative tail correction mechanism, improving the stability and accuracy of coefficient recovery. Additionally, we establish a theoretical bound for the policy evaluation error based on coefficient recovery error. Through numerical experiments, we demonstrate the effectiveness and robustness of our method in recovering heavy-tailed L evy dynamics and verify the theoretical error analysis in policy evaluation.
Evaluating Uncertainty in Deep Gaussian Processes
van der Lende, Matthijs, Ferrao, Jeremias Lino, Müller-Hof, Niclas
Reliable uncertainty estimates are crucial in modern machine learning. Deep Gaussian Processes (DGPs) and Deep Sigma Point Processes (DSPPs) extend GPs hierarchically, offering promising methods for uncertainty quantification grounded in Bayesian principles. However, their empirical calibration and robustness under distribution shift relative to baselines like Deep Ensembles remain understudied. This work evaluates these models on regression (CASP dataset) and classification (ESR dataset) tasks, assessing predictive performance (MAE, Accu- racy), calibration using Negative Log-Likelihood (NLL) and Expected Calibration Error (ECE), alongside robustness under various synthetic feature-level distribution shifts. Results indicate DSPPs provide strong in-distribution calibration leveraging their sigma point approximations. However, compared to Deep Ensembles, which demonstrated superior robustness in both per- formance and calibration under the tested shifts, the GP-based methods showed vulnerabilities, exhibiting particular sensitivity in the observed metrics. Our findings underscore ensembles as a robust baseline, suggesting that while deep GP methods offer good in-distribution calibration, their practical robustness under distribution shift requires careful evaluation. To facilitate reproducibility, we make our code available at https://github.com/matthjs/xai-gp.
Mining Software Repositories for Expert Recommendation
Marshall, Chad, Barovic, Andrew, Moin, Armin
--We propose an automated approach to bug assignment to developers in large open-source software projects. This way, we assist human bug triagers who are in charge of finding the best developer with the right level of expertise in a particular area to be assigned to a newly reported issue. Our approach is based on the history of software development as documented in the issue tracking systems. Our approach works based on the bug reports' features, such as the corresponding products and components, as well as their priority and severity levels. We sort developers based on their experience with specific combinations of new reports. The evaluation is performed using T op-k accuracy, and the results are compared with the reported results in prior work, namely T opicMiner MTM, BUGZIE, Bug triaging via deep Reinforcement Learning BT -RL, and LDA-SVM. The evaluation data come from various Eclipse and Mozilla projects, such as JDT, Firefox, and Thunderbird. Large open-source projects offer an issue tracking system or open bug repository, where developers and users can report the software defects they find or any new feature requests they may have. These reports are called bug reports or issues . In some cases, developers can volunteer to work on the reported issues they find interesting or relevant to their field of expertise. Additionally, they sometimes report issues and assign them to themselves. However, in many cases, particularly in large open-source projects, a group of developers, called bug triagers, decide who should process and fix a newly reported issue.
Learning Energy-Based Generative Models via Potential Flow: A Variational Principle Approach to Probability Density Homotopy Matching
Loo, Junn Yong, Adeline, Michelle, Lau, Julia Kaiwen, Leong, Fang Yu, Tew, Hwa Hui, Pal, Arghya, Baskaran, Vishnu Monn, Ting, Chee-Ming, Phan, Raphaël C. -W.
Energy-based models (EBMs) are a powerful class of probabilistic generative models due to their flexibility and interpretability. However, relationships between potential flows and explicit EBMs remain underexplored, while contrastive divergence training via implicit Markov chain Monte Carlo (MCMC) sampling is often unstable and expensive in high-dimensional settings. In this paper, we propose Variational Potential Flow Bayes (VPFB), a new energy-based generative framework that eliminates the need for implicit MCMC sampling and does not rely on auxiliary networks or cooperative training. VPFB learns an energy-parameterized potential flow by constructing a flow-driven density homotopy that is matched to the data distribution through a variational loss minimizing the Kullback-Leibler divergence between the flow-driven and marginal homotopies. This principled formulation enables robust and efficient generative modeling while preserving the interpretability of EBMs. Experimental results on image generation, interpolation, out-of-distribution detection, and compositional generation confirm the effectiveness of VPFB, showing that our method performs competitively with existing approaches in terms of sample quality and versatility across diverse generative modeling tasks. 1 1 Introduction