Bayesian Learning
Super-fast rates of convergence for Neural Networks Classifiers under the Hard Margin Condition
Tepakbong, Nathanael, Zhou, Ding-Xuan, Zhou, Xiang
We study the classical binary classification problem for hypothesis spaces of Deep Neural Networks (DNNs) with ReLU activation under Tsybakov's low-noise condition with exponent $q>0$, and its limit-case $q\to\infty$ which we refer to as the "hard-margin condition". We show that DNNs which minimize the empirical risk with square loss surrogate and $\ell_p$ penalty can achieve finite-sample excess risk bounds of order $\mathcal{O}\left(n^{-α}\right)$ for arbitrarily large $α>0$ under the hard-margin condition, provided that the regression function $η$ is sufficiently smooth. The proof relies on a novel decomposition of the excess risk which might be of independent interest.
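For reference, one common way to write the two noise conditions named in this abstract is given below. This is a sketch under assumptions: it takes $η(x) = \mathbb{P}(Y=1 \mid X=x)$, and the constants $C$, $t_0$ and $δ$ are notation introduced here; the paper's own normalization may differ.

```latex
% Tsybakov low-noise condition with exponent q > 0 (constants C, t_0 > 0 are assumed notation):
\[
  \mathbb{P}_X\bigl( |2\eta(X) - 1| \le t \bigr) \le C\, t^{q}
  \qquad \text{for all } 0 < t \le t_0 .
\]
% Hard-margin condition (the formal limit q \to \infty), for some \delta > 0:
\[
  |2\eta(X) - 1| \ge \delta \quad \text{almost surely.}
\]
```

Under the second condition the regression function stays a fixed distance away from the decision threshold $1/2$, which is the regime in which the abstract states its $n^{-α}$ bounds.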
High-dimensional Bayesian Tobit regression for censored response with Horseshoe prior
Censored response variables, where outcomes are only partially observed due to known bounds, arise in numerous scientific domains and present serious challenges for regression analysis. The Tobit model, a classical solution for handling left-censoring, has been widely used in economics and beyond. However, with the increasing prevalence of high-dimensional data, where the number of covariates exceeds the sample size, traditional Tobit methods become inadequate. While frequentist approaches for high-dimensional Tobit regression have recently been developed, notably through Lasso-based estimators, the Bayesian literature remains sparse and lacks theoretical guarantees. In this work, we propose a novel Bayesian framework for high-dimensional Tobit regression that addresses both censoring and sparsity. Our method leverages the Horseshoe prior to induce shrinkage and employs a data augmentation strategy to facilitate efficient posterior computation via Gibbs sampling. We establish posterior consistency and derive concentration rates under sparsity, providing the first theoretical results for Bayesian Tobit models in high dimensions. Numerical experiments show that our approach compares favorably with the recent Lasso-Tobit method. Our method is implemented in the R package tobitbayes, which can be found on GitHub.
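The data augmentation idea mentioned in this abstract can be illustrated with a minimal sketch of the latent-imputation step of a generic Tobit Gibbs sampler. It is not the tobitbayes implementation and omits the Horseshoe updates entirely; the function names, the left-censoring bound of 0, and the toy inputs are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def sample_censored_latent(mean, sigma, bound, rng):
    """Draw y* ~ N(mean, sigma^2) truncated to (-inf, bound] by inverse-CDF sampling."""
    dist = NormalDist(mean, sigma)
    upper = dist.cdf(bound)                 # probability mass below the bound
    u = rng.uniform(0.0, upper)             # uniform draw on the truncated CDF range
    u = min(max(u, 1e-12), 1.0 - 1e-12)     # keep u inside the open interval (0, 1)
    return min(dist.inv_cdf(u), bound)      # numerical guard: never exceed the bound


def augment(y, means, sigma, bound, rng):
    """Replace each left-censored response (y_i <= bound) by a latent draw.

    `means` plays the role of the current linear predictor x_i' beta
    in one Gibbs sweep (hypothetical values in the example below).
    """
    return [
        sample_censored_latent(m, sigma, bound, rng) if yi <= bound else yi
        for yi, m in zip(y, means)
    ]


rng = random.Random(0)
y_obs = [0.0, 1.5, 0.0, 2.0]        # toy responses, censored at 0
lin_pred = [-0.5, 1.0, 0.3, 2.0]    # toy linear predictors
y_latent = augment(y_obs, lin_pred, sigma=1.0, bound=0.0, rng=rng)
```

Once the censored entries are imputed, the remaining conditional updates can treat `y_latent` as an ordinary Gaussian response, which is what makes Gibbs sampling convenient in this setting.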
Bayesian Estimation of Causal Effects Using Proxies of a Latent Interference Network
Network interference occurs when treatments assigned to some units affect the outcomes of others. Traditional approaches often assume that the observed network correctly specifies the interference structure. However, in practice, researchers frequently only have access to proxy measurements of the interference network due to limitations in data collection or potential mismatches between measured networks and actual interference pathways. In this paper, we introduce a framework for estimating causal effects when only proxy networks are available. Our approach leverages a structural causal model that accommodates diverse proxy types, including noisy measurements, multiple data sources, and multilayer networks, and defines causal effects as interventions on population-level treatments. Since the true interference network is latent, estimation poses significant challenges. To overcome them, we develop a Bayesian inference framework. We propose a Block Gibbs sampler with Locally Informed Proposals to update the latent network, thereby efficiently exploring the high-dimensional posterior space composed of both discrete and continuous parameters. We illustrate the performance of our method through numerical experiments, demonstrating its accuracy in recovering causal effects even when only proxies of the interference network are available.
Diffusion-based supervised learning of generative models for efficient sampling of multimodal distributions
Tran, Hoang, Zhang, Zezhong, Bao, Feng, Lu, Dan, Zhang, Guannan
We propose a hybrid generative model for efficient sampling of high-dimensional, multimodal probability distributions for Bayesian inference. Traditional Monte Carlo methods, such as the Metropolis-Hastings and Langevin Monte Carlo sampling methods, are effective for sampling from single-mode distributions in high-dimensional spaces. However, these methods struggle to produce samples with the correct proportions for each mode in multimodal distributions, especially for distributions with well-separated modes. To address the challenges posed by multimodality, we adopt a divide-and-conquer strategy. We start by minimizing the energy function with initial guesses uniformly distributed within the prior domain to identify all the modes of the energy function. Then, we train a classifier to segment the domain corresponding to each mode. After the domain decomposition, we train a diffusion-model-assisted generative model for each identified mode within its support. Once each mode is characterized, we employ bridge sampling to estimate the normalizing constant, allowing us to directly adjust the ratios between the modes. Our numerical examples demonstrate that the proposed framework can effectively handle multimodal distributions with varying mode shapes in up to 100 dimensions. An application to a Bayesian inverse problem for partial differential equations is also provided.
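The first step of the divide-and-conquer strategy, multi-start energy minimization to locate the modes, can be sketched on a toy problem. The one-dimensional double-well energy, the plain gradient-descent optimizer, and the rounding-based merging of minima are all assumptions chosen for illustration; the classifier, the diffusion-based generators, and the bridge-sampling step are not shown.

```python
import random


def energy_grad(x):
    # Gradient of the toy double-well energy E(x) = (x^2 - 1)^2,
    # which has minima (modes of exp(-E)) at x = -1 and x = +1.
    return 4.0 * x * (x * x - 1.0)


def descend(x, step=0.01, iters=2000):
    # Plain gradient descent from a single initial guess.
    for _ in range(iters):
        x -= step * energy_grad(x)
    return x


def find_modes(n_starts=20, low=-2.0, high=2.0, seed=0):
    # Initial guesses are drawn uniformly from the (assumed) prior domain [low, high].
    rng = random.Random(seed)
    minima = [descend(rng.uniform(low, high)) for _ in range(n_starts)]
    # Merge minima that agree to two decimals into one mode.
    return sorted({round(m, 2) for m in minima})


modes = find_modes()
```

In higher dimensions the merging step would need a proper clustering rule, and the number of starts needed to hit every basin of attraction grows with the number and shape of the modes.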
ConDiSim: Conditional Diffusion Models for Simulation Based Inference
Nautiyal, Mayank, Hellander, Andreas, Singh, Prashant
Statistical inference of model parameters from empirical observations is a fundamental challenge in scientific research, enabling researchers to derive meaningful insights from complex simulation models. These parameters govern the behavior of simulators that replicate real-world phenomena, providing a bridge between theoretical constructs and empirical observations [Lavin et al., 2021]. Calibrating these parameters to ensure that simulator outputs align with observed data constitutes an inverse problem, formally defined within the framework of simulation-based inference (SBI) [Cranmer et al., 2020]. Solving this inverse problem involves addressing uncertainties arising from model stochasticity and potential multi-valuedness, where different sets of parameter values can produce similar observations or similar parameters may lead to varied outputs. Additionally, parameter inference becomes increasingly complex when simulators operate as 'black boxes' with intractable likelihood functions, rendering traditional likelihood-based Bayesian methods impractical [Sisson et al., 2018].
Contrastive Normalizing Flows for Uncertainty-Aware Parameter Estimation
Elsharkawy, Ibrahim, Kahn, Yonatan
Estimating physical parameters from data is a crucial application of machine learning (ML) in the physical sciences. However, systematic uncertainties, such as detector miscalibration, induce data distribution distortions that can erode statistical precision. In both high-energy physics (HEP) and broader ML contexts, achieving uncertainty-aware parameter estimation under these domain shifts remains an open problem. In this work, we address this challenge of uncertainty-aware parameter estimation for a broad set of tasks critical for HEP. We introduce a novel approach based on Contrastive Normalizing Flows (CNFs), which achieves top performance on the HiggsML Uncertainty Challenge dataset. Building on the insight that a binary classifier can approximate the model parameter likelihood ratio, we address the practical limitations of expressivity and the high cost of simulating high-dimensional parameter grids by embedding data and parameters in a learned CNF mapping. This mapping yields a tunable contrastive distribution that enables robust classification under shifted data distributions. Through a combination of theoretical analysis and empirical evaluations, we demonstrate that CNFs, when coupled with a classifier and established frequentist techniques, provide principled parameter estimation and uncertainty quantification through classification that is robust to data distribution distortions.
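The insight that a binary classifier can approximate a likelihood ratio rests on a standard identity, written out below. It assumes balanced classes and a Bayes-optimal classifier $s^*$ trained to separate samples drawn at parameter value $θ_0$ (label 1) from samples drawn at $θ_1$ (label 0); the symbols $s^*$, $r$, $θ_0$ and $θ_1$ are notation introduced here, and the CNF construction itself is not shown.

```latex
% Optimal classifier output under balanced classes:
\[
  s^*(x) = \frac{p(x \mid \theta_0)}{p(x \mid \theta_0) + p(x \mid \theta_1)} .
\]
% Solving for the likelihood ratio:
\[
  r(x) = \frac{p(x \mid \theta_0)}{p(x \mid \theta_1)} = \frac{s^*(x)}{1 - s^*(x)} .
\]
```

In practice the trained classifier only approximates $s^*$, so the recovered ratio inherits its errors, which is one reason robustness under shifted data distributions matters.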
Modular Federated Learning: A Meta-Framework Perspective
Vicente, Frederico, Soares, Cláudia, Jakovetić, Dušan
Federated Learning (FL) enables distributed machine learning training while preserving privacy, representing a paradigm shift for data-sensitive and decentralized environments. Despite its rapid advancements, FL remains a complex and multifaceted field, requiring a structured understanding of its methodologies, challenges, and applications. In this survey, we introduce a meta-framework perspective, conceptualising FL as a composition of modular components that systematically address core aspects such as communication, optimisation, security, and privacy. We provide a historical contextualisation of FL, tracing its evolution from distributed optimisation to modern distributed learning paradigms. Additionally, we propose a novel taxonomy distinguishing Aggregation from Alignment, introducing the concept of alignment as a fundamental operator alongside aggregation. To bridge theory with practice, we explore available FL frameworks in Python, facilitating real-world implementation. Finally, we systematise key challenges across FL sub-fields, providing insights into open research questions throughout the meta-framework modules. By structuring FL within a meta-framework of modular components and emphasising the dual role of Aggregation and Alignment, this survey provides a holistic and adaptable foundation for understanding and advancing FL research and deployment.
A Sparse Bayesian Learning Algorithm for Estimation of Interaction Kernels in Motsch-Tadmor Model
In this paper, we investigate the data-driven identification of asymmetric interaction kernels in the Motsch-Tadmor model based on observed trajectory data. The model under consideration is governed by a class of semilinear evolution equations, where the interaction kernel defines a normalized, state-dependent Laplacian operator that governs collective dynamics. To address the resulting nonlinear inverse problem, we propose a variational framework that reformulates kernel identification using the implicit form of the governing equations, reducing it to a subspace identification problem. We establish an identifiability result that characterizes conditions under which the interaction kernel can be uniquely recovered up to scale. To solve the inverse problem robustly, we develop a sparse Bayesian learning algorithm that incorporates informative priors for regularization, quantifies uncertainty, and enables principled model selection. Extensive numerical experiments on representative interacting particle systems demonstrate the accuracy, robustness, and interpretability of the proposed framework across a range of noise levels and data regimes.
Generalization Bounds and Stopping Rules for Learning with Self-Selected Data
Rodemann, Julian, Bailie, James
Many learning paradigms self-select training data in light of previously learned parameters. Examples include active learning, semi-supervised learning, bandits, and boosting. Rodemann et al. (2024) unify them under the framework of "reciprocal learning". In this article, we address the question of how well these methods can generalize from their self-selected samples. In particular, we prove universal generalization bounds for reciprocal learning using covering numbers and Wasserstein ambiguity sets. Our results require no assumptions on the distribution of self-selected data, only verifiable conditions on the algorithms. We prove results for both convergent and finite iteration solutions. The latter are anytime valid, thereby giving rise to stopping rules for a practitioner seeking to guarantee the out-of-sample performance of their reciprocal learning algorithm. Finally, we illustrate our bounds and stopping rules for reciprocal learning's special case of semi-supervised learning.
A Data-Driven Probabilistic Framework for Cascading Urban Risk Analysis Using Bayesian Networks
Kumar, Chunduru Rohith, Shanmuk, PHD Surya, Srinivas, Prabhala Naga, Lankalapalli, Sri Venkatesh, Dwibedy, Debasis
The increasing complexity of cascading risks in urban systems necessitates robust, data-driven frameworks to model interdependencies across multiple domains. This study presents a foundational Bayesian network-based approach for analyzing cross-domain risk propagation across key urban domains, including air, water, electricity, agriculture, health, infrastructure, weather, and climate. Directed Acyclic Graphs (DAGs) are constructed using Bayesian Belief Networks (BBNs), with structure learning guided by Hill-Climbing search optimized through Bayesian Information Criterion (BIC) and K2 scoring. The framework is trained on a hybrid dataset that combines real-world urban indicators with synthetically generated data from Generative Adversarial Networks (GANs), and is further balanced using the Synthetic Minority Over-sampling Technique (SMOTE). Conditional Probability Tables (CPTs) derived from the learned structures enable interpretable probabilistic reasoning and quantify the likelihood of cascading failures. The results identify key intra- and inter-domain risk factors and demonstrate the framework's utility for proactive urban resilience planning. This work establishes a scalable, interpretable foundation for cascading risk assessment and serves as a basis for future empirical research in this emerging interdisciplinary field.
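To make the BIC-guided structure search concrete, the following is a minimal sketch of the decomposable BIC score for one discrete node and of the comparison a hill-climbing step would make when considering a single edge. The variable names (`power_outage`, `water_failure`) and the toy records are hypothetical and are not taken from the study's dataset; K2 scoring, GAN-generated data, and SMOTE balancing are not shown.

```python
import math
from collections import Counter


def bic_node(data, child, parents):
    """BIC contribution of `child` given `parents` for discrete data (list of dicts).

    score = maximized log-likelihood - 0.5 * (#free parameters) * log(N)
    """
    n = len(data)
    # Counts of (parent configuration, child state) and of parent configurations.
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * math.log(c / parent_counts[cfg]) for (cfg, _), c in joint.items())

    # Free parameters: (child states - 1) per parent configuration.
    n_parent_cfgs = 1
    for p in parents:
        n_parent_cfgs *= len({row[p] for row in data})
    n_child_states = len({row[child] for row in data})
    n_params = (n_child_states - 1) * n_parent_cfgs
    return loglik - 0.5 * n_params * math.log(n)


# Toy records in which the second variable copies the first (illustrative only).
records = [{"power_outage": i % 2, "water_failure": i % 2} for i in range(40)]

score_without_edge = bic_node(records, "water_failure", [])
score_with_edge = bic_node(records, "water_failure", ["power_outage"])
# A hill-climbing step keeps the edge only if it raises the score.
keep_edge = score_with_edge > score_without_edge
```

Because the score decomposes over nodes, a hill-climbing search only needs to rescore the node whose parent set changes, which keeps each candidate move cheap.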