Bayesian Learning
PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference
Generalized linear models (GLMs)--such as logistic regression, Poisson regression, and robust regression--provide interpretable models for diverse data types. Probabilistic approaches, particularly Bayesian ones, allow coherent estimates of uncertainty, incorporation of prior information, and sharing of power across experiments via hierarchical models. In practice, however, the approximate Bayesian methods necessary for inference have either failed to scale to large data sets or failed to provide theoretical guarantees on the quality of inference. We propose a new approach based on constructing polynomial approximate sufficient statistics for GLMs (P ASS-GLM). We demonstrate that our method admits a simple algorithm as well as trivial streaming and distributed extensions that do not compound error across computations. We provide theoretical guarantees on the quality of point (MAP) estimates, the approximate posterior, and posterior mean and uncertainty estimates. We validate our approach empirically in the case of logistic regression using a quadratic approximation and show competitive performance with stochastic gradient descent, MCMC, and the Laplace approximation in terms of speed and multiple measures of accuracy--including on an advertising data set with 40 million data points and 20,000 covariates.
Scalable Levy Process Priors for Spectral Kernel Learning
Phillip A. Jang, Andrew Loeb, Matthew Davidow, Andrew G. Wilson
Gaussian processes are rich distributions over functions, with generalization properties determined by a kernel function. When used for long-range extrapolation, predictions are particularly sensitive to the choice of kernel parameters. It is therefore critical to account for kernel uncertainty in our predictive distributions. We propose a distribution over kernels formed by modelling a spectral mixture density with a L evy process. The resulting distribution has support for all stationary covariances--including the popular RBF, periodic, and Mat ern kernels-- combined with inductive biases which enable automatic and data efficient learning, long-range extrapolation, and state of the art predictive performance. The proposed model also presents an approach to spectral regularization, as the L evy process introduces a sparsity-inducing prior over mixture components, allowing automatic selection over model order and pruning of extraneous components. We exploit the algebraic structure of the proposed process for O (n) training and O (1) predictions. We perform extrapolations having reasonable uncertainty estimates on several benchmarks, show that the proposed model can recover flexible ground truth covariances and that it is robust to errors in initialization.
Constructing Deep Neural Networks by Bayesian Network Structure Learning
We introduce a principled approach for unsupervised structure learning of deep neural networks. We propose a new interpretation for depth and inter-layer connectivity where conditional independencies in the input distribution are encoded hierarchically in the network structure. Thus, the depth of the network is determined inherently. The proposed method casts the problem of neural network structure learning as a problem of Bayesian network structure learning. Then, instead of directly learning the discriminative structure, it learns a generative graph, constructs its stochastic inverse, and then constructs a discriminative graph. We prove that conditional-dependency relations among the latent variables in the generative graph are preserved in the class-conditional discriminative graph. We demonstrate on image classification benchmarks that the deepest layers (convolutional and dense) of common networks can be replaced by significantly smaller learned structures, while maintaining classification accuracy---state-of-the-art on tested benchmarks. Our structure learning algorithm requires a small computational cost and runs efficiently on a standard desktop CPU.
MedBayes-Lite: Bayesian Uncertainty Quantification for Safe Clinical Decision Support
Hossain, Elias, Nipu, Md Mehedi Hasan, Sheikh, Maleeha, Rana, Rajib, Neupane, Subash, Yousefi, Niloofar
We propose MedBayes-Lite, a lightweight Bayesian enhancement for transformer-based clinical language models designed to produce reliable, uncertainty-aware predictions. Although transformers show strong potential for clinical decision support, they remain prone to overconfidence, especially in ambiguous medical cases where calibrated uncertainty is critical. MedBayes-Lite embeds uncertainty quantification directly into existing transformer pipelines without any retraining or architectural rewiring, adding no new trainable layers and keeping parameter overhead under 3 percent. The framework integrates three components: (i) Bayesian Embedding Calibration using Monte Carlo dropout for epistemic uncertainty, (ii) Uncertainty-Weighted Attention that marginalizes over token reliability, and (iii) Confidence-Guided Decision Shaping inspired by clinical risk minimization. Across biomedical QA and clinical prediction benchmarks (MedQA, PubMedQA, MIMIC-III), MedBayes-Lite consistently improves calibration and trustworthiness, reducing overconfidence by 32 to 48 percent. In simulated clinical settings, it can prevent up to 41 percent of diagnostic errors by flagging uncertain predictions for human review. These results demonstrate its effectiveness in enabling reliable uncertainty propagation and improving interpretability in medical AI systems.
gfnx: Fast and Scalable Library for Generative Flow Networks in JAX
Tiapkin, Daniil, Agarkov, Artem, Morozov, Nikita, Maksimov, Ian, Tsyganov, Askar, Gritsaev, Timofei, Samsonov, Sergey
In this paper, we present gfnx, a fast and scalable package for training and evaluating Generative Flow Networks (GFlowNets) written in JAX. gfnx provides an extensive set of environments and metrics for benchmarking, accompanied with single-file implementations of core objectives for training GFlowNets. We include synthetic hypergrids, multiple sequence generation environments with various editing regimes and particular reward designs for molecular generation, phylogenetic tree construction, Bayesian structure learning, and sampling from the Ising model energy. Across different tasks, gfnx achieves significant wall-clock speedups compared to Pytorch-based benchmarks (such as torchgfn library) and author implementations. For example, gfnx achieves up to 55 times speedup on CPU-based sequence generation environments, and up to 80 times speedup with the GPU-based Bayesian network structure learning setup. Our package provides a diverse set of benchmarks and aims to standardize empirical evaluation and accelerate research and applications of GFlowNets. The library is available on GitHub (https://github.com/d-tiapkin/gfnx) and on pypi (https://pypi.org/project/gfnx/). Documentation is available on https://gfnx.readthedocs.io.