Goto

Collaborating Authors

 Bayesian Learning


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Submitted by Assigned_Reviewer_1 Q1 The authors design and fit a hierarchical Bayesian model for predicting disease trajectories (i.e., a scalar measure of disease severity measured throughout the course of the disease) for individual patients. The overall model is an additive combination of a a number of terms including: (1) a population-level term, (2) a subpopulation term, (3) an individual term, (4) a GP term for structured errors. Each of these terms is a function of time, which is modeled parametrically in terms of the coefficients on pre-defined basis expansions (linear and/or B-splines). The subpopulation term involves a discrete mixture model, and the individual level term is a Bayesian linear regression. Distributions are chosen to be Gaussian, which makes most steps of inference and learning work out nicely.







Social-Inverse: Inverse Decision-making of Social Contagion Management with Task Migrations

Neural Information Processing Systems

Our main contribution is a generic framework, called Social-Inverse, for handling migrations between tasks of diffusion enhancement and diffusion containment. For Social-Inverse, we present theoretical analysis to obtain insights regarding how different contagion management tasks can be subtly correlated in order for samples from one task to help the optimization of another task.


ADMIRE-BayesOpt: Accelerated Data MIxture RE-weighting for Language Models with Bayesian Optimization

arXiv.org Machine Learning

Determining the optimal data mixture for large language model training remains a challenging problem with an outsized impact on performance. In practice, language model developers continue to rely on heuristic exploration since no learning-based approach has emerged as a reliable solution. In this work, we propose to view the selection of training data mixtures as a black-box hyperparameter optimization problem, for which Bayesian Optimization is a well-established class of appropriate algorithms. Firstly, we cast data mixture learning as a sequential decision-making problem, in which we aim to find a suitable trade-off between the computational cost of training exploratory (proxy-) models and final mixture performance. Secondly, we systematically explore the properties of transferring mixtures learned at a small scale to larger-scale experiments, providing insights and highlighting opportunities for research at a modest scale. By proposing Multi-fidelity Bayesian Optimization as a suitable method in this common scenario, we introduce a natural framework to balance experiment cost with model fit, avoiding the risks of overfitting to smaller scales while minimizing the number of experiments at high cost. We present results for pre-training and instruction finetuning across models ranging from 1 million to 7 billion parameters, varying from simple architectures to state-of-the-art models and benchmarks spanning dozens of datasets. We demonstrate consistently strong results relative to a wide range of baselines, resulting inspeed-ups of over 500% in determining the best data mixture on our largest experiments. In addition, we broaden access to research by sharing ADMIRE IFT Runs, a dataset of 460 full training & evaluation runs worth over 13,000 GPU hours, greatly reducing the cost of conducting research in this area.


BaMANI: Bayesian Multi-Algorithm causal Network Inference

arXiv.org Machine Learning

Improved computational power has enabled different disciplines to predict causal relationships among modeled variables using Bayesian network inference. While many alternative algorithms have been proposed to improve the efficiency and reliability of network prediction, the predicted causal networks reflect the generative process but also bear an opaque imprint of the specific computational algorithm used. Following a ``wisdom of the crowds" strategy, we developed an ensemble learning approach to marginalize the impact of a single algorithm on Bayesian causal network inference. To introduce the approach, we first present the theoretical foundation of this framework. Next, we present a comprehensive implementation of the framework in terms of a new software tool called BaMANI (Bayesian Multi-Algorithm causal Network Inference). Finally, we describe a BaMANI use-case from biology, particularly within human breast cancer studies.


Simulation-Based Inference: A Practical Guide

arXiv.org Machine Learning

A central challenge in many areas of science and engineering is to identify model parameters that are consistent with prior knowledge and empirical data. Bayesian inference offers a principled framework for this task, but can be computationally prohibitive when models are defined by stochastic simulators. Simulation-based Inference (SBI) is a suite of methods developed to overcome this limitation, which has enabled scientific discoveries in fields such as particle physics, astrophysics, and neuroscience. The core idea of SBI is to train neural networks on data generated by a simulator, without requiring access to likelihood evaluations. Once trained, inference is amortized: The neural network can rapidly perform Bayesian inference on empirical observations without requiring additional training or simulations. In this tutorial, we provide a practical guide for practitioners aiming to apply SBI methods. We outline a structured SBI workflow and offer practical guidelines and diagnostic tools for every stage of the process -- from setting up the simulator and prior, choosing and training inference networks, to performing inference and validating the results. We illustrate these steps through examples from astrophysics, psychophysics, and neuroscience. This tutorial empowers researchers to apply state-of-the-art SBI methods, facilitating efficient parameter inference for scientific discovery.