Coresets for Scalable Bayesian Logistic Regression

Jonathan Huggins, Trevor Campbell, Tamara Broderick

Neural Information Processing Systems 

The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical models, uncertainty quantification, and prior specification they provide. Standard Bayesian inference algorithms are computationally expensive, however, making their direct application to large datasets difficult or infeasible. Recent work on scaling Bayesian inference has focused on modifying the underlying algorithms to, for example, use only a random data subsample at each iteration. We leverage the insight that data is often redundant to instead obtain a weighted subset of the data (called a coreset) that is much smaller than the original dataset. We can then use this small coreset in any number of existing posterior inference algorithms without modification.
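To make the coreset idea concrete, below is a minimal sketch in Python of the general recipe the abstract describes: score each data point by an importance measure, subsample points with probability proportional to their scores, reweight the subsample so the weighted log-likelihood estimates the full-data log-likelihood, and hand the weighted subset to a standard inference routine. The norm-based score, the prior variance, and the MAP step are illustrative assumptions for this sketch, not the paper's sensitivity bounds or evaluation; they simply show how a coreset plugs into unmodified downstream inference.

```python
# Sketch of an importance-sampling coreset for Bayesian logistic regression.
# Assumptions (not from the paper): norm-based importance scores, a Gaussian
# prior with variance 10, and MAP optimization as the downstream inference step.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic logistic regression data standing in for a large dataset.
N, D = 100_000, 5
X = rng.normal(size=(N, D))
theta_true = rng.normal(size=D)
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-X @ theta_true))).astype(float) * 2 - 1  # labels in {-1, +1}

# Crude importance score: feature norm (a placeholder for a proper
# per-point sensitivity bound).
scores = np.linalg.norm(X, axis=1) + 1e-12
probs = scores / scores.sum()

# Sample a coreset of M << N points and compute importance weights so that
# the weighted sum of log-likelihoods is an unbiased estimate of the full sum.
M = 1_000
idx = rng.choice(N, size=M, replace=True, p=probs)
weights = 1.0 / (M * probs[idx])
X_c, y_c = X[idx], y[idx]

def neg_log_posterior(theta, X, y, w, prior_var=10.0):
    """Weighted negative log-posterior; usable by MAP, MCMC, VI, etc."""
    margins = y * (X @ theta)
    log_lik = -np.sum(w * np.logaddexp(0.0, -margins))  # weighted logistic log-likelihood
    log_prior = -0.5 * theta @ theta / prior_var        # isotropic Gaussian prior
    return -(log_lik + log_prior)

# MAP on the coreset as a stand-in for "any existing inference algorithm".
theta_hat = minimize(neg_log_posterior, np.zeros(D), args=(X_c, y_c, weights)).x
print("estimate:", np.round(theta_hat, 2))
print("truth:   ", np.round(theta_true, 2))
```

The key point of the sketch is the last block: the inference routine only ever sees the M weighted points, so any posterior inference algorithm that accepts a weighted log-likelihood can be reused without modification.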