Utilising Gradient-Based Proposals Within Sequential Monte Carlo Samplers for Training of Partial Bayesian Neural Networks

Andrew Millard, Joshua Murphy, Simon Maskell, Zheng Zhao

Previous research has shown the benefits Bayesian methods can bring to certain problems within deep learning (Gal et al., 2017). However, computing the exact posterior distribution of a Bayesian neural network (BNN) is difficult: traditional methods such as Markov chain Monte Carlo (MCMC) (Hastings, 1970) are computationally ill-suited to exploring high-dimensional spaces and to handling large amounts of data. Parametric methods such as variational inference cope better with these difficulties, but only yield an approximation to the posterior distribution. BNN posteriors have been found to be highly complex (Izmailov et al., 2021a), so variational methods often approximate them poorly.

Sequential Monte Carlo (SMC) samplers (Doucet et al., 2001) are an alternative to MCMC methods that also provide an empirical estimate of the posterior distribution. SMC samplers are inherently parallelisable (Varsi et al., 2021b) and can therefore exploit the GPU resources commonly used in machine learning to speed up training. MCMC methods often require a warm-up period to adapt their hyperparameters, after which the chains can be parallelised; however, the hyperparameters must remain fixed after this warm-up period to preserve stationarity. This means that SMC samplers can be more flexible than MCMC methods.
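To make the comparison concrete, below is a minimal sketch of an SMC sampler with likelihood tempering and a gradient-based (Langevin/MALA) proposal, applied to a toy one-dimensional Gaussian model. The model, the tempering schedule, the step size, and all variable names are illustrative assumptions for this sketch, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: infer a single weight w with prior N(0, 1) and data y_i ~ N(w, 1).
y = rng.normal(loc=1.5, scale=1.0, size=50)

def log_prior(w):
    return -0.5 * w**2                          # up to an additive constant

def log_lik(w):
    return -0.5 * ((y[None, :] - w[:, None])**2).sum(axis=1)

def grad_log_target(w, beta):
    # Gradient of log[prior(w) * likelihood(w)^beta] for this Gaussian model.
    return -w + beta * (y.sum() - y.size * w)

N, K, h = 1000, 20, 0.01                        # particles, temperature steps, step size
betas = np.linspace(0.0, 1.0, K + 1)            # likelihood-tempering schedule
w = rng.normal(size=N)                          # particles initialised from the prior
logw = np.zeros(N)                              # log importance weights

for k in range(1, K + 1):
    # Reweight by the incremental likelihood contribution.
    logw += (betas[k] - betas[k - 1]) * log_lik(w)

    # Systematic resampling when the effective sample size collapses.
    W = np.exp(logw - logw.max()); W /= W.sum()
    if 1.0 / (W**2).sum() < N / 2:
        u = (rng.uniform() + np.arange(N)) / N
        w = w[np.minimum(np.searchsorted(np.cumsum(W), u), N - 1)]
        logw = np.zeros(N)

    # Gradient-based (Langevin/MALA) move targeting the current tempered posterior.
    mu_f = w + 0.5 * h * grad_log_target(w, betas[k])
    prop = mu_f + np.sqrt(h) * rng.normal(size=N)
    mu_r = prop + 0.5 * h * grad_log_target(prop, betas[k])
    log_acc = (log_prior(prop) + betas[k] * log_lik(prop) - 0.5 * (w - mu_r)**2 / h
               - log_prior(w) - betas[k] * log_lik(w) + 0.5 * (prop - mu_f)**2 / h)
    accept = np.log(rng.uniform(size=N)) < log_acc
    w = np.where(accept, prop, w)

W = np.exp(logw - logw.max()); W /= W.sum()
print("posterior mean estimate:", (W * w).sum())   # analytic mean is y.sum() / (y.size + 1)
```

In the paper's setting the particles would be the weights of a (partial) BNN and the gradient would come from backpropagation through the network; note that every line of the loop is vectorised across particles, which is the embarrassingly parallel structure that lets SMC samplers exploit GPUs.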