Improving Bayesian Inference in Deep Neural Networks with Variational Structured Dropout
Son Nguyen, Duong Nguyen, Khai Nguyen, Nhat Ho, Khoat Than, Hung Bui
Bayesian Neural Networks (BNNs) [37, 47] offer a probabilistic interpretation of deep learning models by imposing a prior distribution on the weight parameters and aiming to obtain a posterior distribution rather than only point estimates. By marginalizing over this posterior for prediction, BNNs perform a form of ensemble learning. These principles help the model generalize better, improve robustness, and allow for uncertainty quantification. However, computing the exact posterior of a non-linear Bayesian network is infeasible, so approximate inference methods have been devised. The core challenge is to construct an expressive approximation to the true posterior while maintaining computational efficiency and scalability, especially for modern deep learning architectures. Variational inference is a popular deterministic approximation approach to this challenge. The first practical methods were proposed in [15, 5, 28], in which the approximate posterior is assumed to be a fully factorized distribution, an approach also known as mean-field variational inference. The mean-field family offers several advantages for inference, including computational tractability and effective optimization with stochastic gradient-based methods. However, it ignores strong statistical dependencies among the random weights of the neural network, which leads to an inability to capture the complicated structure of the true posterior and to estimate true model uncertainty.
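To make the mean-field setup concrete, the sketch below (in PyTorch, not taken from the paper) shows a Bayesian linear layer with a fully factorized Gaussian posterior over its weights, trained via the reparameterization trick. The class name MeanFieldLinear and the prior_std parameter are illustrative assumptions; note that the per-weight independence in this construction is exactly what prevents the approximation from capturing correlations in the true posterior.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MeanFieldLinear(nn.Module):
    """Linear layer with a fully factorized Gaussian posterior over weights and biases."""

    def __init__(self, in_features, out_features, prior_std=1.0):
        super().__init__()
        # Variational parameters: one mean and one (pre-softplus) scale per weight,
        # i.e. no cross-weight correlations are modeled.
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -5.0))
        self.prior_std = prior_std

    def forward(self, x):
        # Reparameterization trick: w = mu + sigma * eps, eps ~ N(0, I).
        w_sigma = F.softplus(self.w_rho)
        b_sigma = F.softplus(self.b_rho)
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, w, b)

    def kl(self):
        # Closed-form KL between the factorized Gaussian posterior and a N(0, prior_std^2) prior.
        def kl_term(mu, sigma):
            return (torch.log(self.prior_std / sigma)
                    + (sigma ** 2 + mu ** 2) / (2 * self.prior_std ** 2) - 0.5).sum()
        return kl_term(self.w_mu, F.softplus(self.w_rho)) + kl_term(self.b_mu, F.softplus(self.b_rho))
```

Training such a layer typically minimizes the negative evidence lower bound, i.e. the expected negative log-likelihood of a minibatch plus the layer's kl() term scaled by the number of minibatches.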
Feb-15-2021