Improving Bayesian Inference in Deep Neural Networks with Variational Structured Dropout
Son Nguyen, Duong Nguyen, Khai Nguyen, Nhat Ho, Khoat Than, Hung Bui
Bayesian Neural Networks (BNNs) [37, 47] offer a probabilistic interpretation of deep learning models by imposing a prior distribution on the weight parameters and aiming to obtain a posterior distribution rather than only point estimates. By marginalizing over this posterior for prediction, BNNs perform a form of ensemble learning. These principles help the model generalize better, improve robustness, and allow for uncertainty quantification. However, computing the exact posterior of a non-linear Bayesian network is infeasible, so approximate inference methods have been devised. The core challenge is to construct an expressive approximation to the true posterior while maintaining computational efficiency and scalability, especially for modern deep learning architectures. Variational inference is a popular deterministic approximation approach to this challenge. The first practical methods were proposed in [15, 5, 28], in which the approximate posterior is assumed to be a fully factorized distribution, an approach also known as mean-field variational inference. The mean-field family offers several advantages for inference, including computational tractability and effective optimization with stochastic gradient-based methods. However, it ignores strong statistical dependencies among the random weights of the neural network, which leads to an inability to capture the complicated structure of the true posterior and to estimate true model uncertainty.
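To make the mean-field setup concrete, the sketch below (in PyTorch, not taken from the paper) shows a Bayesian linear layer with a fully factorized Gaussian posterior over its weights, trained via the reparameterization trick. The class name MeanFieldLinear and the prior_std parameter are illustrative assumptions; note that the per-weight independence in this construction is exactly what prevents the approximation from capturing correlations in the true posterior.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MeanFieldLinear(nn.Module):
    """Linear layer with a fully factorized Gaussian posterior over weights and biases."""

    def __init__(self, in_features, out_features, prior_std=1.0):
        super().__init__()
        # Variational parameters: one mean and one (pre-softplus) scale per weight,
        # i.e. no cross-weight correlations are modeled.
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -5.0))
        self.prior_std = prior_std

    def forward(self, x):
        # Reparameterization trick: w = mu + sigma * eps, eps ~ N(0, I).
        w_sigma = F.softplus(self.w_rho)
        b_sigma = F.softplus(self.b_rho)
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, w, b)

    def kl(self):
        # Closed-form KL between the factorized Gaussian posterior and a N(0, prior_std^2) prior.
        def kl_term(mu, sigma):
            return (torch.log(self.prior_std / sigma)
                    + (sigma ** 2 + mu ** 2) / (2 * self.prior_std ** 2) - 0.5).sum()
        return kl_term(self.w_mu, F.softplus(self.w_rho)) + kl_term(self.b_mu, F.softplus(self.b_rho))
```

Training such a layer typically minimizes the negative evidence lower bound, i.e. the expected negative log-likelihood of a minibatch plus the layer's kl() term scaled by the number of minibatches.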
Feb-15-2021