Posterior Refinement Improves Sample Efficiency in Bayesian Neural Networks

Appendix A Derivation of the Multi-Class Probit Approximation

Neural Information Processing Systems 

This derivation, based on Lu et al., first appeared in the first author's blog post [53].

For the HMC baseline, we use the default implementation of NUTS in Pyro. For the MAP, VB, and CSGHMC baselines, we use the same settings as Daxberger et al. The diagonal Hessian is used for CIFAR-100 and all-layer F-MNIST, while the full Hessian is used in all other cases.
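The excerpt names the multi-class probit approximation but does not reproduce its formula. As a rough illustration, the sketch below assumes the commonly stated form, in which the expected softmax under a Gaussian over logits with mean mu and per-class variance sigma^2 is approximated by a softmax of mu / sqrt(1 + (pi/8) sigma^2). The function name `probit_softmax` and its arguments are hypothetical, and the appendix's exact statement may differ.

```python
import math


def probit_softmax(mean, var):
    """Approximate E[softmax(z)] for z ~ N(mean, diag(var)).

    Assumed form (not taken from the excerpt): each logit mean is divided by
    sqrt(1 + (pi / 8) * variance) before a standard softmax is applied.
    """
    scaled = [m / math.sqrt(1.0 + math.pi / 8.0 * v) for m, v in zip(mean, var)]
    # Subtract the largest scaled logit for numerical stability.
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: the same logit means with no uncertainty and with high uncertainty.
confident = probit_softmax([2.0, 0.0, -1.0], [0.0, 0.0, 0.0])
uncertain = probit_softmax([2.0, 0.0, -1.0], [10.0, 10.0, 10.0])
```

With zero variance the function reduces to the ordinary softmax. Larger variances shrink the logits toward zero, so the predicted distribution moves toward uniform.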
