

A Dirichlet Distribution Computations. A.1 Dirichlet distribution: The Dirichlet distribution with concentration parameters α = (α_1, …, α_K) …

Neural Information Processing Systems

The novel Bayesian loss described in formula 7 can be computed in closed form. For vector datasets, all models share an architecture of 3 linear layers with ReLU activation. For PostNet, we used 1D batch normalization after the encoder. All metrics have been scaled by 100, so all scores lie in [0, 100] instead of [0, 1].
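For reference, the standard Dirichlet density behind the computations named above, together with the standard identity that gives such Bayesian losses a closed form (ψ is the digamma function; this is the textbook identity, not a transcription of the paper's formula 7):

```latex
\mathrm{Dir}(\mathbf{p} \mid \boldsymbol{\alpha})
  = \frac{\Gamma\big(\sum_{k=1}^{K} \alpha_k\big)}{\prod_{k=1}^{K} \Gamma(\alpha_k)}
    \prod_{k=1}^{K} p_k^{\alpha_k - 1},
\qquad
\mathbb{E}_{\mathbf{p} \sim \mathrm{Dir}(\boldsymbol{\alpha})}\big[-\log p_y\big]
  = \psi(\alpha_0) - \psi(\alpha_y),
\quad \alpha_0 = \sum_{k=1}^{K} \alpha_k .
```

A minimal PyTorch sketch of the vector-dataset setup described above. The hidden and latent sizes are illustrative assumptions (the text fixes only the 3 linear layers, the ReLU activations, and PostNet's 1D batch normalization after the encoder), and `expected_cross_entropy` implements the standard closed-form identity above rather than the paper's exact loss:

```python
import torch
import torch.nn as nn

class VectorEncoder(nn.Module):
    """Encoder for vector datasets: 3 linear layers with ReLU activations."""

    def __init__(self, in_dim: int, hidden: int = 64, latent: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, latent),
        )
        # PostNet variant: 1D batch normalization after the encoder.
        self.bn = nn.BatchNorm1d(latent)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.bn(self.net(x))

def expected_cross_entropy(alpha: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Closed-form E_{p ~ Dir(alpha)}[-log p_y] = psi(alpha_0) - psi(alpha_y)."""
    alpha_0 = alpha.sum(dim=-1)                              # (B,)
    alpha_y = alpha.gather(-1, y.unsqueeze(-1)).squeeze(-1)  # (B,)
    return (torch.digamma(alpha_0) - torch.digamma(alpha_y)).mean()
```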


Neural Information Processing Systems

Based on R1's comments, we also evaluated the models based on mutual information. Theoretically, the two metrics bring similar information [C]. For these reasons, we decided to use APR. We attribute the strong performance of PostNet to the dim. Similar conclusions have been drawn in [E]. In our paper we use 5 random train/validation/test splits (60%, 20%, 20%).
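To make the APR metric mentioned above concrete, here is a minimal sketch using scikit-learn's `average_precision_score`; the OOD-as-positive convention and the scaling to [0, 100] follow the text, but the function and toy data are illustrative assumptions, not the authors' evaluation code:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def ood_apr(id_uncertainty: np.ndarray, ood_uncertainty: np.ndarray) -> float:
    """Area under the precision-recall curve (APR) for OOD detection.

    OOD samples form the positive class; a good model assigns them
    higher uncertainty than in-distribution (ID) samples.
    """
    labels = np.concatenate([np.zeros(len(id_uncertainty)),   # ID  -> 0
                             np.ones(len(ood_uncertainty))])  # OOD -> 1
    scores = np.concatenate([id_uncertainty, ood_uncertainty])
    return 100.0 * average_precision_score(labels, scores)    # scaled to [0, 100]

# Toy check: OOD uncertainties drawn higher than ID ones yield a high APR.
rng = np.random.default_rng(0)
print(ood_apr(rng.uniform(0.0, 0.5, 1000), rng.uniform(0.4, 1.0, 200)))
```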


Natural Posterior Network: Deep Bayesian Predictive Uncertainty for Exponential Family Distributions

Charpentier, Bertrand, Borchert, Oliver, Zügner, Daniel, Geisler, Simon, Günnemann, Stephan

arXiv.org Machine Learning

Uncertainty awareness is crucial to develop reliable machine learning models. In this work, we propose the Natural Posterior Network (NatPN) for fast and high-quality uncertainty estimation for any task where the target distribution belongs to the exponential family. Thus, NatPN finds application for both classification and general regression settings. Unlike many previous approaches, NatPN does not require out-of-distribution (OOD) data at training time. Instead, it leverages Normalizing Flows to fit a single density on a learned low-dimensional and task-dependent latent space. For any input sample, NatPN uses the predicted likelihood to perform a Bayesian update over the target distribution. Theoretically, NatPN assigns high uncertainty far away from training data. Empirically, our extensive experiments on calibration and OOD detection show that NatPN delivers highly competitive performance for classification, regression and count prediction tasks.
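A sketch of the per-input Bayesian update described above, specialized to classification (Dirichlet prior over a categorical target); the evidence scaling, the function name, and the exact form of the pseudo-count are illustrative assumptions, not NatPN's released implementation:

```python
import torch

def natpn_posterior_update(
    alpha_prior: torch.Tensor,    # (K,) prior Dirichlet concentration
    class_probs: torch.Tensor,    # (B, K) predicted class probabilities
    log_density: torch.Tensor,    # (B,) log p(z) from the normalizing flow
    evidence_scale: float = 1e4,  # "certainty budget"; illustrative value
) -> torch.Tensor:
    """Input-dependent conjugate update: alpha_post = alpha_prior + n(x) * p_hat.

    The evidence n(x) is proportional to the flow density at the latent
    code z(x). Far from the training data, p(z) -> 0, so alpha_post falls
    back to alpha_prior: the "high uncertainty far away" behavior.
    """
    n = evidence_scale * log_density.exp()               # pseudo-counts, (B,)
    return alpha_prior + n.unsqueeze(-1) * class_probs   # (B, K)
```

For regression or count prediction the same update applies to the conjugate prior of the chosen exponential-family likelihood (e.g. a Normal-Inverse-Gamma prior for a Normal target), which is what lets a single architecture cover classification, regression, and count tasks.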