A statistical theory of out-of-distribution detection
We introduce a principled approach to detecting out-of-distribution (OOD) data by exploiting a connection to data curation. In data curation, ambiguous or difficult-to-classify inputs are excluded from the dataset, and these excluded points are by definition OOD. We can therefore obtain a likelihood for OOD points from a principled generative model of data curation, originally developed to explain the cold-posterior effect in Bayesian neural networks (Aitchison 2020). This model assigns higher OOD probabilities when predictive uncertainty is higher, and it can be trained by maximum likelihood jointly over the in-distribution and OOD points. The approach outperforms past methods that did not provide a probability for OOD points and therefore could not be trained by maximum likelihood.
Feb-24-2021
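The link between predictive uncertainty and OOD probability described in the abstract can be illustrated with a minimal sketch. It assumes one common form of the data-curation model: S hypothetical annotators label each input independently from the classifier's predictive distribution, the input is kept with label y only if all S agree on y, and it is otherwise excluded (treated as OOD). This consensus form, the function names, and the parameter num_annotators are assumptions made for illustration; they are not stated in the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def curation_probs(logits, num_annotators):
    """Return (per-class consensus probabilities, OOD probability).

    Assumed model: each of `num_annotators` annotators draws a label
    independently from softmax(logits). Consensus on class y has
    probability p_y ** S; the remaining mass is "no consensus", i.e. OOD.
    """
    p = softmax(logits)
    per_class = [q ** num_annotators for q in p]
    p_ood = 1.0 - sum(per_class)
    return per_class, p_ood


def neg_log_likelihood(logits, label, num_annotators, eps=1e-12):
    """Negative log-likelihood of one point under the assumed model.

    `label` is a class index for an in-distribution point, or None for
    a point known to be OOD. Summing this over both kinds of points
    gives a joint maximum-likelihood objective.
    """
    per_class, p_ood = curation_probs(logits, num_annotators)
    prob = p_ood if label is None else per_class[label]
    return -math.log(max(prob, eps))


# A confident prediction versus a maximally uncertain one (3 classes, S = 3).
_, ood_confident = curation_probs([10.0, 0.0, 0.0], num_annotators=3)
_, ood_uncertain = curation_probs([0.0, 0.0, 0.0], num_annotators=3)
print(ood_confident < ood_uncertain)
```

Under this assumed form, a flat predictive distribution leaves most of the probability mass on "no consensus", so uncertain inputs receive a high OOD probability, while a sharply peaked distribution leaves almost none.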