Single Layer Predictive Normalized Maximum Likelihood for Out-of-Distribution Detection: Supplementary Material
We use the same notation as in Section 4.2. Denote e_c as a one-hot row vector of the true label. We define the hypothesis set that the genie is allowed to choose from as P_Θ = p_θ(y|x) = 1 2πσ^2 exp 1 2σ^2 y f(x_n^⊤ θ) e_c^⊤

We simulate the response of the pNML regret for two classes (C = 2) and divide it by log C so that the regret is bounded between 0 and 1. Figure 1 shows the regret behaviour for different values of p_1 (the ERM probability assignment of class 1) as a function of x^⊤ g. For an ERM model that is certain of its prediction (p_1 = 0.99, the purple curve), a slight variation of x^⊤ g causes a large response of the regret compared with p_1 = 0.55 and p_1 = 0.85.

Next, we compute the correlation matrix of the training embeddings and perform a singular value decomposition (SVD). For the SVHN training set, most of the energy is located in the first 50 eigenvalues, after which there is a significant decrease of approximately 10^3. The same phenomenon is also seen in Figure 2a, which shows the eigenvalues of the ResNet-40 model.
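The spectrum analysis described above (correlation matrix of the training embeddings followed by an SVD) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `correlation_spectrum` and `energy_fraction` are hypothetical, the 1/N normalization of the correlation matrix is an assumption, and the embeddings are synthetic low-rank data standing in for real penultimate-layer features.

```python
import numpy as np


def correlation_spectrum(embeddings):
    """Eigenvalues (descending) of the empirical correlation matrix X^T X / N.

    The 1/N scaling is an assumption; it rescales the spectrum but does not
    change its shape.
    """
    X = np.asarray(embeddings, dtype=np.float64)
    n_samples = X.shape[0]
    corr = X.T @ X / n_samples
    # corr is symmetric positive semi-definite, so its singular values
    # coincide with its eigenvalues. NumPy returns them in descending order.
    return np.linalg.svd(corr, compute_uv=False)


def energy_fraction(eigenvalues, k):
    """Fraction of the total spectral energy held by the k largest eigenvalues."""
    return eigenvalues[:k].sum() / eigenvalues.sum()


# Synthetic stand-in for training embeddings (hypothetical data):
# a rank-50 signal in 128 dimensions plus a small isotropic perturbation.
rng = np.random.default_rng(0)
n_samples, dim, rank = 2000, 128, 50
basis = rng.standard_normal((rank, dim))
signal = rng.standard_normal((n_samples, rank)) @ basis
X = signal + 1e-3 * rng.standard_normal((n_samples, dim))

eigs = correlation_spectrum(X)
print("energy in the first", rank, "eigenvalues:", energy_fraction(eigs, rank))
print("drop after eigenvalue", rank, ":", eigs[rank - 1] / eigs[rank])
```

With real embeddings, `X` would be replaced by the N x M matrix of training features, and the index at which the eigenvalues drop sharply would be read off the returned spectrum.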
Single Layer Predictive Normalized Maximum Likelihood for Out-of-Distribution Detection
Detecting out-of-distribution (OOD) samples is vital for developing machine-learning-based models for critical safety systems. Common approaches for OOD detection assume access to some OOD samples during training, which may not be available in a real-life scenario. Instead, we utilize the {\em predictive normalized maximum likelihood} (pNML) learner, in which no assumptions are made on the tested input. We derive an explicit expression of the pNML and its generalization error, denoted as the regret, for a single layer neural network (NN). We show that this learner generalizes well when (i) the test vector resides in a subspace spanned by the eigenvectors associated with the large eigenvalues of the empirical correlation matrix of the training data, or (ii) the test sample is far from the decision boundary. Furthermore, we describe how to efficiently apply the derived pNML regret to any pretrained deep NN, by employing the explicit pNML for the last layer, followed by the softmax function. Applying the derived regret to a deep NN requires neither additional tunable parameters nor extra data. We extensively evaluate our approach on 74 OOD detection benchmarks using DenseNet-100, ResNet-34, and WideResNet-40 models trained with CIFAR-100, CIFAR-10, SVHN, and ImageNet-30, showing a significant improvement of up to 15.6% over recent leading methods.