Neural Information Processing Systems
While typical human accuracy is often known for classification tasks, no comparable benchmark exists for near-OOD detection. We therefore measured human performance on the task of distinguishing CIFAR-100 from CIFAR-10. The average AUROC, weighted by the number of images in each trial, is 95%. Figure 9 shows the outlier scores on the near-OOD task. Intuitively, the embeddings obtained by fine-tuning a pre-trained transformer are well clustered, so just a handful of known outliers can significantly improve OOD detection.
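The trial-weighted average AUROC mentioned above can be illustrated with a minimal sketch. The text does not give the per-trial data or the exact procedure, so the function names `auroc` and `weighted_mean_auroc`, the toy scores, and the convention that label 1 marks the out-of-distribution class are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the fraction of (OOD, in-distribution) pairs in which the
    OOD item (label 1) receives the higher outlier score; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("each trial needs at least one item of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def weighted_mean_auroc(
    trials: List[Tuple[Sequence[float], Sequence[int]]]
) -> float:
    """Average the per-trial AUROC, weighting each trial by its image count."""
    total_images = sum(len(labels) for _, labels in trials)
    weighted_sum = sum(auroc(s, y) * len(y) for s, y in trials)
    return weighted_sum / total_images


# Hypothetical toy data: (outlier scores, labels) per trial, NOT real results.
trials = [
    ([0.7, 0.3], [1, 0]),                 # 2 images, AUROC 1.0
    ([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]), # 4 images, AUROC 0.75
]
result = weighted_mean_auroc(trials)
print(f"{result:.4f}")  # prints 0.8333
```

Weighting by image count means that larger trials contribute more to the final figure: in the toy data the weighted mean is 5/6, whereas an unweighted mean of the two per-trial AUROCs would be 0.875.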
Nov-13-2025, 22:37:34 GMT