Fake It Till You Make It: Towards Accurate Near-Distribution Novelty Detection

Hossein Mirzaei, Mohammadreza Salehi, Sajjad Shahabi, Efstratios Gavves, Cees G. M. Snoek, Mohammad Sabokrou, Mohammad Hossein Rohban

Nov-28-2022–arXiv.org Artificial Intelligence

We address image-based novelty detection. Despite considerable progress, existing models either fail or suffer a dramatic drop in performance under the so-called "near-distribution" setting, where the differences between normal and anomalous samples are subtle. We first demonstrate that existing methods experience up to a 20% decrease in performance in the near-distribution setting. Next, we propose to exploit a score-based generative model to produce synthetic near-distribution anomalous data. Our model is then fine-tuned to distinguish such data from the normal samples. We provide a quantitative as well as qualitative evaluation of this strategy, and compare the results with a variety of GAN-based models. We assess the effectiveness of our method on both near-distribution and standard novelty detection through extensive experiments on datasets from diverse applications such as medical images, object classification, and quality control. These experiments show that our method considerably improves over existing models and consistently narrows the gap between near-distribution and standard novelty detection performance.

In novelty detection (ND), samples that do not come from the training distribution are called anomalous, while the training set is referred to as normal. Only normal data is available during training in ND. Recently, PANDA (33) and CSI (43) have considerably pushed the state of the art and achieved more than 90% area under the receiver operating characteristic curve (AUROC) on the CIFAR-10 dataset (22) in the ND task, where one class is assumed to be normal and the rest are considered anomalous. However, as we will show empirically, these methods struggle to achieve similar performance in situations where outliers are semantically close to the normal distribution, e.g. … In the literature, novelty detection and anomaly detection are used interchangeably.
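The one-class evaluation protocol mentioned above (one class treated as normal, all others as anomalous, scored by AUROC) can be illustrated with a small sketch. This is not the authors' code: the function names `auroc` and `one_class_labels`, the toy class ids, and the toy anomaly scores are all hypothetical, and AUROC is computed here through its pairwise-ranking interpretation rather than with any particular library.

```python
from typing import List, Sequence


def one_class_labels(class_ids: Sequence[int], normal_class: int) -> List[int]:
    """Map class ids to binary labels: 0 for the normal class, 1 for anomalous."""
    return [0 if c == normal_class else 1 for c in class_ids]


def auroc(scores: Sequence[float], is_anomaly: Sequence[int]) -> float:
    """AUROC as the probability that a randomly chosen anomalous sample
    receives a higher anomaly score than a randomly chosen normal sample.
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, is_anomaly) if y == 1]
    neg = [s for s, y in zip(scores, is_anomaly) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy test set (hypothetical): class 0 is normal, every other class is anomalous.
class_ids = [0, 0, 0, 1, 2, 3]
# Hypothetical anomaly scores from some detector (higher = more anomalous).
scores = [0.1, 0.2, 0.6, 0.5, 0.8, 0.9]

labels = one_class_labels(class_ids, normal_class=0)
result = auroc(scores, labels)
print(f"AUROC = {result:.3f}")
```

In this toy case one normal sample (score 0.6) outranks one anomalous sample (score 0.5), so eight of the nine normal/anomalous pairs are ordered correctly. In a near-distribution setting, more such inversions would be expected, which is what lowers the AUROC.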

artificial intelligence, data mining, machine learning

Genre:
- Research Report (0.82)

Industry:
- Health & Medicine > Diagnostic Medicine
  - Imaging (0.34)
- Transportation (0.32)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning
    - Performance Analysis > Accuracy (0.34)
  - Data Science > Data Mining
    - Anomaly Detection (1.00)
