KS(conf): A Light-Weight Test if a ConvNet Operates Outside of Its Specifications

Sun, Rémy; Lampert, Christoph H.

arXiv.org Machine Learning 

Computer vision systems for automatic image categorization have become accurate and reliable enough that they can run continuously for days or even years as components of real-world commercial applications. A major open problem in this context, however, is quality control. Good classification performance can only be expected if systems run under the specific conditions, in particular data distributions, that they were trained for. Surprisingly, none of the currently used deep network architectures has built-in functionality that could detect whether a network operates on data from a distribution it was not trained for and, if so, trigger a warning to its human users. In this work, we describe KS(conf), a procedure for detecting such out-of-specs operation. Building on statistical insights, its main step is the application of a classical Kolmogorov-Smirnov test to the distribution of predicted confidence values. We show by extensive experiments using ImageNet, AwA2 and DAVIS data on a variety of ConvNet architectures that KS(conf) reliably detects out-of-specs situations. It furthermore has a number of properties that make it an excellent candidate for practical deployment: it is easy to implement, adds almost no overhead to the system, works with all networks, including pretrained ones, and requires no a priori knowledge about how the data distribution could change.

This work was funded in part by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no 308036. C. H. Lampert, IST Austria, Email: chl@ist.ac.at

1 Introduction

With the emergence of deep convolutional networks (ConvNets), computer vision systems have become accurate and reliable enough to perform tasks of practical relevance autonomously and reliably over long periods of time.

Figure 1: Illustration of within-specifications and outside-of-specifications behavior of a ConvNet (here: VGG19, trained on ILSVRC2012). Predicted labels shown in the figure: "ski", "shovel", "web site", "tennis ball". Left image: prediction on images that the network was trained to recognize. We observe: a standard multi-class network always predicts one of its predefined class labels, even if the current input is distorted, or even completely different, from what it was trained for.

A major concern in our society about automatic decision systems is their reliability: if decisions are made by a trained classifier instead of a person, how can we be sure that the system works reliably now, and that it will continue to do so in the future?
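To make the central idea of the abstract concrete, the following is a minimal sketch of a Kolmogorov-Smirnov test applied to predicted confidence values. It is an illustration under stated assumptions, not the authors' implementation: the function names (ks_statistic, ks_threshold, out_of_specs) are hypothetical, synthetic beta-distributed numbers stand in for the top-class confidences of a real network, the reference sample is treated as if it were the exact within-specs distribution, and the rejection threshold uses the standard asymptotic approximation of the KS critical value. The actual procedure may calibrate and threshold differently.

```python
import bisect
import math
import random


def ks_statistic(batch, reference_sorted):
    """Largest gap between the batch's empirical CDF and the reference CDF."""
    n_ref = len(reference_sorted)
    m = len(batch)
    d = 0.0
    for i, x in enumerate(sorted(batch)):
        # Reference CDF at x: fraction of reference confidences <= x.
        f_ref = bisect.bisect_right(reference_sorted, x) / n_ref
        # The batch's empirical CDF jumps from i/m to (i+1)/m at x.
        d = max(d, abs(f_ref - i / m), abs((i + 1) / m - f_ref))
    return d


def ks_threshold(m, alpha=0.01):
    """Asymptotic critical value (assumption: batch size m is reasonably large)."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0) / m)


def out_of_specs(batch, reference_sorted, alpha=0.01):
    """Flag a batch whose confidence distribution deviates from the reference."""
    return ks_statistic(batch, reference_sorted) > ks_threshold(len(batch), alpha)


# Synthetic stand-ins for network confidences (assumption, not real data).
random.seed(0)
# Confidences collected once on held-out within-specs data, kept sorted.
reference = sorted(random.betavariate(8, 2) for _ in range(10_000))
# A deployment batch drawn from the same distribution.
in_spec_batch = [random.betavariate(8, 2) for _ in range(1_000)]
# A deployment batch whose confidences have shifted toward lower values.
shifted_batch = [random.betavariate(3, 3) for _ in range(1_000)]

print("in-spec batch flagged:", out_of_specs(in_spec_batch, reference))
print("shifted batch flagged:", out_of_specs(shifted_batch, reference))
```

In this sketch the only state kept from training time is the sorted list of reference confidences, and each check costs one sort of the batch plus a binary search per element, which is in line with the abstract's claim that such a test adds almost no overhead and needs no knowledge of how the data distribution might change.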
