
Visualizing Image Content to Explain Novel Image Discovery

arXiv.org Machine Learning

The initial analysis of any large data set can be divided into two phases: (1) the identification of common trends or patterns and (2) the identification of anomalies or outliers that deviate from those trends. We focus on the goal of detecting observations with novel content, which can alert us to artifacts in the data set or, potentially, the discovery of previously unknown phenomena. To aid in interpreting and diagnosing the novel aspect of these selected observations, we recommend the use of novelty detection methods that generate explanations. In the context of large image data sets, these explanations should highlight what aspect of a given image is new (color, shape, texture, content) in a human-comprehensible form. We propose DEMUD-VIS, the first method for providing visual explanations of novel image content by employing a convolutional neural network (CNN) to extract image features, a method that uses reconstruction error to detect novel content, and an up-convolutional network to convert CNN feature representations back into image space. We demonstrate this approach on diverse images from ImageNet, freshwater streams, and the surface of Mars.
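
As a rough illustration of the pipeline this abstract describes, the sketch below (not the authors' released code) extracts CNN features with a pretrained torchvision ResNet-18 and scores each image by its reconstruction error under a low-rank model of previously seen images. The model choice, layer, and preprocessing are illustrative assumptions, and the up-convolutional network that maps feature residuals back into image space is not reproduced here.

# Sketch: CNN feature extraction plus reconstruction-error novelty scoring.
# Assumptions: torchvision >= 0.13 weights API; ResNet-18 penultimate features.
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Pretrained CNN with the classifier head removed, used as a feature extractor.
cnn = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
cnn.fc = torch.nn.Identity()
cnn.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_features(image_paths):
    """Return an (N x D) matrix of CNN feature vectors for the given images."""
    feats = []
    with torch.no_grad():
        for p in image_paths:
            x = preprocess(Image.open(p).convert("RGB")).unsqueeze(0)
            feats.append(cnn(x).squeeze(0).numpy())
    return np.stack(feats)

def novelty_scores(feats, basis, mean):
    """Reconstruction error of each feature vector under a low-rank model
    (basis: D x k) of previously seen images. The residual (R - recon) is
    what a feature-inverting network would render back into image space
    as the visual explanation."""
    R = feats - mean
    recon = R @ basis @ basis.T
    return np.linalg.norm(R - recon, axis=1)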


Guiding Scientific Discovery with Explanations Using DEMUD

AAAI Conferences

In the era of large scientific data sets, there is an urgent need for methods to automatically prioritize data for review. At the same time, for any automated method to be adopted by scientists, it must make decisions that they can understand and trust. In this paper, we propose Discovery through Eigenbasis Modeling of Uninteresting Data (DEMUD), which uses principal components modeling and reconstruction error to prioritize data. DEMUD’s major advance is to offer domain-specific explanations for its prioritizations. We evaluated DEMUD’s ability to quickly identify diverse items of interest and the value of the explanations it provides. We found that DEMUD performs as well or better than existing class discovery methods and provides, uniquely, the first explanations for why those items are of interest. Further, in collaborations with planetary scientists, we found that DEMUD (1) quickly identifies very rare items of scientific value, (2) maintains high diversity in its selections, and (3) provides explanations that greatly improve human classification accuracy.
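
A minimal sketch of the selection loop this abstract describes, assuming the data is already a NumPy matrix of feature vectors; it is not the authors' implementation, and the seeding rule, component count, and use of the raw residual as the explanation are simplifying assumptions.

# Sketch: DEMUD-style prioritization via principal components and
# reconstruction error, with the residual vector as the explanation.
import numpy as np

def demud_sketch(X, k=5, n_select=10):
    """Iteratively select the item with the highest reconstruction error
    under a low-rank model of everything selected ("uninteresting") so far."""
    remaining = list(range(X.shape[0]))
    # Seed with the item farthest from the global mean (one common heuristic).
    seed = int(np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1)))
    remaining.remove(seed)
    selected, explanations, model = [seed], [], [seed]

    for _ in range(n_select - 1):
        S = X[model]
        mu = S.mean(axis=0)
        # Low-rank basis of the already-seen data.
        _, _, Vt = np.linalg.svd(S - mu, full_matrices=False)
        U = Vt[:k].T                                 # (features x k)
        R = X[remaining] - mu
        recon = R @ U @ U.T                          # project and reconstruct
        errors = np.linalg.norm(R - recon, axis=1)
        best = int(np.argmax(errors))                # most novel remaining item
        idx = remaining.pop(best)
        selected.append(idx)
        explanations.append(R[best] - recon[best])   # residual = explanation
        model.append(idx)
    return selected, explanations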


Invited Talks

AAAI Conferences

Abstracts of the invited talks presented at the AAAI Fall Symposium on Discovery Informatics: AI Takes a Science-Centered View on Big Data. Talks include A Data Lifecycle Approach to Discovery Informatics; Generating Biomedical Hypotheses Using Semantic Web Technologies; Socially Intelligent Science; Representing and Reasoning with Experimental and Quasi-Experimental Designs; Bioinformatics Computation of Metabolic Models from Sequenced Genomes; Climate Informatics: Recent Advances and Challenge Problems for Machine Learning in Climate Science; Predictive Modeling of Patient State and Therapy Optimization; Case Studies in Data-Driven Systems: Building Carbon Maps to Finding Neutrinos; Computational Analysis of Complex Human Disorders; and Look at This Gem: Automated Data Prioritization for Scientific Discovery of Exoplanets, Mineral Deposits, and More.


Visual Explanation by Interpretation: Improving Visual Feedback Capabilities of Deep Neural Networks

arXiv.org Machine Learning

Learning-based representations have become the de facto means to address computer vision tasks. Despite their massive adoption, the amount of work aiming at understanding the internal representations learned by these models is rather limited. Existing methods aimed at model interpretation either require exhaustive manual inspection of visualizations, or link internal network activations with external "possibly useful" annotated concepts. We propose an intermediate scheme in which, given a pretrained model, we automatically identify internal features relevant for the set of classes considered by the model, without requiring additional annotations. We interpret the model through average visualizations of these features. Then, at test time, we explain the network prediction by accompanying the predicted class label with supporting heatmap visualizations derived from the identified relevant features. In addition, we propose a method to address the artifacts introduced by strided operations in deconvnet-based visualizations. Our evaluation on the MNIST, ILSVRC 12 and Fashion 144k datasets quantitatively shows that the proposed method is able to identify relevant internal features for the classes of interest while improving the quality of the produced visualizations.
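
The sketch below is a much-simplified stand-in for the approach described above (not the paper's implementation): it ranks the channels of one convolutional stage of a pretrained torchvision VGG-16 by their mean activation over images of a class, then renders a chosen channel's spatial response as a heatmap to accompany a prediction. The model, the layer, and the activation-averaging criterion are illustrative assumptions, and the paper's handling of strided-operation artifacts is not addressed here.

# Sketch: rank internal feature channels per class and render one channel's
# spatial activation as a supporting heatmap. Assumes torchvision >= 0.13.
import torch
import torch.nn.functional as F
import torchvision.models as models

# Convolutional part of a pretrained VGG-16, used as the feature backbone.
backbone = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()

def relevant_channels(class_images, top_k=5):
    """class_images: tensor (N, 3, 224, 224) of preprocessed images from one
    class. Returns the indices of the channels with the highest mean activation,
    a crude proxy for 'relevant internal features' for that class."""
    with torch.no_grad():
        acts = backbone(class_images)            # (N, C, H, W)
    per_channel = acts.mean(dim=(0, 2, 3))       # average response per channel
    return per_channel.topk(top_k).indices

def heatmap_for(image, channel):
    """Spatial activation of one channel, normalized and upsampled to the
    input resolution, to accompany the predicted label as visual support."""
    with torch.no_grad():
        act = backbone(image.unsqueeze(0))[0, channel]       # (H, W)
    act = (act - act.min()) / (act.max() - act.min() + 1e-8)
    return F.interpolate(act[None, None], size=tuple(image.shape[-2:]),
                         mode="bilinear", align_corners=False)[0, 0]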