Explainable Deep One-Class Classification
Liznerski, Philipp, Ruff, Lukas, Vandermeulen, Robert A., Franks, Billy Joe, Kloft, Marius, Müller, Klaus-Robert
Deep one-class classification variants for anomaly detection learn a mapping that concentrates nominal samples in feature space, causing anomalies to be mapped away. Because this transformation is highly non-linear, finding interpretations poses a significant challenge. In this paper we present an explainable deep one-class classification method, Fully Convolutional Data Description (FCDD), where the mapped samples themselves serve as an explanation heatmap. FCDD yields competitive detection performance and provides reasonable explanations on common anomaly detection benchmarks with CIFAR-10 and ImageNet. On MVTec-AD, a recent manufacturing dataset offering ground-truth anomaly maps, FCDD sets a new state of the art in the unsupervised setting. Our method can incorporate ground-truth anomaly maps during training, and using even a few of these (~5) improves performance significantly. Finally, using FCDD's explanations we demonstrate the vulnerability of deep one-class classification models to spurious image features such as image watermarks.
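A minimal sketch of an FCDD-style setup, assuming a tiny fully convolutional backbone and the pseudo-Huber/one-class objective as described in the paper; layer sizes, constants, and data are purely illustrative, not the authors' implementation.

```python
# Sketch (assumptions labeled): a fully convolutional net whose output is a
# spatial map, trained with an FCDD-style pseudo-Huber one-class loss. The
# architecture and hyperparameters here are toy placeholders.
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    """Fully convolutional net: the output is a spatial map, not a vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),   # 1-channel output map
        )

    def forward(self, x):
        return self.net(x)

def fcdd_style_loss(out_map, y):
    """y = 0 for nominal samples, 1 for anomalous samples."""
    a = torch.sqrt(out_map ** 2 + 1) - 1      # pseudo-Huber per pixel
    a = a.flatten(1).mean(dim=1)              # average map score per sample
    nominal = a                               # pull nominal maps towards zero
    anomalous = -torch.log(-torch.expm1(-a) + 1e-9)  # push anomalies away
    return torch.where(y.bool(), anomalous, nominal).mean()

model = TinyFCN()
x = torch.randn(8, 3, 32, 32)                 # toy batch
y = torch.randint(0, 2, (8,))
loss = fcdd_style_loss(model(x), y)
loss.backward()
# The (transformed) output map of a sample doubles as its explanation heatmap.
```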
A Unifying Review of Deep and Shallow Anomaly Detection
Ruff, Lukas, Kauffmann, Jacob R., Vandermeulen, Robert A., Montavon, Grégoire, Samek, Wojciech, Kloft, Marius, Dietterich, Thomas G., Müller, Klaus-Robert
Deep learning approaches to anomaly detection have recently improved the state of the art in detection performance on complex datasets such as large collections of images or text. These results have sparked a renewed interest in the anomaly detection problem and led to the introduction of a great variety of new methods. With the emergence of numerous such methods, including approaches based on generative models, one-class classification, and reconstruction, there is a growing need to bring methods of this field into a systematic and unified perspective. In this review we aim to identify the common underlying principles as well as the assumptions that are often made implicitly by various methods. In particular, we draw connections between classic 'shallow' and novel deep approaches and show how this relation might cross-fertilize or extend both directions. We further provide an empirical assessment of major existing methods that is enriched by the use of recent explainability techniques, and present specific worked-through examples together with practical advice. Finally, we outline critical open challenges and identify specific paths for future research in anomaly detection.
Langevin Cooling for Domain Translation
Srinivasan, Vignesh, Müller, Klaus-Robert, Samek, Wojciech, Nakajima, Shinichi
Domain translation is the task of finding correspondences between two domains. Several Deep Neural Network (DNN) models, e.g., CycleGAN and cross-lingual language models, have shown remarkable success on this task in the unsupervised setting, where the mappings between the domains are learned from two independent sets of training data (without paired samples). However, these methods typically do not perform well on a significant proportion of test samples. In this paper, we hypothesize that many such unsuccessful samples lie at the fringe of the data distribution, i.e., in relatively low-density areas where the DNN was not trained well, and propose to apply Langevin dynamics to move such fringe samples towards high-density areas. We demonstrate qualitatively and quantitatively that our strategy, called Langevin Cooling (L-Cool), enhances state-of-the-art methods in image translation and language translation tasks.
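A toy sketch of the underlying mechanism, Langevin dynamics, that moves a low-density ("fringe") sample towards high-density regions; the score function here is an assumed standard Gaussian placeholder, whereas L-Cool obtains it from the learned model.

```python
# Sketch of Langevin dynamics: noisy gradient ascent on log p(x). The score
# below is a toy stand-in (standard Gaussian); in L-Cool it would come from
# the trained model.
import numpy as np

def score(x):
    # d/dx log N(x; 0, I) = -x   (placeholder for a learned score)
    return -x

def langevin_steps(x, step=0.05, n_steps=100, rng=None):
    rng = rng or np.random.default_rng(0)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + 0.5 * step * score(x) + np.sqrt(step) * noise
    return x

fringe_sample = np.array([4.0, -3.5])   # far from the high-density region
cooled = langevin_steps(fringe_sample)
print(cooled)                            # moved towards the origin on average
```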
Fairwashing Explanations with Off-Manifold Detergent
Anders, Christopher J., Pasliev, Plamen, Dombrowski, Ann-Kathrin, Müller, Klaus-Robert, Kessel, Pan
Explanation methods promise to make black-box classifiers more transparent. As a result, it is hoped that they can serve as evidence of a sensible, fair, and trustworthy decision-making process of the algorithm and thereby increase its acceptance by end-users. In this paper, we show both theoretically and experimentally that these hopes are presently unfounded. Specifically, we show that, for any classifier $g$, one can always construct another classifier $\tilde{g}$ which has the same behavior on the data (same train, validation, and test error) but has arbitrarily manipulated explanation maps. We derive this statement theoretically using differential geometry and demonstrate it experimentally for various explanation methods, architectures, and datasets. Motivated by our theoretical insights, we then propose a modification of existing explanation methods which makes them significantly more robust.
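A toy numerical illustration of the core idea (not the paper's differential-geometric construction): if the data lie on a low-dimensional manifold, adding a term that vanishes on the manifold leaves predictions on the data unchanged while altering gradient-based explanations arbitrarily. Here the "manifold" is simply the x1-axis of a 2D input space.

```python
# Toy illustration: g and g_tilde agree exactly on the data (which satisfies
# x2 == 0), yet their gradient explanations differ by an arbitrary amount c
# in the off-manifold direction.
import numpy as np

rng = np.random.default_rng(0)
X = np.stack([rng.normal(size=200), np.zeros(200)], axis=1)  # data: x2 == 0

w = np.array([1.0, 0.5])
g = lambda x: x @ w                        # original (linear) classifier
c = 100.0                                  # arbitrary manipulation strength
g_tilde = lambda x: x @ w + c * x[..., 1]  # extra term is zero on the data

print(np.allclose(g(X), g_tilde(X)))       # True: identical behavior on data
print(w, w + np.array([0.0, c]))           # gradient "explanations" differ wildly
```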
The Clever Hans Effect in Anomaly Detection
Kauffmann, Jacob, Ruff, Lukas, Montavon, Grégoire, Müller, Klaus-Robert
The 'Clever Hans' effect occurs when a learned model produces correct predictions based on the 'wrong' features. This effect, which undermines the generalization capability of an ML model and goes undetected by standard validation techniques, has frequently been observed in supervised learning, where the training algorithm leverages spurious correlations in the data. The question of whether the Clever Hans effect also occurs in unsupervised learning, and in which form, has so far received almost no attention. This paper therefore contributes an explainable AI (XAI) procedure that can highlight the relevant features used by popular anomaly detection models of different types. Our analysis reveals that the Clever Hans effect is widespread in anomaly detection and occurs in many (unexpected) forms. Interestingly, the observed Clever Hans effects are in this case not so much due to the data as to the anomaly detection models themselves, whose structure makes them unable to detect the truly relevant features, even though vast amounts of data points are available. Overall, our work is a warning against the unrestrained use of existing anomaly detection models in practical applications, but it also points to a possible way out of the Clever Hans dilemma: allowing multiple anomaly detection models to mutually cancel their individual structural weaknesses and jointly produce a better and more trustworthy anomaly detector.
How Much Can I Trust You? -- Quantifying Uncertainties in Explaining Neural Networks
Bykov, Kirill, Höhne, Marina M. -C., Müller, Klaus-Robert, Nakajima, Shinichi, Kloft, Marius
Explainable AI (XAI) aims to provide interpretations for predictions made by learning machines, such as deep neural networks, in order to make the machines more transparent for the user and, furthermore, trustworthy for applications in, e.g., safety-critical areas. So far, however, no methods for quantifying the uncertainty of explanations have been conceived, which is problematic in domains where a high confidence in explanations is a prerequisite. We therefore contribute a new framework that converts any explanation method for neural networks into an explanation method for Bayesian neural networks, with built-in modeling of uncertainties. Within the Bayesian framework, a network's weights follow a distribution, which extends standard single explanation scores and heatmaps to distributions thereof, in this manner translating the intrinsic model uncertainties of the network into a quantification of explanation uncertainties. This allows us, for the first time, to carve out the uncertainties associated with a model explanation and subsequently gauge the appropriate level of explanation confidence for a user (using percentiles). We demonstrate the effectiveness and usefulness of our approach extensively in various experiments, both qualitatively and quantitatively.
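A minimal sketch of the general idea, assuming MC dropout as a stand-in for the weight posterior and plain gradient saliency as the base explanation method (both are assumptions, not necessarily the paper's choices): sample networks from the posterior, compute one explanation per sample, and summarize the resulting distribution of heatmaps with a mean and percentiles.

```python
# Sketch: a distribution over explanations obtained by sampling "weights"
# via MC dropout and computing a gradient saliency map per sample.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Dropout(0.5),
                      nn.Linear(32, 2))
x = torch.randn(1, 10, requires_grad=True)

model.train()                 # keep dropout active to sample weight masks
maps = []
for _ in range(50):
    out = model(x)[0, 1]      # explain class 1
    grad, = torch.autograd.grad(out, x)
    maps.append(grad.squeeze(0))
maps = torch.stack(maps)      # (50, 10): a distribution of explanations

mean_map = maps.mean(dim=0)
lo, hi = torch.quantile(maps, torch.tensor([0.05, 0.95]), dim=0)
print(mean_map, lo, hi)       # per-feature relevance with uncertainty bands
```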
XAI for Graphs: Explaining Graph Neural Network Predictions by Identifying Relevant Walks
Schnake, Thomas, Eberle, Oliver, Lederer, Jonas, Nakajima, Shinichi, Schütt, Kristof T., Müller, Klaus-Robert, Montavon, Grégoire
Graph Neural Networks (GNNs) are a popular approach for predicting graph-structured data. As GNNs tightly entangle the input graph into the neural network structure, common explainable AI (XAI) approaches are not applicable, and to a large extent GNNs have so far remained black boxes for the user. In this paper, we contribute a new XAI approach for GNNs. Our approach is derived from high-order Taylor expansions and decomposes the GNN prediction into a collection of relevant walks on the input graph. We find that these high-order Taylor expansions can be equivalently (and more simply) computed using multiple backpropagation passes from the top layer of the GNN to the first layer. The explanation can then be further robustified and generalized by using layer-wise relevance propagation (LRP) in place of the standard equations for gradient propagation. Our novel method, which we denote GNN-LRP, is tested on scale-free graphs, sentence parsing trees, molecular graphs, and pixel lattices representing images. In each case, it performs stably and accurately, and delivers interesting and novel application insights.
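A toy illustration of what a walk-level decomposition means (not GNN-LRP itself): for a purely linear two-layer GNN, the output of a node decomposes exactly into contributions of two-step walks i -> j -> k on the input graph. GNN-LRP extends this kind of walk attribution to nonlinear GNNs via higher-order Taylor expansions and LRP.

```python
# Toy linear 2-layer GNN: output[k] = sum over walks (i -> j -> k) of
# A[k, j] * A[j, i] * (X[i] @ W1 @ W2). The graph, features, and weights
# below are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 3
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)      # adjacency with self-loops
X = rng.normal(size=(n, d))                    # node features
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, 1))

output = A @ (A @ X @ W1) @ W2                 # linear 2-layer GNN, shape (n, 1)

k = 2                                          # node whose prediction we explain
walk_relevance = {(i, j): A[k, j] * A[j, i] * (X[i] @ W1 @ W2).item()
                  for i in range(n) for j in range(n)}
print(np.isclose(sum(walk_relevance.values()), output[k, 0]))  # True: exact decomposition
```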
Rethinking Assumptions in Deep Anomaly Detection
Ruff, Lukas, Vandermeulen, Robert A., Franks, Billy Joe, Müller, Klaus-Robert, Kloft, Marius
Though anomaly detection (AD) can be viewed as a classification problem (nominal vs. anomalous), it is usually treated in an unsupervised manner, since one typically does not have access to, or it is infeasible to utilize, a dataset that sufficiently characterizes what it means to be "anomalous." In this paper we present results demonstrating that this intuition surprisingly does not extend to deep AD on images. For a recent AD benchmark on ImageNet, classifiers trained to discern between normal samples and just a few (64) random natural images are able to outperform the current state of the art in deep AD. We find that this approach is also very effective on other common image AD benchmarks. Experimentally, we discover that the multiscale structure of image data makes example anomalies exceptionally informative.
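A minimal sketch of this kind of setup: treat AD as binary classification between nominal samples and a tiny auxiliary pool of "anomalous" images (64 in the paper). Random tensors stand in for real data, and the backbone and hyperparameters are illustrative, not the paper's configuration.

```python
# Sketch: a standard binary classifier trained on nominal data vs. just 64
# auxiliary outlier images; the sigmoid of the logit is the anomaly score.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128),
                         nn.ReLU(), nn.Linear(128, 1))
criterion = nn.BCEWithLogitsLoss()
optim = torch.optim.Adam(backbone.parameters(), lr=1e-3)

nominal = torch.randn(512, 3, 32, 32)      # stand-in for the nominal class
outliers = torch.randn(64, 3, 32, 32)      # just 64 "random natural images"

for _ in range(10):                        # a few toy epochs
    x = torch.cat([nominal, outliers])
    y = torch.cat([torch.zeros(len(nominal)), torch.ones(len(outliers))])
    loss = criterion(backbone(x).squeeze(1), y)
    optim.zero_grad()
    loss.backward()
    optim.step()

scores = torch.sigmoid(backbone(torch.randn(5, 3, 32, 32))).squeeze(1)  # anomaly scores
```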
Ensemble Learning of Coarse-Grained Molecular Dynamics Force Fields with a Kernel Approach
Wang, Jiang, Chmiela, Stefan, Müller, Klaus-Robert, Noé, Frank, Clementi, Cecilia
Gradient-domain machine learning (GDML) is an accurate and efficient approach for learning a molecular potential and the associated force field based on the kernel ridge regression algorithm. Here, we demonstrate its application to learning an effective coarse-grained (CG) model from all-atom simulation data in a sample-efficient manner. The coarse-grained force field is learned by following the thermodynamic consistency principle, i.e., by minimizing the error between the predicted coarse-grained force and the all-atom mean force in the coarse-grained coordinates. Solving this problem with GDML directly is impossible because coarse-graining requires averaging over many training data points, resulting in impractical memory requirements for storing the kernel matrices. In this work, we propose a data-efficient and memory-saving alternative: using ensemble learning and stratified sampling, we develop a 2-layer training scheme that enables GDML to learn an effective coarse-grained model. We illustrate our method on a simple biomolecular system, alanine dipeptide, by reconstructing the free energy landscape of a coarse-grained variant of this molecule. Our novel GDML training scheme yields a smaller free energy error than neural networks when the training set is small, and comparably high accuracy when the training set is sufficiently large.
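A schematic sketch of the two-layer ensemble idea only: split the noisy mean-force data into stratified batches, fit one kernel model per batch, and average the ensemble. sklearn's KernelRidge is used here purely as a stand-in for the GDML kernel (the real method learns forces in the gradient domain of the potential), and the data, stratification criterion, and hyperparameters are toy assumptions.

```python
# Sketch: stratified sampling + an ensemble of kernel regressors, averaged to
# give an effective coarse-grained force field. KernelRidge is a stand-in for
# the GDML kernel; all data below are synthetic placeholders.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
R = rng.normal(size=(2000, 6))                 # toy CG coordinates
F = -R + 0.1 * rng.normal(size=R.shape)        # toy noisy all-atom mean forces

# Layer 1: stratify by a simple order parameter (here the norm of R) and fit
# one ensemble member on each stratum-balanced batch.
order = np.linalg.norm(R, axis=1)
strata = np.digitize(order, np.quantile(order, [0.25, 0.5, 0.75]))
models = []
for _ in range(8):
    idx = np.concatenate([rng.choice(np.where(strata == s)[0], size=50)
                          for s in range(4)])
    models.append(KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.1)
                  .fit(R[idx], F[idx]))

# Layer 2: the ensemble average acts as the effective coarse-grained model.
R_test = rng.normal(size=(5, 6))
F_pred = np.mean([m.predict(R_test) for m in models], axis=0)
```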
Layer-wise analysis of deep networks with Gaussian kernels
Montavon, Grégoire, Müller, Klaus-Robert, Braun, Mikio L.
Deep networks can potentially express a learning problem more efficiently than local learning machines. While deep networks outperform local learning machines on some problems, it is still unclear how their good representations emerge from their complex structure. We present an analysis based on Gaussian kernels that measures how the representation of the learning problem evolves layer after layer as the deep network builds higher-level abstract representations of the input. We use this analysis to show empirically that deep networks build progressively better representations of the learning problem and that the best representations are obtained when the deep network discriminates only in the last layers.
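A minimal sketch of the measurement procedure, assuming a small toy MLP and kernel-target alignment as a proxy for the paper's layer-wise quality measure (both assumptions): extract the representation at each layer, build a Gaussian kernel on it, and track the score layer by layer.

```python
# Sketch: a Gaussian (RBF) kernel is built on each layer's representation and
# scored against the labels via kernel-target alignment. Network, data, and
# labels are random placeholders; the point is the per-layer measurement loop.
import numpy as np
import torch
import torch.nn as nn

def rbf_kernel(Z, gamma=0.1):
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def alignment(K, y):
    Y = np.outer(y, y)                       # ideal kernel from +-1 labels
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))

layers = nn.ModuleList([nn.Sequential(nn.Linear(20, 20), nn.ReLU())
                        for _ in range(4)])
X = torch.randn(200, 20)
y = np.sign(np.random.default_rng(0).normal(size=200))

Z = X
for i, layer in enumerate(layers):
    Z = layer(Z)
    K = rbf_kernel(Z.detach().numpy())
    print(f"layer {i}: alignment = {alignment(K, y):.3f}")
```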