AITopics | Samek, Wojciech

Collaborating Authors

Samek, Wojciech

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Explaining Recurrent Neural Network Predictions in Sentiment Analysis

Arras, Leila, Montavon, Grégoire, Müller, Klaus-Robert, Samek, Wojciech

arXiv.org Machine LearningAug-4-2017

Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown to deliver insightful explanations in the form of input space relevances for understanding feed-forward neural network classification decisions. In the present work, we extend the usage of LRP to recurrent neural networks. We propose a specific propagation rule applicable to multiplicative connections as they arise in recurrent network architectures such as LSTMs and GRUs. We apply our technique to a word-based bi-directional LSTM model on a five-class sentiment prediction task, and evaluate the resulting LRP relevances both qualitatively and quantitatively, obtaining better results than a gradient-based related method which was used in previous work.

deep learning, neural network, relevance, (20 more...)

arXiv.org Machine Learning

1706.07206

Country: Europe > Germany (0.28)

Genre: Research Report (1.00)

Industry: Media (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Object Boundary Detection and Classification with Image-level Labels

Koh, Jing Yu, Samek, Wojciech, Müller, Klaus-Robert, Binder, Alexander

arXiv.org Machine LearningJun-25-2017

Semantic boundary and edge detection aims at simultaneously detecting object edge pixels in images and assigning class labels to them. Systematic training of predictors for this task requires the labeling of edges in images which is a particularly tedious task. We propose a novel strategy for solving this task, when pixel-level annotations are not available, performing it in an almost zero-shot manner by relying on conventional whole image neural net classifiers that were trained using large bounding boxes. Our method performs the following two steps at test time. Firstly it predicts the class labels by applying the trained whole image network to the test images. Secondly, it computes pixel-wise scores from the obtained predictions by applying backprop gradients as well as recent visualization algorithms such as deconvolution and layer-wise relevance propagation. We show that high pixel-wise scores are indicative for the location of semantic boundaries, which suggests that the semantic boundary problem can be approached without using edge labels during the training phase.

artificial intelligence, neural network, pixel, (9 more...)

arXiv.org Machine Learning

1606.09187

Country:

Europe > Germany (0.14)
Asia (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Methods for Interpreting and Understanding Deep Neural Networks

Montavon, Grégoire, Samek, Wojciech, Müller, Klaus-Robert

arXiv.org Machine LearningJun-24-2017

This paper provides an entry point to the problem of interpreting a deep neural network model and explaining its predictions. It is based on a tutorial given at ICASSP 2017. It introduces some recently proposed techniques of interpretation, along with theory, tricks and recommendations, to make most efficient use of these techniques on real data. It also discusses a number of practical applications.

deep learning, explanation, neural network, (20 more...)

arXiv.org Machine Learning

doi: 10.1016/j.dsp.2017.10.011

1706.07979

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Instructional Material > Course Syllabus & Notes (0.66)
Research Report (0.50)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Sharing Hash Codes for Multiple Purposes

Pronobis, Wikor, Panknin, Danny, Kirschnick, Johannes, Srinivasan, Vignesh, Samek, Wojciech, Markl, Volker, Kaul, Manohar, Mueller, Klaus-Robert, Nakajima, Shinichi

arXiv.org Machine LearningJun-1-2017

Locality sensitive hashing (LSH) is a powerful tool for sublinear-time approximate nearest neighbor search, and a variety of hashing schemes have been proposed for different dissimilarity measures. However, hash codes significantly depend on the dissimilarity, which prohibits users from adjusting the dissimilarity at query time. In this paper, we propose {multiple purpose LSH (mp-LSH) which shares the hash codes for different dissimilarities. mp-LSH supports L2, cosine, and inner product dissimilarities, and their corresponding weighted sums, where the weights can be adjusted at query time. It also allows us to modify the importance of pre-defined groups of features. Thus, mp-LSH enables us, for example, to retrieve similar items to a query with the user preference taken into account, to find a similar material to a query with some properties (stability, utility, etc.) optimized, and to turn on or off a part of multi-modal information (brightness, color, audio, text, etc.) in image/video retrieval. We theoretically and empirically analyze the performance of three variants of mp-LSH, and demonstrate their usefulness on real-world data sets.

artificial intelligence, dissimilarity, natural language, (19 more...)

arXiv.org Machine Learning

1609.03219

Country: Europe > Germany (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.66)

Add feedback

"What is Relevant in a Text Document?": An Interpretable Machine Learning Approach

Arras, Leila, Horn, Franziska, Montavon, Grégoire, Müller, Klaus-Robert, Samek, Wojciech

arXiv.org Machine LearningDec-22-2016

Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, allowing to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text's category very accurately, it is also highly desirable to understand how and why the categorization process takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task and adapt the LRP method to decompose the predictions of these models onto words. Resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores for generating novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability which makes it more comprehensible for humans and potentially more useful for other applications.

neural network, relevance, text processing, (22 more...)

arXiv.org Machine Learning

doi: 10.1371/journal.pone.0181142

1612.07843

Country: Europe > Germany (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.90)

Add feedback

Interpreting the Predictions of Complex ML Models by Layer-wise Relevance Propagation

Samek, Wojciech, Montavon, Grégoire, Binder, Alexander, Lapuschkin, Sebastian, Müller, Klaus-Robert

arXiv.org Machine LearningNov-24-2016

Complex nonlinear models such as deep neural network (DNNs) have become an important tool for image classification, speech recognition, natural language processing, and many other fields of application. These models however lack transparency due to their complex nonlinear structure and to the complex data distributions to which they typically apply. As a result, it is difficult to fully characterize what makes these models reach a particular decision for a given input. This lack of transparency can be a drawback, especially in the context of sensitive applications such as medical analysis or security. In this short paper, we summarize a recent technique introduced by Bach et al. [1] that explains predictions by decomposing the classification decision of DNN models in terms of input variables.

deep learning, neural network, prediction, (19 more...)

arXiv.org Machine Learning

1611.08191

Country:

Europe > Germany (0.16)
Europe > Spain (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

Identifying individual facial expressions by deconstructing a neural network

Arbabzadah, Farhad, Montavon, Grégoire, Müller, Klaus-Robert, Samek, Wojciech

arXiv.org Machine LearningJun-25-2016

This paper focuses on the problem of explaining predictions of psychological attributes such as attractiveness, happiness, confidence and intelligence from face photographs using deep neural networks. Since psychological attribute datasets typically suffer from small sample sizes, we apply transfer learning with two base models to avoid overfitting. These models were trained on an age and gender prediction task, respectively. Using a novel explanation method we extract heatmaps that highlight the parts of the image most responsible for the prediction. We further observe that the explanation method provides important insights into the nature of features of the base model, which allow one to assess the aptitude of the base model for a given transfer learning task. Finally, we observe that the multiclass model is more feature rich than its binary counterpart. The experimental evaluation is performed on the 2222 images from the 10k US faces dataset containing psychological attribute labels as well as on a subset of KDEF images.

deep learning, identifying individual facial expression, neural network, (15 more...)

arXiv.org Machine Learning

1606.07285

Country: Europe > Germany (0.29)

Genre: Research Report (0.91)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Add feedback

Explaining Predictions of Non-Linear Classifiers in NLP

Arras, Leila, Horn, Franziska, Montavon, Grégoire, Müller, Klaus-Robert, Samek, Wojciech

arXiv.org Machine LearningJun-23-2016

Layer-wise relevance propagation (LRP) is a recently proposed technique for explaining predictions of complex non-linear classifiers in terms of input variables. In this paper, we apply LRP for the first time to natural language processing (NLP). More precisely, we use it to explain the predictions of a convolutional neural network (CNN) trained on a topic categorization task. Our analysis highlights which words are relevant for a specific prediction of the CNN. We compare our technique to standard sensitivity analysis, both qualitatively and quantitatively, using a "word deleting" perturbation experiment, a PCA analysis, and various visualizations. All experiments validate the suitability of LRP for explaining the CNN predictions, which is also in line with results reported in recent image classification studies.

deep learning, neural network, relevance, (20 more...)

arXiv.org Machine Learning

1606.07298

Country: Europe > Germany (0.28)

Genre: Research Report (0.50)

Industry: Government > Regional Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Interpretable Deep Neural Networks for Single-Trial EEG Classification

Sturm, Irene, Bach, Sebastian, Samek, Wojciech, Müller, Klaus-Robert

arXiv.org Machine LearningApr-27-2016

Background: In cognitive neuroscience the potential of Deep Neural Networks (DNNs) for solving complex classification tasks is yet to be fully exploited. The most limiting factor is that DNNs as notorious 'black boxes' do not provide insight into neurophysiological phenomena underlying a decision. Layer-wise Relevance Propagation (LRP) has been introduced as a novel method to explain individual network decisions. New Method: We propose the application of DNNs with LRP for the first time for EEG data analysis. Through LRP the single-trial DNN decisions are transformed into heatmaps indicating each data point's relevance for the outcome of the decision. Results: DNN achieves classification accuracies comparable to those of CSP-LDA. In subjects with low performance subject-to-subject transfer of trained DNNs can improve the results. The single-trial LRP heatmaps reveal neurophysiologically plausible patterns, resembling CSP-derived scalp maps. Critically, while CSP patterns represent class-wise aggregated information, LRP heatmaps pinpoint neural patterns to single time points in single trials. Comparison with Existing Method(s): We compare the classification performance of DNNs to that of linear CSP-LDA on two data sets related to motor-imaginery BCI. Conclusion: We have demonstrated that DNN is a powerful non-linear tool for EEG analysis. With LRP a new quality of high-resolution assessment of neural activity can be reached. LRP is a potential remedy for the lack of interpretability of DNNs that has limited their utility in neuroscientific applications. The extreme specificity of the LRP-derived heatmaps opens up new avenues for investigating neural activity underlying complex perception or decision-related processes.

deep learning, dnn, neural network, (20 more...)

arXiv.org Machine Learning

1604.08201

Country: Europe > Germany (0.15)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

Explaining NonLinear Classification Decisions with Deep Taylor Decomposition

Montavon, Grégoire, Bach, Sebastian, Binder, Alexander, Samek, Wojciech, Müller, Klaus-Robert

arXiv.org Machine LearningDec-8-2015

Nonlinear methods such as Deep Neural Networks (DNNs) are the gold standard for various challenging machine learning problems, e.g., image classification, natural language processing or human action recognition. Although these methods perform impressively well, they have a significant disadvantage, the lack of transparency, limiting the interpretability of the solution and thus the scope of application in practice. Especially DNNs act as black boxes due to their multilayer nonlinear structure. In this paper we introduce a novel methodology for interpreting generic multilayer neural networks by decomposing the network classification decision into contributions of its input elements. Although our focus is on image classification, the method is applicable to a broad set of input data, learning tasks and network architectures. Our method is based on deep Taylor decomposition and efficiently utilizes the structure of the network by backpropagating the explanations from the output to the input layer. We evaluate the proposed method empirically on the MNIST and ILSVRC data sets.

deep learning, neural network, relevance, (18 more...)

arXiv.org Machine Learning

doi: 10.1016/j.patcog.2016.11.008

1512.02479

Country:

Europe > Germany (0.28)
North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.50)

Industry:

Education (0.48)
Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback