
Collaborating Authors: Müller, Klaus-Robert


Wasserstein Training of Restricted Boltzmann Machines

Neural Information Processing Systems

Boltzmann machines are able to learn highly complex, multimodal, structured and multiscale real-world data distributions. Parameters of the model are usually learned by minimizing the Kullback-Leibler (KL) divergence from training samples to the learned model. In this work, we propose a novel approach to Boltzmann machine training that assumes a meaningful metric between observations is known. This metric can then be used to define the Wasserstein distance between the distribution induced by the Boltzmann machine on the one hand, and the distribution given by the training sample on the other. We derive the gradient of this distance with respect to the model parameters.
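
To make the objective concrete, the following shows the entropy-smoothed Wasserstein distance commonly used to make this kind of training tractable, together with the form its gradient takes. The notation (the regularizer gamma and the dual potential alpha) is ours and follows the standard Sinkhorn-style smoothing; details may differ from the paper.

```latex
% Smoothed Wasserstein distance between the empirical distribution \hat{p}
% and the model distribution p_\theta, for a metric-induced cost matrix M:
W_\gamma(\hat{p}, p_\theta)
    = \min_{\pi \in \Pi(\hat{p},\, p_\theta)}
      \langle \pi, M \rangle - \gamma H(\pi)
% \Pi(\hat{p}, p_\theta): set of couplings with marginals \hat{p} and
% p_\theta;  H(\pi): entropy of the coupling.  By duality, the gradient
% with respect to the model parameters can be written via the optimal
% dual potential \alpha^\star on the model side:
\nabla_\theta W_\gamma
    = \sum_{x} \alpha^\star(x)\, \nabla_\theta p_\theta(x)
    = \mathbb{E}_{x \sim p_\theta}\!\left[\, \alpha^\star(x)\,
      \nabla_\theta \log p_\theta(x) \,\right]
% so the expectation can be estimated with the same Gibbs samples used in
% standard KL-based RBM training.
```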


Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints

arXiv.org Machine Learning

Federated Learning (FL) is currently the most widely adopted framework for collaborative training of (deep) machine learning models under privacy constraints. Despite its popularity, it has been observed that Federated Learning yields suboptimal results if the local clients' data distributions diverge. To address this issue, we present Clustered Federated Learning (CFL), a novel Federated Multi-Task Learning (FMTL) framework that exploits geometric properties of the FL loss surface to group the client population into clusters with jointly trainable data distributions. In contrast to existing FMTL approaches, CFL does not require any modifications to the FL communication protocol, is applicable to general non-convex objectives (in particular deep neural networks), and comes with strong mathematical guarantees on the clustering quality. CFL is flexible enough to handle client populations that vary over time and can be implemented in a privacy-preserving way. As clustering is only performed after Federated Learning has converged to a stationary point, CFL can be viewed as a post-processing method that always achieves performance greater than or equal to that of conventional FL, by allowing clients to arrive at more specialized models. We verify our theoretical analysis in experiments with deep convolutional and recurrent neural networks on commonly used Federated Learning datasets.
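
As a rough sketch of the clustering step (our reconstruction, not the authors' code): once FL has converged, clients can be compared via the cosine similarity of their flattened weight updates and split into two groups so that the maximum cross-group similarity is minimized. Function and variable names below are ours.

```python
import numpy as np
from itertools import combinations

def cosine_similarity_matrix(updates):
    """Pairwise cosine similarities between the clients' flattened
    weight updates (one 1-D array per client)."""
    U = np.stack([u / np.linalg.norm(u) for u in updates])
    return U @ U.T

def bipartition(sim):
    """Split clients into two groups such that the maximum cross-group
    cosine similarity is minimized (brute force, for illustration only;
    feasible for small client counts)."""
    n = sim.shape[0]
    best_cost, best_split = np.inf, None
    for r in range(1, n // 2 + 1):
        for group in combinations(range(n), r):
            rest = [i for i in range(n) if i not in group]
            cost = max(sim[i, j] for i in group for j in rest)
            if cost < best_cost:
                best_cost, best_split = cost, (list(group), rest)
    return best_split
```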


Towards Explainable Artificial Intelligence

arXiv.org Artificial Intelligence

In recent years, machine learning (ML) has become a key enabling technology for the sciences and industry. Especially through improvements in methodology, the availability of large databases and increased computational power, today's ML algorithms are able to achieve excellent performance (at times even exceeding the human level) on an increasing number of complex tasks. Deep learning models are at the forefront of this development. However, due to their nested non-linear structure, these powerful models have generally been considered "black boxes", not providing any information about what exactly makes them arrive at their predictions. Since in many applications, e.g., in the medical domain, such a lack of transparency may not be acceptable, the development of methods for visualizing, explaining and interpreting deep learning models has recently attracted increasing attention. This introductory paper presents recent developments and applications in this field and makes a plea for a wider use of explainable learning algorithms in practice.


Explaining and Interpreting LSTMs

arXiv.org Machine Learning

While neural networks have acted as a strong unifying force in the design of modern AI systems, the neural network architectures themselves remain highly heterogeneous due to the variety of tasks to be solved. In this chapter, we explore how to adapt the Layer-wise Relevance Propagation (LRP) technique used for explaining the predictions of feed-forward networks to the LSTM architecture used for sequential data modeling and forecasting. The special accumulators and gated interactions present in the LSTM require both a new propagation scheme and an extension of the underlying theoretical framework to deliver faithful explanations.
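
As a rough illustration of the adapted propagation scheme (a sketch under our assumptions, not the chapter's reference code): multiplicative gate/signal interactions in the LSTM are commonly handled by routing all incoming relevance to the signal branch, while linear layers use a standard epsilon-stabilized LRP rule.

```python
import numpy as np

def lrp_gated_interaction(R_out, gate, signal):
    """Gated interaction p = gate * signal (e.g. input gate times cell
    candidate): the gate only modulates the signal, so incoming relevance
    is routed entirely to the signal branch ('signal-takes-all' rule)."""
    R_gate = np.zeros_like(gate)
    R_signal = R_out
    return R_gate, R_signal

def lrp_linear(R_out, W, x, z_out, eps=1e-6):
    """Epsilon-LRP through a linear map z_out = W @ x: each input unit
    receives relevance proportional to its contribution W[i, j] * x[j]."""
    z = z_out + eps * np.sign(z_out)   # stabilized denominators
    return x * (W.T @ (R_out / z))     # relevance on the inputs
```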


Deep Transfer Learning For Whole-Brain fMRI Analyses

arXiv.org Machine Learning

The application of deep learning (DL) models to the decoding of cognitive states from whole-brain functional Magnetic Resonance Imaging (fMRI) data is often hindered by the small sample size and high dimensionality of these datasets; this is especially true in clinical settings, where patient data are scarce. In this work, we demonstrate that transfer learning represents a solution to this problem. In particular, we show that a DL model, which has been previously trained on a large openly available fMRI dataset of the Human Connectome Project, outperforms a model variant with the same architecture, but which is trained from scratch, when both are applied to the data of a new, unrelated fMRI task. Further, the pre-trained DL model variant already correctly decodes 67.51% of the cognitive states from a test dataset of 100 individuals when fine-tuned on data from only three subjects.
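
A minimal sketch of the described transfer procedure in PyTorch, assuming a hypothetical 3D-convolutional decoder and checkpoint name (the actual architecture, data handling, and hyperparameters in the paper differ):

```python
import torch
import torch.nn as nn

class BrainDecoder(nn.Module):
    """Hypothetical whole-brain fMRI decoder: a small 3D-conv backbone
    followed by a linear classification head over cognitive states."""
    def __init__(self, n_classes):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(4), nn.Flatten(),
        )
        self.head = nn.Linear(16 * 4 ** 3, n_classes)

    def forward(self, x):
        return self.head(self.backbone(x))

# Load weights pre-trained on the large source dataset (hypothetical file),
# then swap the head for the new task's set of cognitive states.
model = BrainDecoder(n_classes=20)
model.load_state_dict(torch.load("hcp_pretrained.pt"))
model.head = nn.Linear(16 * 4 ** 3, 5)

# Fine-tune with a small learning rate; with only a few subjects one may
# instead freeze the backbone and train the new head alone.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```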


Explanations can be manipulated and geometry is to blame

arXiv.org Machine Learning

Explanation methods aim to make neural networks more trustworthy and interpretable. In this paper, we demonstrate a property of explanation methods which is disconcerting for both of these purposes. Namely, we show that explanations can be manipulated arbitrarily by applying visually hardly perceptible perturbations to the input that keep the network's output approximately constant. We establish theoretically that this phenomenon can be related to certain geometrical properties of neural networks. This allows us to derive an upper bound on the susceptibility of explanations to manipulations. Based on this result, we propose effective mechanisms to enhance the robustness of explanations.
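
A minimal sketch of such a manipulation for a gradient-based explanation (our reconstruction; the loss weighting and step count are assumptions). For piecewise-linear (ReLU) networks the required second derivatives vanish almost everywhere, so in practice one substitutes a smooth activation such as softplus before running the attack, which is also the geometric handle on which the paper's robustness mechanisms build.

```python
import torch

def manipulate_explanation(model, x, target_expl, steps=200, lr=1e-3, lam=1e2):
    """Optimize a perturbed input whose gradient explanation matches a
    target map while the model output stays approximately constant.
    Assumes `model` returns a scalar-valued (or summed) score."""
    x_adv = x.clone().detach().requires_grad_(True)
    y0 = model(x).detach()                        # output to be preserved
    opt = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        out = model(x_adv)
        expl, = torch.autograd.grad(out.sum(), x_adv, create_graph=True)
        loss = ((expl - target_expl) ** 2).sum() \
               + lam * ((out - y0) ** 2).sum()    # keep prediction constant
        opt.zero_grad()
        loss.backward()
        opt.step()
    return x_adv.detach()
```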


From Clustering to Cluster Explanations via Neural Networks

arXiv.org Machine Learning

A wealth of algorithms have been developed to extract natural cluster structure in data. Identifying this structure is desirable but not always sufficient: We may also want to understand why the data points have been assigned to a given cluster. Clustering algorithms do not offer a systematic answer to this simple question. Hence we propose a new framework that can, for the first time, explain cluster assignments in terms of input features in a comprehensive manner. It is based on the novel theoretical insight that clustering models can be rewritten as neural networks, or 'neuralized'. Predictions of the obtained networks can then be quickly and accurately attributed to the input features. Several showcases demonstrate the ability of our method to assess the quality of learned clusters and to extract novel insights from the analyzed data and representations.
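
To make the 'neuralization' idea concrete, here is a minimal sketch for k-means (our example, not the paper's code). The evidence for assigning a point x to cluster c, f_c(x) = min over k != c of (||x - mu_k||^2 - ||x - mu_c||^2), expands into a function that is linear in x, so the assignment is exactly a linear layer followed by a min-pooling, after which standard attribution techniques apply.

```python
import numpy as np

def neuralize_kmeans(centroids):
    """Rewrite the k-means assignment score for cluster c as a two-layer
    network: one linear unit h_k(x) = w_k @ x + b_k per competing cluster
    k, followed by min-pooling.  f_c(x) > 0 iff x is assigned to c."""
    def f_c(x, c):
        mu_c = centroids[c]
        scores = []
        for k, mu_k in enumerate(centroids):
            if k == c:
                continue
            w = 2.0 * (mu_c - mu_k)                        # linear weights
            b = np.dot(mu_k, mu_k) - np.dot(mu_c, mu_c)    # bias
            scores.append(w @ x + b)                       # h_k(x)
        return min(scores)                                 # min-pooling
    return f_c
```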


Deep Semi-Supervised Anomaly Detection

arXiv.org Machine Learning

Deep approaches to anomaly detection have recently shown promising results over shallow approaches on high-dimensional data. Typically, anomaly detection is treated as an unsupervised learning problem. In practice, however, one may have access not only to a large set of unlabeled samples but also to a small pool of labeled samples, e.g., a subset verified by some domain expert as being normal or anomalous. Semi-supervised approaches to anomaly detection make use of such labeled data to improve detection performance. Few deep semi-supervised approaches to anomaly detection have been proposed so far, and those that exist are domain-specific. In this work, we present Deep SAD, an end-to-end methodology for deep semi-supervised anomaly detection. Using an information-theoretic perspective on anomaly detection, we derive a loss motivated by the idea that the entropy of the latent distribution of normal data should be lower than the entropy of the anomalous distribution. We demonstrate in extensive experiments on MNIST, Fashion-MNIST, and CIFAR-10, along with other anomaly detection benchmark datasets, that our approach is on par with or outperforms shallow, hybrid, and deep competitors, even when provided with only a few labeled training samples.
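
The entropy intuition translates into a compact objective. Below is our paraphrase of the Deep SAD loss (function names and the masking convention are ours): unlabeled and labeled-normal points are pulled toward a fixed center c in latent space, while labeled anomalies are pushed away via an inverse distance.

```python
import torch

def deep_sad_loss(z, c, y, eta=1.0, eps=1e-6):
    """Deep SAD objective on a batch of latent codes z.
    y = 0 for unlabeled, +1 for labeled normal, -1 for labeled anomalous;
    c is the fixed hypersphere center; eta weights the labeled term."""
    d2 = ((z - c) ** 2).sum(dim=1)        # squared distance to center
    unlabeled = (y == 0).float()
    labeled = 1.0 - unlabeled
    # d2 ** y: distance for y = +1, inverse distance for y = -1,
    # so labeled anomalies are driven away from the center.
    labeled_term = (d2 + eps) ** y.float()
    return (unlabeled * d2 + eta * labeled * labeled_term).mean()
```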


Evaluating Recurrent Neural Network Explanations

arXiv.org Machine Learning

Recently, several methods have been proposed to explain the predictions of recurrent neural networks (RNNs), in particular of LSTMs. The goal of these methods is to understand the network's decisions by assigning to each input variable, e.g., a word, a relevance score indicating the extent to which it contributed to a particular prediction. In previous works, some of these methods were not yet compared to one another, or were evaluated only qualitatively. We close this gap by systematically and quantitatively comparing these methods in different settings, namely (1) a toy arithmetic task, which we use as a sanity check, (2) a five-class sentiment prediction task on movie reviews, and (3) the usefulness of word relevances for building sentence-level representations. Lastly, using the method that performed best in our experiments, we show how specific linguistic phenomena, such as negation in sentiment analysis, are reflected in the relevance patterns, and how relevance visualization can help to understand the misclassification of individual samples.
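
One common way to compare such relevance methods quantitatively, sketched below under our own assumptions about the protocol, is a deletion curve: remove words in decreasing order of relevance and track how quickly the classifier's confidence decays; the steeper the decay, the more faithful the explanation.

```python
import numpy as np

def deletion_curve(predict, tokens, relevance, mask_token="<unk>"):
    """Delete tokens in decreasing order of relevance and record the
    model's confidence in the original class after each deletion.
    `predict` maps a token list to that confidence (a float)."""
    order = np.argsort(relevance)[::-1]   # most relevant first
    current = list(tokens)
    scores = [predict(current)]
    for i in order:
        current[i] = mask_token
        scores.append(predict(current))
    return np.array(scores)               # one curve per explanation method
```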


Comment on "Solving Statistical Mechanics Using VANs": Introducing saVANt - VANs Enhanced by Importance and MCMC Sampling

arXiv.org Machine Learning

In this comment on "Solving Statistical Mechanics Using Variational Autoregressive Networks" by Wu et al., we propose a subtle yet powerful modification of their approach. We show that the inherent sampling error of their method can be corrected by using neural network-based MCMC or importance sampling which leads to asymptotically unbiased estimators for physical quantities. This modification is possible due to a singular property of VANs, namely that they provide the exact sample probability. With these modifications, we believe that their method could have a substantially greater impact on various important fields of physics, including strongly-interacting field theories and statistical physics.
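
A minimal sketch of the importance-sampling correction (our reconstruction, with hypothetical function names): because a VAN provides the exact log-probability of each of its samples, self-normalized importance weights against the Boltzmann distribution can be formed directly.

```python
import numpy as np

def importance_estimate(observable, samples, log_q, energy, beta):
    """Asymptotically unbiased estimate of <O> under the Boltzmann
    distribution p(x) ~ exp(-beta * E(x)), from samples x ~ q drawn by a
    VAN whose exact sample log-probabilities log_q(x) are available.
    Uses self-normalized weights w ~ exp(-beta * E(x)) / q(x)."""
    log_w = np.array([-beta * energy(x) - lq
                      for x, lq in zip(samples, log_q)])
    log_w -= log_w.max()                  # numerical stabilization
    w = np.exp(log_w)
    w /= w.sum()                          # self-normalization
    vals = np.array([observable(x) for x in samples])
    return (w * vals).sum()
```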