AITopics | Rätsch, Gunnar

Rätsch, Gunnar

Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning

Immer, Alexander, Bauer, Matthias, Fortuin, Vincent, Rätsch, Gunnar, Khan, Mohammad Emtiyaz

arXiv.org Machine Learning, May-11-2021

Marginal-likelihood based model-selection, even though promising, is rarely used in deep learning due to estimation difficulties. Instead, most approaches rely on validation data, which may not be readily available. In this work, we present a scalable marginal-likelihood estimation method to select both the hyperparameters and network architecture based on the training data alone. Some hyperparameters can be estimated online during training, simplifying the procedure. Our marginal-likelihood estimate is based on Laplace's method and Gauss-Newton approximations to the Hessian, and it outperforms cross-validation and manual-tuning on standard regression and image classification datasets, especially in terms of calibration and out-of-distribution detection. Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable (e.g., in nonstationary settings).
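As a rough sketch of the kind of estimate the abstract describes, Laplace's method approximates the log marginal likelihood of a model around a posterior mode. The notation here is assumed for illustration and does not come from the listing: $\mathcal{D}$ is the training data, $\mathcal{M}$ the model (architecture and hyperparameters), $\theta_*$ a mode of the posterior over the $P$ parameters, $J_n$ the Jacobian of the network output for example $n$, $\Lambda_n$ the Hessian of the negative log-likelihood with respect to that output, and $P_0$ the prior precision. The paper's exact formulation may differ.

```latex
\log p(\mathcal{D} \mid \mathcal{M})
  \approx \log p(\mathcal{D} \mid \theta_*, \mathcal{M})
        + \log p(\theta_* \mid \mathcal{M})
        + \frac{P}{2} \log 2\pi
        - \frac{1}{2} \log \det \mathbf{H}_{\theta_*},
\qquad
\mathbf{H}_{\theta_*}
  = -\nabla^2_{\theta} \log p(\mathcal{D}, \theta \mid \mathcal{M}) \big|_{\theta = \theta_*}
  \approx \sum_{n} J_n^\top \Lambda_n J_n + P_0 .
```

The last step is the generalized Gauss-Newton approximation, which keeps the matrix positive semi-definite so that the log-determinant is well defined.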

approximation, deep learning, neural network, (18 more...)

2104.04975

Country:

North America > United States > Arizona (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.50)

Bayesian Neural Network Priors Revisited

Fortuin, Vincent, Garriga-Alonso, Adrià, Wenzel, Florian, Rätsch, Gunnar, Turner, Richard, van der Wilk, Mark, Aitchison, Laurence

arXiv.org Machine Learning, Feb-12-2021

In a Bayesian neural network (BNN), we specify a prior $p(w)$ over the neural network parameters, and compute the posterior distribution over parameters conditioned on training data, $p(w \mid x, y) = p(y \mid w, x)\,p(w) / p(y \mid x)$. This procedure should give considerable advantages for reasoning about predictive uncertainty, which is especially relevant in the small-data setting. Crucially, to perform Bayesian inference, we need to choose a prior that accurately reflects our beliefs about the parameters before seeing any data (Bayes, 1763; Gelman et al., 2013). However, the most common choice of the prior for BNN weights is the simplest one: the isotropic Gaussian. Isotropic Gaussians are used across almost all fields of Bayesian deep learning, ranging from variational inference (Blundell et al., 2015; Dusenberry et al., 2020), to sampling-based inference (Zhang et al., 2019), and even to infinite networks (Lee et al., 2017; Garriga-Alonso et al., 2019). This is troubling, since isotropic Gaussian priors are almost certainly not the best choice. Indeed, despite the progress on more accurate and efficient inference procedures, in most settings, the posterior predictive of BNNs using a Gaussian prior still leads to worse predictive performance than a baseline obtained by training the network with standard stochastic gradient descent (SGD) (e.g., Zhang et al., 2019; Heek & Kalchbrenner, 2019; Wenzel et al., 2020a). However, it has been shown that the performance of BNNs can be improved by artificially reducing posterior uncertainty using "cold posteriors" (Wenzel et al., 2020a).
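For readers unfamiliar with the term, a "cold posterior" is commonly written as a tempered version of the Bayes posterior. The temperature symbol $T$ and the subscripted $p_T$ below are notation assumed for this sketch, not taken from the excerpt above.

```latex
p_T(w \mid x, y) \;\propto\; \bigl[\, p(y \mid w, x)\, p(w) \,\bigr]^{1/T},
\qquad T = 1 \text{ recovers the Bayes posterior}, \quad T < 1 \text{ gives a ``cold'' posterior}.
```

Lowering $T$ below 1 concentrates the distribution around high-probability parameters, which is the sense in which posterior uncertainty is artificially reduced.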

cold posterior effect, deep learning, neural network, (15 more...)

2102.06571

Country:

Europe (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

On Disentanglement in Gaussian Process Variational Autoencoders

Bing, Simon, Fortuin, Vincent, Rätsch, Gunnar

arXiv.org Machine Learning, Feb-10-2021

Complex multivariate time series arise in many fields, ranging from computer vision to robotics or medicine. Often we are interested in the independent underlying factors that give rise to the high-dimensional data we are observing. While many models have been introduced to learn such disentangled representations, only few attempt to explicitly exploit the structure of sequential data. We investigate the disentanglement properties of Gaussian process variational autoencoders, a class of models recently introduced that have been successful in different tasks on time series data. Our model exploits the temporal structure of the data by modeling each latent channel with a GP prior and employing a structured variational distribution that can capture dependencies in time. We demonstrate the competitiveness of our approach against state-of-the-art unsupervised and weakly-supervised disentanglement methods on a benchmark task. Moreover, we provide evidence that we can learn meaningful disentangled representations on real-world medical time series data.

deep learning, representation, vascular disease, (19 more...)

2102.05507

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

WRSE -- a non-parametric weighted-resolution ensemble for predicting individual survival distributions in the ICU

Heitz, Jonathan, Ficek, Joanna, Faltys, Martin, Merz, Tobias M., Rätsch, Gunnar, Hüser, Matthias

arXiv.org Machine Learning, Nov-2-2020

Dynamic assessment of mortality risk in the intensive care unit (ICU) can be used to stratify patients, inform about treatment effectiveness or serve as part of an early-warning system. Static risk scoring systems, such as APACHE or SAPS, have recently been supplemented with data-driven approaches that track the dynamic mortality risk over time. Recent works have focused on enhancing the information delivered to clinicians even further by producing full survival distributions instead of point predictions or fixed horizon risks. In this work, we propose a non-parametric ensemble model, Weighted Resolution Survival Ensemble (WRSE), tailored to estimate such dynamic individual survival distributions. Inspired by the simplicity and robustness of ensemble methods, the proposed approach combines a set of binary classifiers spaced according to a decay function reflecting the relevance of short-term mortality predictions. Models and baselines are evaluated under weighted calibration and discrimination metrics for individual survival distributions which closely reflect the utility of a model in ICU practice. We show competitive results with state-of-the-art probabilistic models, while greatly reducing training time by factors of 2-9x.

base model, health & medicine, neural network, (21 more...)

2011.00865

Country:

Europe > Switzerland (0.68)
North America > United States (0.46)

Genre: Research Report > New Finding (0.94)

Industry:

Health & Medicine > Health Care Providers & Services (0.89)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation

Locatello, Francesco, Bauer, Stefan, Lucic, Mario, Rätsch, Gunnar, Gelly, Sylvain, Schölkopf, Bernhard, Bachem, Olivier

arXiv.org Machine Learning, Oct-27-2020

The idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train over 14000 models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on eight data sets. We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, different evaluation metrics do not always agree on what should be considered "disentangled" and exhibit systematic differences in the estimation. Finally, increased disentanglement does not seem to necessarily lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.

neural network, representation, survey article, (18 more...)

2010.14766

Country:

Europe > Switzerland (0.28)
Europe > Germany (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

A Commentary on the Unsupervised Learning of Disentangled Representations

Locatello, Francesco, Bauer, Stefan, Lucic, Mario, Rätsch, Gunnar, Gelly, Sylvain, Schölkopf, Bernhard, Bachem, Olivier

arXiv.org Artificial Intelligence, Jul-28-2020

The goal of the unsupervised learning of disentangled representations is to separate the independent explanatory factors of variation in the data without access to supervision. In this paper, we summarize the results of Locatello et al., 2019, and focus on their implications for practitioners. We discuss the theoretical result showing that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases and the practical challenges it entails. Finally, we comment on our experimental findings, highlighting the limitations of state-of-the-art approaches and directions for future research.

artificial intelligence, machine learning, representation, (14 more...)

2007.14184

Genre: Research Report > New Finding (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.91)

An Empirical Analysis of Domain Adaptation Algorithms for Genomic Sequence Analysis

Schweikert, Gabriele, Rätsch, Gunnar, Widmer, Christian, Schölkopf, Bernhard

Neural Information Processing Systems, Feb-15-2020, 03:14:10 GMT

We study the problem of domain transfer for a supervised classification task in mRNA splicing. We consider a number of recent domain transfer methods from machine learning, including some that are novel, and evaluate them on genomic sequence data from model organisms of varying evolutionary distance. We find that in cases where the organisms are not closely related, the use of domain adaptation methods can help improve classification performance. Papers published at the Neural Information Processing Systems Conference.

artificial intelligence, domain adaptation algorithm, health & medicine, (4 more...)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Variational PSOM: Deep Probabilistic Clustering with Self-Organizing Maps

Manduchi, Laura, Hüser, Matthias, Rätsch, Gunnar, Fortuin, Vincent

arXiv.org Machine Learning, Oct-3-2019

Generating visualizations and interpretations from high-dimensional data is a common problem in many fields. Two key approaches for tackling this problem are clustering and representation learning. There are very performant deep clustering models on the one hand and interpretable representation learning techniques, often relying on latent topological structures such as self-organizing maps, on the other hand. However, current methods do not yet successfully combine these two approaches. We present a new deep architecture for probabilistic clustering, VarPSOM, and its extension to time series data, VarTPSOM. We show that they achieve superior clustering performance compared to current deep clustering methods on static MNIST/Fashion-MNIST data as well as medical time series, while inducing an interpretable representation. Moreover, on the medical time series, VarTPSOM successfully predicts future trajectories in the original data space.

deep learning, neural network, representation, (20 more...)

1910.0159

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Deep Multiple Instance Learning for Taxonomic Classification of Metagenomic read sets

Georgiou, Andreas, Fortuin, Vincent, Mustafa, Harun, Rätsch, Gunnar

arXiv.org Machine Learning, Sep-28-2019

Metagenomic studies have increasingly utilized sequencing technologies in order to analyze DNA fragments found in environmental samples. Such analysis can provide useful insights for studying the interactions between hosts and microbes, infectious disease proliferation, and novel species discovery. One important step in this analysis is the taxonomic classification of those DNA fragments. Of particular interest is the determination of the distribution of the taxa of microbes in metagenomic samples. Recent attempts using deep learning focus on architectures that classify single DNA reads independently from each other. In this work, we attempt to solve the task of directly predicting the distribution over the taxa of whole metagenomic read sets. We formulate this task as a Multiple Instance Learning (MIL) problem. We extend architectures used in single-read taxonomic classification with two different types of permutation-invariant MIL pooling layers: a) deepsets and b) attention-based pooling. We illustrate that our architecture can exploit the co-occurrence of species in metagenomic read sets and outperforms the single-read architectures in predicting the distribution over the taxa at higher taxonomic ranks.
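The two pooling styles named in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the function names, the parameters `V` and `w`, and the use of plain mean pooling as a stand-in for a Deep Sets layer are all assumptions made for illustration. Each "bag" is a list of per-read embedding vectors, and both functions return a single vector that does not depend on the order of the reads.

```python
import math

def mean_pool(bag):
    """Deep-Sets-style pooling (assumed form): average the read embeddings."""
    n = len(bag)
    return [sum(col) / n for col in zip(*bag)]

def attention_pool(bag, V, w):
    """Attention-based MIL pooling (assumed form).

    Each read embedding h_i gets a score w . tanh(V h_i); the scores are
    turned into weights with a softmax, and the bag embedding is the
    weighted sum of the read embeddings.
    V is a hypothetical k x d matrix, w a hypothetical length-k vector.
    """
    scores = [
        sum(w_k * math.tanh(sum(v * h for v, h in zip(row, h_i)))
            for w_k, row in zip(w, V))
        for h_i in bag
    ]
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    pooled = [sum(a * h_i[j] for a, h_i in zip(weights, bag))
              for j in range(len(bag[0]))]
    return pooled, weights

# Toy bag of three 2-dimensional read embeddings (made-up numbers).
bag = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
V = [[0.5, -0.5], [1.0, 1.0]]
w = [1.0, -1.0]
bag_mean = mean_pool(bag)
bag_attn, attn_weights = attention_pool(bag, V, w)
```

Because both functions only sum over reads, shuffling the bag leaves the pooled vector unchanged up to floating-point rounding, which is the permutation invariance the abstract refers to.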

deep learning, neural network, taxonomic rank, (21 more...)

1909.13146

Country: Europe > Switzerland > Zürich > Zürich (0.15)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Multivariate Time Series Imputation with Variational Autoencoders

Fortuin, Vincent, Rätsch, Gunnar, Mandt, Stephan

arXiv.org Machine Learning, Jul-12-2019

Time series are often associated with missing values, for instance due to faulty measurement devices, partially observed states, or costly measurement procedures [15]. These missing values impair the usefulness and interpretability of the data, leading to the problem of data imputation: estimating those missing values from the observed ones [38]. Multivariate time series, consisting of multiple correlated univariate time series or channels, give rise to two distinct ways of imputing missing information: (1) by exploiting temporal correlations within each channel, and (2) by exploiting correlations across channels, for example by using lower-dimensional representations of the data. For instance, in a medical setting, if the blood pressure of a patient is unobserved, it can be informative that the heart rate at the current time is higher than normal and that the blood pressure was also elevated an hour ago. An ideal imputation model for multivariate time series should therefore take both of these sources of information into account.

deep learning, time series, vascular disease, (22 more...)

1907.04155

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Diagnostic Medicine > Vital Signs (0.54)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
