AITopics | Frellsen, Jes

not-MIWAE: Deep Generative Modelling with Missing not at Random Data

Ipsen, Niels Bruun, Mattei, Pierre-Alexandre, Frellsen, Jes

arXiv.org Machine Learning, Jun-23-2020

When a missing process depends on the missing values themselves, it needs to be explicitly modelled and taken into account while doing likelihood-based inference. We present an approach for building and fitting deep latent variable models (DLVMs) in cases where the missing process is dependent on the missing data. Specifically, a deep neural network enables us to flexibly model the conditional distribution of the missingness pattern given the data. This allows for incorporating prior information about the type of missingness (e.g. self-censoring) into the model. Our inference technique, based on importance-weighted variational inference, involves maximising a lower bound of the joint likelihood. Stochastic gradients of the bound are obtained by using the reparameterisation trick both in latent space and data space. We show on various kinds of data sets and missingness patterns that explicitly modelling the missing process can be invaluable.
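To make the self-censoring setting mentioned in the abstract concrete, here is a minimal sketch of data that is missing not at random. It is not the paper's model: the Gaussian data, the hard threshold, and the function name `simulate_self_censoring` are all assumptions chosen for illustration. It only shows why ignoring the missing process can bias an estimate.

```python
import numpy as np

def simulate_self_censoring(n=10_000, threshold=0.5, seed=0):
    """Draw Gaussian data and hide every value above `threshold`.

    Hypothetical toy setup: whether a value is missing depends on the
    value itself, which is what "missing not at random" means here.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=0.0, scale=1.0, size=n)    # complete data
    observed = x <= threshold                      # True where the value is kept
    x_incomplete = np.where(observed, x, np.nan)   # NaN marks a missing entry
    return x, x_incomplete, observed

x, x_incomplete, observed = simulate_self_censoring()
full_mean = x.mean()
observed_mean = np.nanmean(x_incomplete)  # naive estimate that ignores the missing process

print(f"mean of complete data: {full_mean:.3f}")
print(f"mean of observed data: {observed_mean:.3f}")
```

Because only large values are removed, the mean of the observed entries falls below the mean of the complete data. A model that accounts for the missing process aims to correct this kind of bias.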

deep learning, missing data, neural network, (18 more...)

arXiv.org Machine Learning

2006.12871

Country: Europe > France (0.14)

Genre: Research Report (1.00)

(q,p)-Wasserstein GANs: Comparing Ground Metrics for Wasserstein GANs

Mallasto, Anton, Frellsen, Jes, Boomsma, Wouter, Feragen, Aasa

arXiv.org Machine Learning, Feb-10-2019

Generative Adversarial Networks (GANs) have made a major impact in computer vision and machine learning as generative models. Wasserstein GANs (WGANs) brought Optimal Transport (OT) theory into GANs, by minimizing the $1$-Wasserstein distance between model and data distributions as their objective function. Since then, WGANs have gained considerable interest due to their stability and theoretical framework. We contribute to the WGAN literature by introducing the family of $(q,p)$-Wasserstein GANs, which allow the use of more general $p$-Wasserstein metrics for $p\geq 1$ in the GAN learning procedure. While the method is able to incorporate any cost function as the ground metric, we focus on studying the $l^q$ metrics for $q\geq 1$. This is a notable generalization as in the WGAN literature the OT distances are commonly based on the $l^2$ ground metric. We demonstrate the effect of different $p$-Wasserstein distances in two toy examples. Furthermore, we show that the ground metric does make a difference, by comparing different $(q,p)$ pairs on the MNIST and CIFAR-10 datasets. Our experiments demonstrate that changing the ground metric and $p$ can notably improve on the common $(q,p) = (2,1)$ case.
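As a small illustration of how the choice of $p$ changes a $p$-Wasserstein distance, the sketch below computes it for two one-dimensional empirical samples of equal size. This is not the paper's GAN procedure. It relies on the standard fact that in one dimension, for $p \geq 1$, the optimal coupling matches sorted samples. The function name `wasserstein_p_1d` is an assumption. In one dimension every $l^q$ ground metric reduces to the absolute difference, so only $p$ varies here.

```python
import numpy as np

def wasserstein_p_1d(x, y, p=1.0):
    """p-Wasserstein distance between two equal-size 1-D empirical samples.

    In one dimension the optimal transport plan for p >= 1 pairs the
    i-th smallest point of x with the i-th smallest point of y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes samples of equal size")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))

a = [0.0, 0.0]
b = [0.0, 2.0]
w1 = wasserstein_p_1d(a, b, p=1)  # averages the transport costs
w2 = wasserstein_p_1d(a, b, p=2)  # penalises the single long move more heavily
print(w1, w2)
```

In this example half of the mass stays put and half moves a distance of 2, so the $p = 2$ distance is larger than the $p = 1$ distance.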

artificial intelligence, gan, neural network, (18 more...)

arXiv.org Machine Learning

1902.03642

Country: Europe > Denmark (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation

Wiqvist, Samuel, Mattei, Pierre-Alexandre, Picchini, Umberto, Frellsen, Jes

arXiv.org Machine Learning, Jan-29-2019

We present a novel family of deep neural architectures, named partially exchangeable networks (PENs), that leverage probabilistic symmetries. By design, PENs are invariant to block-switch transformations, which characterize the partial exchangeability properties of conditionally Markovian processes. Moreover, we show that any block-switch invariant function has a PEN-like representation. The DeepSets architecture is a special case of PEN and we can therefore also target fully exchangeable data. We employ PENs to learn summary statistics in approximate Bayesian computation (ABC). When comparing PENs to previous deep learning methods for learning summary statistics, our results are highly competitive, both considering time series and static models. Indeed, PENs provide more reliable posterior samples even when using less training data.
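The fully exchangeable special case named in the abstract, DeepSets, can be sketched as an outer map applied to a sum of per-element features. The code below is an illustration under assumptions, not the authors' architecture: the weights are fixed random matrices rather than trained networks, and the names `phi`, `rho` and `deepsets_summary` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(1, 8))  # hypothetical weights of the inner map phi
W_rho = rng.normal(size=(8, 2))  # hypothetical weights of the outer map rho

def phi(x):
    """Per-element feature map, applied to each observation independently."""
    return np.tanh(x @ W_phi)

def rho(h):
    """Outer map applied to the pooled features."""
    return np.tanh(h @ W_rho)

def deepsets_summary(x):
    """Summary statistic rho(sum_i phi(x_i)) for a 1-D sample x."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    return rho(phi(x).sum(axis=0))  # sum pooling discards the ordering

sample = rng.normal(size=20)
shuffled = rng.permutation(sample)
s_original = deepsets_summary(sample)
s_shuffled = deepsets_summary(shuffled)
print(np.allclose(s_original, s_shuffled))
```

Sum pooling makes the summary independent of the order of the observations, which suits exchangeable data. Time series need the weaker block-switch invariance that the abstract attributes to PENs.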

deep learning, neural network, summary statistics, (21 more...)

arXiv.org Machine Learning

1901.10230

Country: Europe > Sweden (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.46)

Leveraging the Exact Likelihood of Deep Latent Variable Models

Mattei, Pierre-Alexandre, Frellsen, Jes

Neural Information Processing Systems, Dec-31-2018

Deep latent variable models (DLVMs) combine the approximation abilities of deep neural networks and the statistical foundations of generative models. Variational methods are commonly used for inference; however, the exact likelihood of these models has been largely overlooked. The purpose of this work is to study the general properties of this quantity and to show how they can be leveraged in practice. We focus on important inferential problems that rely on the likelihood: estimation and missing data imputation. First, we investigate maximum likelihood estimation for DLVMs: in particular, we show that most unconstrained models used for continuous data have an unbounded likelihood function. This problematic behaviour is demonstrated to be a source of mode collapse. We also show how to ensure the existence of maximum likelihood estimates, and draw useful connections with nonparametric mixture models. Finally, we describe an algorithm for missing data imputation using the exact conditional likelihood of a DLVM. On several data sets, our algorithm consistently and significantly outperforms the usual imputation scheme used for DLVMs.
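The unbounded likelihood mentioned in the abstract has a classical analogue in finite Gaussian mixtures, sketched below. This is an assumption-laden toy example, not the paper's DLVM analysis: the two-component mixture, the data values, and the function name `mixture_log_likelihood` are chosen only for illustration. One component is centred on a data point and its standard deviation is shrunk towards zero.

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def mixture_log_likelihood(data, sigma):
    """Log-likelihood of 0.5 * N(data[0], sigma^2) + 0.5 * N(0, 1).

    The first component sits exactly on the first data point, so its
    density at that point grows like 1 / sigma as sigma shrinks.
    """
    data = np.asarray(data, dtype=float)
    density = 0.5 * normal_pdf(data, data[0], sigma) + 0.5 * normal_pdf(data, 0.0, 1.0)
    return float(np.sum(np.log(density)))

data = np.array([0.3, -1.2, 0.8, 2.1, -0.5])
sigmas = (1.0, 1e-2, 1e-4, 1e-6)
log_liks = [mixture_log_likelihood(data, s) for s in sigmas]
for s, ll in zip(sigmas, log_liks):
    print(f"sigma = {s:g}: log-likelihood = {ll:.2f}")
```

The log-likelihood keeps increasing as `sigma` decreases, so no maximiser exists unless the variance is constrained. The abstract reports the same kind of problem for unconstrained DLVMs on continuous data.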

deep learning, likelihood, neural network, (18 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

missIWAE: Deep Generative Modelling and Imputation of Incomplete Data

Mattei, Pierre-Alexandre, Frellsen, Jes

arXiv.org Machine Learning, Dec-6-2018

We present a simple technique to train deep latent variable models (DLVMs) when the training set contains missing data. Our approach is based on the importance-weighted autoencoder (IWAE) of Burda et al. (2016), and also allows single or multiple imputation of the incomplete data set. We illustrate it by training a convolutional DLVM on a static binarisation of MNIST that contains 50% missing data. Leveraging multiple imputations, we train a convolutional network that classifies these incomplete digits as well as complete ones.

artificial intelligence, imputation, neural network, (19 more...)

arXiv.org Machine Learning

1812.02633

Country:

North America > United States (0.14)
North America > Canada (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Leveraging the Exact Likelihood of Deep Latent Variable Models

Mattei, Pierre-Alexandre, Frellsen, Jes

arXiv.org Machine Learning, Feb-18-2018

Deep latent variable models combine the approximation abilities of deep neural networks and the statistical foundations of generative models. The induced data distribution is an infinite mixture model whose density is extremely delicate to compute. Variational methods are consequently used for inference, following the seminal work of Rezende et al. (2014) and Kingma and Welling (2014). We study the well-posedness of the exact problem (maximum likelihood) these techniques approximately solve. In particular, we show that most unconstrained models used for continuous data have an unbounded likelihood. This ill-posedness and the problems it causes are illustrated on real data. We also show how to ensure the existence of maximum likelihood estimates, and draw useful connections with nonparametric mixture models. Furthermore, we describe an algorithm that performs missing data imputation using the exact conditional likelihood of a deep latent variable model. On several real data sets, our algorithm consistently and significantly outperforms the usual imputation scheme used within deep latent variable models.

deep learning, likelihood, neural network, (19 more...)

arXiv.org Machine Learning

1802.04826

Country:

North America > United States (0.14)
Europe (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)

Spherical convolutions and their application in molecular modelling

Boomsma, Wouter, Frellsen, Jes

Neural Information Processing Systems, Dec-31-2017

Convolutional neural networks are increasingly used outside the domain of image analysis, in particular in various areas of the natural sciences concerned with spatial data. Such networks often work out of the box, and in some cases entire model architectures from image analysis can be carried over to other problem domains almost unaltered. Unfortunately, this convenience does not trivially extend to data in non-Euclidean spaces, such as spherical data. In this paper, we introduce two strategies for conducting convolutions on the sphere, using either a spherical-polar grid or a grid based on the cubed-sphere representation. We investigate the challenges that arise in this setting, and extend our discussion to include scenarios of spherical volumes, with several strategies for parameterizing the radial dimension. As a proof of concept, we conclude with an assessment of the performance of spherical convolutions in the context of molecular modelling, by considering structural environments within proteins. We show that the models are capable of learning non-trivial functions in these molecular environments, and that our spherical convolutions generally outperform standard 3D convolutions in this setting. In particular, despite the lack of any domain-specific feature engineering, we demonstrate performance comparable to state-of-the-art methods in the field, which build on decades of domain-specific knowledge.

convolution, deep learning, neural network, (20 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.99)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

The Multivariate Generalised von Mises distribution: Inference and applications

Navarro, Alexandre K. W., Frellsen, Jes, Turner, Richard E.

arXiv.org Machine Learning, Aug-8-2017

Circular variables arise in a multitude of data-modelling contexts ranging from robotics to the social sciences, but they have been largely overlooked by the machine learning community. This paper partially redresses this imbalance by extending some standard probabilistic modelling tools to the circular domain. First, we introduce a new multivariate distribution over circular variables, called the multivariate Generalised von Mises (mGvM) distribution. This distribution can be constructed by restricting and renormalising a general multivariate Gaussian distribution to the unit hyper-torus. Previously proposed multivariate circular distributions are shown to be special cases of this construction. Second, we introduce a new probabilistic model for circular regression that is inspired by Gaussian Processes, and a method for probabilistic principal component analysis with circular hidden variables. These models can leverage standard modelling tools (e.g. covariance functions and methods for automatic relevance determination). Third, we show that the posterior distribution in these models is an mGvM distribution, which enables the development of an efficient variational free-energy scheme for performing approximate inference and approximate maximum-likelihood learning.

bayesian inference, health & medicine, mgvm, (17 more...)

arXiv.org Machine Learning

1602.05003

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Comparative Study of Inference Methods for Bayesian Nonnegative Matrix Factorisation

Brouwer, Thomas, Frellsen, Jes, Lió, Pietro

arXiv.org Machine Learning, Jul-13-2017

In this paper, we study the trade-offs of different inference approaches for Bayesian matrix factorisation methods, which are commonly used for predicting missing values, and for finding patterns in the data. In particular, we consider Bayesian nonnegative variants of matrix factorisation and tri-factorisation, and compare non-probabilistic inference, Gibbs sampling, variational Bayesian inference, and a maximum-a-posteriori approach. The variational approach is new for the Bayesian nonnegative models. We compare their convergence, and robustness to noise and sparsity of the data, on both synthetic and real-world datasets. Furthermore, we extend the models with the Bayesian automatic relevance determination prior, allowing the models to perform automatic model selection, and demonstrate its efficiency.

bayesian inference, dataset, health & medicine, (15 more...)

arXiv.org Machine Learning

1707.05147

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.94)

Technology: