 Generative AI


Deep Generative Models with Learnable Knowledge Constraints

Neural Information Processing Systems

The broad set of deep generative models (DGMs) has achieved remarkable advances. However, it is often difficult to incorporate rich structured domain knowledge into end-to-end DGMs. Posterior regularization (PR) offers a principled framework to impose structured constraints on probabilistic models, but has limited applicability to the diverse DGMs that can lack a Bayesian formulation or even explicit density evaluation. PR also requires constraints to be fully specified a priori, which is impractical or suboptimal for complex knowledge with learnable uncertain parts. In this paper, we establish a mathematical correspondence between PR and reinforcement learning (RL), and, based on the connection, expand PR to learn constraints as the extrinsic reward in RL. The resulting algorithm is model-agnostic, so it applies to any DGM, and is flexible enough to adapt arbitrary constraints jointly with the model. Experiments on human image generation and templated sentence generation show that models with knowledge constraints learned by our algorithm greatly improve over base generative models.


Deep Generative Models for Distribution-Preserving Lossy Compression

Neural Information Processing Systems

We propose and study the problem of distribution-preserving lossy compression. Motivated by recent advances in extreme image compression, which make it possible to maintain artifact-free reconstructions even at very low bitrates, we propose to optimize the rate-distortion tradeoff under the constraint that the reconstructed samples follow the distribution of the training data. The resulting compression system recovers both ends of the spectrum: at zero bitrate it learns a generative model of the data, and at high enough bitrates it achieves perfect reconstruction. Furthermore, for intermediate bitrates it smoothly interpolates between learning a generative model of the training data and perfectly reconstructing the training samples. We study several methods to approximately solve the proposed optimization problem, including a novel combination of Wasserstein GAN and Wasserstein Autoencoder, and present an extensive theoretical and empirical characterization of the proposed compression systems.
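
The constrained problem described in this abstract can be written down roughly as follows. The notation is an assumption made here for illustration, not taken from the listing: E is an encoder that emits at most R bits, G is a (possibly stochastic) decoder, d is a distortion measure, and P_X is the data distribution.

```latex
\min_{E,\,G}\; \mathbb{E}_{X \sim P_X}\!\left[ d\big(X,\, G(E(X))\big) \right]
\quad \text{subject to} \quad G(E(X)) \sim P_X ,
\qquad E(X) \in \{0,1\}^{R}
```

Under this reading, at R = 0 the code carries no information about X, so the constraint alone forces G to sample from P_X, which is a generative model. At large R the reconstruction can match X exactly, which satisfies the constraint trivially.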


Bias and Generalization in Deep Generative Models: An Empirical Study

Neural Information Processing Systems

In high dimensional settings, density estimation algorithms rely crucially on their inductive bias. Despite recent empirical success, the inductive bias of deep generative models is not well understood. In this paper we propose a framework to systematically investigate bias and generalization in deep generative models of images by probing the learning algorithm with carefully designed training datasets. By measuring properties of the learned distribution, we are able to find interesting patterns of generalization. We verify that these patterns are consistent across datasets, common models and architectures.


Flexible and accurate inference and learning for deep generative models

Neural Information Processing Systems

We introduce a new approach to learning in hierarchical latent-variable generative models called the “distributed distributional code Helmholtz machine”, which emphasises flexibility and accuracy in the inferential process. Like the original Helmholtz machine and later variational autoencoder algorithms (but unlike adversarial methods) our approach learns an explicit inference or “recognition” model to approximate the posterior distribution over the latent variables. Unlike these earlier methods, it employs a posterior representation that is not limited to a narrow tractable parametrised form (nor is it represented by samples). To train the generative and recognition models we develop an extended wake-sleep algorithm inspired by the original Helmholtz machine. This makes it possible to learn hierarchical latent models with both discrete and continuous variables, where an accurate posterior representation is essential. We demonstrate that the new algorithm outperforms current state-of-the-art methods on synthetic, natural image patch and the MNIST data sets.


Semi-crowdsourced Clustering with Deep Generative Models

Neural Information Processing Systems

We consider the semi-supervised clustering problem where crowdsourcing provides noisy information about the pairwise comparisons on a small subset of data, i.e., whether a sample pair is in the same cluster. We propose a new approach that includes a deep generative model (DGM) to characterize low-level features of the data, and a statistical relational model for noisy pairwise annotations on its subset. The two parts share the latent variables. To make the model automatically trade off between its complexity and its fit to the data, we also develop its fully Bayesian variant. The challenge of inference is addressed by fast (natural-gradient) stochastic variational inference algorithms, where we effectively combine variational message passing for the relational part and amortized learning of the DGM under a unified framework. Empirical results on synthetic and real-world datasets show that our model outperforms previous crowdsourced clustering methods.


2019 Preview: AI to best humans at one of world's most complex games

New Scientist

Gamers everywhere were watching as OpenAI, an artificial intelligence lab co-founded by Elon Musk, pitted a team of bots against some of the world's best Dota 2 players at an annual tournament back in June. Machines had been on a winning streak. In 2016, DeepMind's AI mastered Go. In 2017, a poker-playing bot called Libratus, developed by a team at Carnegie Mellon University in Pennsylvania, won a professional Heads-Up No-Limit Texas Hold'Em tournament.


OpenAI Founder: Short-Term AGI Is a Serious Possibility

#artificialintelligence

Artificial general intelligence (AGI) is the long-range, human-intelligence-level target of contemporary AI technology. It's believed AGI has the potential to meet basic human needs globally, end poverty, cure diseases, extend life, and even mitigate climate change. In short, AGI is the tech that could not only save the world, but build a utopia. While many AI experts believe AGI is still a far-fetched fantasy unachievable with existing tech, Ilya Sutskever, founder and research director of OpenAI, has a decidedly different point of view. In his keynote speech last Friday at the AI Frontiers Conference, Sutskever said "We (OpenAI) have reviewed progress in the field over the past six years. Our conclusion is near term AGI should be taken as a serious possibility."


Latent Variable Modeling for Generative Concept Representations and Deep Generative Models

arXiv.org Machine Learning

Latent representations are the essence of deep generative models and determine their usefulness and power. For latent representations to be useful as generative concept representations, their latent space must support latent space interpolation, attribute vectors and concept vectors, among other things. We investigate and discuss latent variable modeling, including latent variable models, latent representations and latent spaces, particularly hierarchical latent representations and latent space vectors and geometry. Our focus is on the latent variable modeling used in variational autoencoders and generative adversarial networks.


Fast Approximate Geodesics for Deep Generative Models

arXiv.org Machine Learning

The length of the geodesic between two data points along the Riemannian manifold induced by a deep generative model yields a principled measure of similarity. Applications have so far been limited to low-dimensional latent spaces, as the method is computationally demanding: it amounts to solving a non-convex optimisation problem. Our approach is to tackle a relaxation: finding shortest paths in a finite graph of samples from the aggregate approximate posterior can be solved exactly, at greatly reduced runtime, and without notable loss in quality. The method is hence applicable to high-dimensional problems in the visual domain. We validate the approach empirically on a series of experiments using variational autoencoders applied to image data, tackling the Chair, Faces and FashionMNIST data sets.
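
The graph relaxation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hypothetical `decode` function standing in for a trained decoder, builds a k-nearest-neighbour graph over latent samples, weights each edge by the Euclidean distance between the decoded points (a crude stand-in for the Riemannian length of that edge), and then runs Dijkstra's algorithm. The function names and the choice of edge weight are assumptions made for illustration.

```python
import heapq
import math


def euclid(a, b):
    """Euclidean distance between two equal-length sequences of floats."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_knn_graph(latents, decode, k):
    """Build an undirected k-nearest-neighbour graph over latent samples.

    Neighbours are chosen by distance in latent space. Each edge weight is
    the distance between the decoded points, an assumed approximation of
    the length of that edge on the manifold induced by the decoder.
    """
    decoded = [decode(z) for z in latents]
    graph = {i: {} for i in range(len(latents))}
    for i, zi in enumerate(latents):
        others = [j for j in range(len(latents)) if j != i]
        nearest = sorted(others, key=lambda j: euclid(zi, latents[j]))[:k]
        for j in nearest:
            weight = euclid(decoded[i], decoded[j])
            graph[i][j] = weight
            graph[j][i] = weight
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf
```

In use, `latents` would be samples drawn from the encoder over the data set, and `shortest_path_length(graph, i, j)` would serve as the approximate geodesic distance between samples `i` and `j`. The graph search is exact on the finite graph, so the only approximation lies in how well the graph and its edge weights cover the manifold.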