Generative AI
Why Elon Musk fears artificial intelligence
Elon Musk is usually far from a technological pessimist. From electric cars to Mars colonies, he's made his name by insisting that the future can get here faster. But when it comes to artificial intelligence, he sounds very different. Speaking at MIT in 2014, he called AI humanity's "biggest existential threat" and compared it to "summoning the demon." He reiterated those fears in an interview published Friday with Recode's Kara Swisher, though with a little less apocalyptic rhetoric.
How teaching AI to be curious helps machines learn for themselves
When playing a video game, what motivates you to carry on? This question is perhaps too broad to yield a single answer, but if you had to sum up why you accept that next quest, jump into a new level, or cave and play just one more turn, the simplest explanation might be "curiosity" -- just to see what happens next. And as it turns out, curiosity is a very effective motivator when teaching AI to play video games, too. Research published this week by artificial intelligence lab OpenAI explains how an AI agent with a sense of curiosity outperformed its predecessors playing the classic 1984 Atari game Montezuma's Revenge. Becoming skilled at Montezuma's Revenge is not a milestone equivalent to beating Go or Dota 2, but it's still a notable advance.
Anomaly Detection for imbalanced datasets with Deep Generative Models
Buitrago, Nazly Rocio Santos, Tonnaer, Loek, Menkovski, Vlado, Mavroeidis, Dimitrios
Many important data analysis applications involve datasets that are severely imbalanced with respect to the target variable. A typical example is medical image analysis, where positive samples are scarce, while performance is commonly estimated against the correct detection of these positive examples. We approach this challenge by formulating the problem as anomaly detection with generative models. We train a generative model without supervision on the 'negative' (common) datapoints and use this model to estimate the likelihood of unseen data. A successful model allows us to detect the 'positive' cases as low-likelihood datapoints. In this position paper, we present the use of state-of-the-art deep generative models (GAN and VAE) for estimating the likelihood of the data. Our results show that, on the one hand, both GANs and VAEs are able to separate the 'positive' and 'negative' samples in the MNIST case. On the other hand, for the NLST case, neither GANs nor VAEs were able to capture the complexity of the data and discriminate anomalies at the level that this task requires. These results show that even though a number of successes have been reported in the literature for using generative models in similar applications, further challenges remain for broad successful implementation.
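The likelihood-thresholding idea described in this abstract can be sketched in a few lines. The example below is only an illustration under loud assumptions: a diagonal Gaussian stands in for the deep generative model (the paper uses GANs and VAEs), the data are synthetic, and the function names and the 1% quantile threshold are hypothetical choices, not taken from the paper.

```python
import numpy as np

def fit_gaussian(x):
    """Fit a diagonal Gaussian to the rows of x (stand-in for a deep generative model)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0) + 1e-6  # small floor avoids division by zero
    return mean, var

def log_likelihood(x, mean, var):
    """Per-sample log density under the fitted diagonal Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=1)

def detect_anomalies(train_negatives, test_points, quantile=0.01):
    """Flag test points whose likelihood is below a low quantile of the training likelihoods."""
    mean, var = fit_gaussian(train_negatives)
    threshold = np.quantile(log_likelihood(train_negatives, mean, var), quantile)
    return log_likelihood(test_points, mean, var) < threshold

# Synthetic data (assumption): common 'negative' points near 0, rare 'positive' points near 6.
rng = np.random.default_rng(0)
negatives = rng.normal(0.0, 1.0, size=(1000, 5))
positives = rng.normal(6.0, 1.0, size=(20, 5))

flags_pos = detect_anomalies(negatives, positives)  # boolean mask over the positives
flags_neg = detect_anomalies(negatives, negatives)  # false-alarm mask on the training data
```

The model is fitted on the common class only, so no labels are needed at training time. The quantile sets the false-alarm rate on the training data; on well-separated synthetic data like this the approach works easily, whereas the abstract reports that it fell short on the harder NLST case.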
Deep Generative Model with Beta Bernoulli Process for Modeling and Learning Confounding Factors
Gyawali, Prashnna K, Knight, Cameron, Ghimire, Sandesh, Horacek, B. Milan, Sapp, John L., Wang, Linwei
While deep representation learning has become increasingly capable of separating task-relevant representations from other confounding factors in the data, two significant challenges remain. First, there is often an unknown and potentially infinite number of confounding factors coinciding in the data. Second, not all of these factors are readily observable. In this paper, we present a deep conditional generative model that learns to disentangle a task-relevant representation from an unknown number of confounding factors that may grow infinitely. This is achieved by marrying the representational power of deep generative models with Bayesian non-parametric factor models, where a supervised deterministic encoder learns the task-related representation and a probabilistic encoder with an Indian Buffet Process (IBP) learns the unknown number of unobservable confounding factors. We tested the presented model on two datasets: a handwritten digit dataset (MNIST) augmented with colored digits and a clinical ECG dataset with significant inter-subject variations and augmented with signal artifacts. These diverse datasets highlighted the ability of the presented model to grow with the complexity of the data and identify the absence or presence of unobserved confounding factors.
Semi-unsupervised Learning of Human Activity using Deep Generative Models
Willetts, Matthew, Doherty, Aiden, Roberts, Stephen, Holmes, Chris
Here we demonstrate a new deep generative model for classification. We introduce 'semi-unsupervised learning', a problem regime related to transfer learning and zero/few-shot learning where, in the training data, some classes are sparsely labelled and others entirely unlabelled. Models able to learn from training data of this type are potentially of great use, as many medical datasets are 'semi-unsupervised'. Our model demonstrates superior semi-unsupervised classification performance on MNIST to model M2 from Kingma and Welling (2014). We apply the model to human accelerometer data, performing activity classification and structure discovery on windows of time series data.
Semi-crowdsourced Clustering with Deep Generative Models
Luo, Yucen, Tian, Tian, Shi, Jiaxin, Zhu, Jun, Zhang, Bo
We consider the semi-supervised clustering problem where crowdsourcing provides noisy information about the pairwise comparisons on a small subset of data, i.e., whether a sample pair is in the same cluster. We propose a new approach that includes a deep generative model (DGM) to characterize low-level features of the data, and a statistical relational model for noisy pairwise annotations on its subset. The two parts share the latent variables. To make the model automatically trade off its complexity against its fit to the data, we also develop its fully Bayesian variant. The challenge of inference is addressed by fast (natural-gradient) stochastic variational inference algorithms, where we effectively combine variational message passing for the relational part and amortized learning of the DGM under a unified framework. Empirical results on synthetic and real-world datasets show that our model outperforms previous crowdsourced clustering methods.
Learning and Inferring Movement with Deep Generative Model
Jing, Mingxuan, Ma, Xiaojian, Sun, Fuchun, Liu, Huaping
Learning and inferring movement is a very challenging problem due to its high dimensionality and its dependency on varied environments or tasks. In this paper, we propose an effective probabilistic method for learning and inference of basic movements. The motion planning problem is formulated as learning on a directed graphical model, and a deep generative model is used to perform learning and inference from demonstrations. An important characteristic of this method is that it flexibly incorporates task descriptors and context information for long-term planning, and it can be combined with dynamic systems for robot control. Experimental validation on robotic approaching path planning tasks shows advantages over the baseline methods with limited training data.
Do Deep Generative Models Know What They Don't Know?
Nalisnick, Eric, Matsukawa, Akihiro, Teh, Yee Whye, Gorur, Dilan, Lakshminarayanan, Balaji
A neural network deployed in the wild may be asked to make predictions for inputs that were drawn from a different distribution than that of the training data. A plethora of work has demonstrated that it is easy to find or synthesize inputs for which a neural network is highly confident yet wrong. Generative models are widely viewed to be robust to such mistaken confidence as modeling the density of the input features can be used to detect novel, out-of-distribution inputs. In this paper we challenge this assumption. We find that the model density from flow-based models, VAEs and PixelCNN cannot distinguish images of common objects such as dogs, trucks, and horses (i.e. CIFAR-10) from those of house numbers (i.e. SVHN), assigning a higher likelihood to the latter when the model is trained on the former. We focus our analysis on flow-based generative models in particular since they are trained and evaluated via the exact marginal likelihood. We find such behavior persists even when we restrict the flow models to constant-volume transformations. These transformations admit some theoretical analysis, and we show that the difference in likelihoods can be explained by the location and variances of the data and the model curvature, which shows that such behavior is more general and not just restricted to the pairs of datasets used in our experiments. Our results caution against using the density estimates from deep generative models to identify inputs similar to the training distribution, until their behavior on out-of-distribution inputs is better understood.
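The role that the abstract assigns to the location and variance of the data can be seen in a toy setting. The sketch below is not the paper's analysis or its models: it assumes a diagonal Gaussian as the density model and uses synthetic data, with the dimension and the standard deviations chosen purely for illustration. It shows that inputs centred on the training mean but with smaller variance receive a higher average log-likelihood than held-out data from the training distribution itself.

```python
import numpy as np

def mean_log_likelihood(x, mean, var):
    """Average per-sample log density under a diagonal Gaussian."""
    per_sample = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=1)
    return float(np.mean(per_sample))

rng = np.random.default_rng(0)
dim = 100  # hypothetical input dimensionality

# Assumption: the training distribution is broad, the other one is narrow with the same centre.
train = rng.normal(0.0, 1.0, size=(5000, dim))
in_dist = rng.normal(0.0, 1.0, size=(5000, dim))   # held-out data from the training distribution
out_dist = rng.normal(0.0, 0.3, size=(5000, dim))  # out-of-distribution, lower variance

# "Train" the density model by fitting the Gaussian to the training set.
mean, var = train.mean(axis=0), train.var(axis=0)

ll_in = mean_log_likelihood(in_dist, mean, var)
ll_out = mean_log_likelihood(out_dist, mean, var)
print(f"in-distribution:     {ll_in:.1f}")
print(f"out-of-distribution: {ll_out:.1f}")
```

Under this Gaussian the expected log-likelihood falls as the squared distance of the inputs from the model mean grows, so low-variance inputs near the mean score higher even though the model never saw them. A threshold on likelihood alone would therefore accept them as typical.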
Video games, not killer robots, might hold the future of AI
Most of the games that machines can now challenge humans in are strategic, but slow: Chess, Go and poker, unless played in very specific settings, have no time constraints on player moves. That is what has made the work of research group OpenAI, in online team brawler Dota 2 - which requires real-time decision-making between potentially dozens of choices in a single frame - so different. OpenAI's bots, the OpenAI Five, went head-to-head against teams of professional players at Dota 2's annual championship, The International, this August. Although the bots lost, the matches provided an insight into how reinforcement learning is changing the game when it comes to artificial intelligence. It's safe to say that AI has a reputation in gaming: many players consider a match to be an instant loss if they have to play with a bot, and a disconnect is often accompanied by "GG".
Spurious samples in deep generative models: bug or feature?
Kégl, Balázs, Cherti, Mehdi, Kazakçı, Akın
Traditional wisdom in the generative modeling literature is that spurious samples that a model can generate are errors and should be avoided. Recent research, however, has shown interest in studying or even exploiting such samples instead of eliminating them. In this paper, we ask whether such samples can be eliminated altogether without sacrificing coverage of the generating distribution. For the class of models we consider, we experimentally demonstrate that this is not possible without losing the ability to model some of the test samples. While our results need to be confirmed on a broader set of model families, these initial findings provide partial evidence that spurious samples share structural properties with the learned dataset, which, in turn, suggests they are not simply errors but a feature of deep generative nets.