
Collaborating Authors: Sutton, Charles


Ratio Matching MMD Nets: Low dimensional projections for effective deep generative models

arXiv.org Machine Learning

Deep generative models can learn to generate realistic-looking images on several natural image datasets, but many of the most effective methods are adversarial methods, which require careful balancing of training between a generator network and a discriminator network. Maximum mean discrepancy networks (MMD-nets) avoid this issue using the kernel trick, but unfortunately they have not on their own been able to match the performance of adversarial training. We present a new method of training MMD-nets, based on learning a mapping of samples from the data and from the model into a lower-dimensional space, in which MMD training can be more effective. We call these networks ratio matching MMD networks (RM-MMDnets). We train the mapping so that the density ratio between the data and model distributions is preserved between the low-dimensional space and the original space. This ensures that matching the model distribution to the data in the low-dimensional space will also match the original distributions. We show that RM-MMDnets have better performance and better stability than recent adversarial methods for training MMD-nets.
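
The core idea is an MMD criterion evaluated on low-dimensional projections of data and model samples. Below is a minimal, illustrative PyTorch sketch of that idea, assuming a Gaussian kernel and toy fully connected networks; the layer sizes, variable names, and the omission of the ratio-matching penalty are assumptions for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn

def gaussian_kernel(x, y, bandwidth=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)), computed pairwise
    d2 = torch.cdist(x, y) ** 2
    return torch.exp(-d2 / (2 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    # Biased estimator of the squared maximum mean discrepancy between two samples
    return (gaussian_kernel(x, x, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean()
            - 2 * gaussian_kernel(x, y, bandwidth).mean())

# Toy generator and low-dimensional projection networks; sizes are illustrative.
generator = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))
projector = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 16))

real = torch.randn(64, 784)              # stand-in for a batch of data
fake = generator(torch.randn(64, 32))

# The generator is trained to match the data in the projected 16-dimensional
# space; in the full method the projector itself is trained with a separate
# ratio-matching criterion, which is omitted from this sketch.
loss = mmd2(projector(real), projector(fake))
loss.backward()
```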


Variational Inference In Pachinko Allocation Machines

arXiv.org Machine Learning

The Pachinko Allocation Machine (PAM) is a deep topic model that represents rich correlation structure among topics via a directed acyclic graph over topics. Because of the flexibility of the model, however, approximate inference is very difficult. Perhaps for this reason, only a small number of potential PAM architectures have been explored in the literature. In this paper we present an efficient and flexible amortized variational inference method for PAM, using a deep inference network to parameterize the approximate posterior distribution in a manner similar to the variational autoencoder. Our inference method produces more coherent topics than state-of-the-art inference methods for PAM while being an order of magnitude faster, which allows exploration of a wider range of PAM architectures than have previously been studied.
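
To make the amortization concrete, here is a minimal sketch of an inference network that maps a document's bag-of-words vector to the parameters of an approximate posterior, as in a VAE. The Gaussian posterior, the layer sizes, and all names are illustrative assumptions; the paper's inference network for PAM is more structured than this.

```python
import torch
import torch.nn as nn

class AmortizedEncoder(nn.Module):
    """Maps a bag-of-words vector to parameters of an approximate posterior.

    A diagonal Gaussian posterior and these layer sizes are assumptions made
    for illustration only.
    """
    def __init__(self, vocab_size=2000, hidden=256, latent=50):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(vocab_size, hidden), nn.Softplus())
        self.mean = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)

    def forward(self, bow):
        h = self.body(bow)
        mu, logvar = self.mean(h), self.logvar(h)
        # Reparameterization trick: one cheap feed-forward pass per document
        # replaces per-document variational optimization.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar

encoder = AmortizedEncoder()
docs = torch.rand(8, 2000)            # stand-in for 8 documents' word counts
z, mu, logvar = encoder(docs)
kl = 0.5 * torch.sum(mu ** 2 + logvar.exp() - logvar - 1, dim=1)  # KL to N(0, I)
```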


Synthesis of Differentiable Functional Programs for Lifelong Learning

arXiv.org Machine Learning

We present a neurosymbolic approach to the lifelong learning of algorithmic tasks that mix perception and procedural reasoning. Reusing high-level concepts across domains and learning complex procedures are two key challenges in lifelong learning. We show that a combination of gradient-based learning and symbolic program synthesis can be a more effective response to these challenges than purely neural methods. Concretely, our approach, called HOUDINI, represents neural networks as strongly typed, end-to-end differentiable functional programs that use symbolic higher-order combinators to compose a library of neural functions. Our learning algorithm consists of: (1) a program synthesizer that performs a type-directed search over programs in this language and decides which library functions should be reused and which architectures should be used to combine them; and (2) a neural module that trains synthesized programs using stochastic gradient descent. We evaluate our approach on three algorithmic tasks. Our experiments show that our type-directed search technique is able to significantly prune the search space of programs, and that the overall approach transfers high-level concepts more effectively than monolithic neural networks as well as traditional transfer learning.
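
The following sketch illustrates the general flavor of composing a library of neural functions with higher-order combinators into a single differentiable program; the combinator names, module shapes, and the candidate program are illustrative assumptions, not HOUDINI's language or API.

```python
import torch
import torch.nn as nn

# Two higher-order combinators over differentiable functions.
def compose(f, g):
    return lambda x: g(f(x))

def map_seq(f):
    # Apply f to every element of a sequence (batch, length, features).
    return lambda xs: torch.stack([f(x) for x in xs.unbind(dim=1)], dim=1)

# A tiny "library" of neural functions a synthesizer could choose from.
perceive = nn.Sequential(nn.Linear(64, 32), nn.ReLU())     # e.g. a feature extractor
classify = nn.Linear(32, 10)                                # e.g. a task-specific head

# One candidate program a type-directed search might propose:
# map(perceive) then map(classify), i.e. classify each item in a sequence.
program = compose(map_seq(perceive), map_seq(classify))

xs = torch.randn(4, 7, 64)            # batch of 4 sequences of 7 items
logits = program(xs)                  # shape (4, 7, 10)

# The synthesized program is end-to-end differentiable, so its neural modules
# can be trained with ordinary stochastic gradient descent.
loss = logits.sum()
loss.backward()
```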


Interpreting Deep Classifier by Visual Distillation of Dark Knowledge

arXiv.org Machine Learning

Interpreting black-box classifiers, such as deep networks, allows an analyst to validate a classifier before it is deployed in a high-stakes setting. A natural idea is to visualize the deep network's representations, so as to "see what the network sees". In this paper, we demonstrate that standard dimension reduction methods in this setting can yield uninformative or even misleading visualizations. Instead, we present DarkSight, which visually summarizes the predictions of a classifier in a way inspired by the notion of dark knowledge. DarkSight embeds the data points into a low-dimensional space such that it is easy to compress the deep classifier into a simpler one, essentially combining model compression and dimension reduction. We compare DarkSight against t-SNE both qualitatively and quantitatively, demonstrating that DarkSight visualizations are more informative. Our method additionally yields a new confidence measure based on dark knowledge by quantifying how unusual a given vector of predictions is.
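
As a rough illustration of combining model compression with dimension reduction, the sketch below jointly learns a 2-D embedding per data point and a simple student classifier that must reproduce the teacher's soft predictions from that embedding alone. The linear-softmax student, the synthetic teacher outputs, and all names are stand-ins chosen for brevity, not the paper's exact objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Teacher soft predictions for n points over k classes (synthetic stand-ins
# for a deep classifier's outputs on a dataset).
n, k = 500, 10
teacher_probs = F.softmax(torch.randn(n, k), dim=1)

# Learnable 2-D embedding per data point, plus a simple student classifier.
embedding = nn.Parameter(torch.randn(n, 2) * 0.01)
student = nn.Linear(2, k)

optimizer = torch.optim.Adam([embedding, *student.parameters()], lr=0.1)
for _ in range(200):
    optimizer.zero_grad()
    student_logprobs = F.log_softmax(student(embedding), dim=1)
    # KL(teacher || student): points are arranged so that dark knowledge
    # (the full vector of soft predictions) is preserved, not just the argmax.
    loss = F.kl_div(student_logprobs, teacher_probs, reduction="batchmean")
    loss.backward()
    optimizer.step()

# "embedding" can now be scatter-plotted to visualize the classifier.
```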


Sequence-to-Point Learning With Neural Networks for Non-Intrusive Load Monitoring

AAAI Conferences

Energy disaggregation (a.k.a. non-intrusive load monitoring, NILM), a single-channel blind source separation problem, aims to decompose the mains signal, which records the whole-house electricity consumption, into appliance-wise readings. This problem is difficult because it is inherently unidentifiable. Recent approaches have shown that the identifiability problem can be reduced by introducing domain knowledge into the model. Deep neural networks have been shown to be a promising approach for these problems, but sliding windows are necessary to handle the long sequences which arise in signal processing problems, which raises issues about how to combine predictions from different sliding windows. In this paper, we propose sequence-to-point learning, where the input is a window of the mains and the output is a single point of the target appliance. We use convolutional neural networks to train the model. Interestingly, we systematically show that the convolutional neural networks can inherently learn the signatures of the target appliances, which are automatically added into the model to reduce the identifiability problem. We apply the proposed neural network approaches to real-world household energy data, and show that the methods achieve state-of-the-art performance, improving two standard error measures by 84% and 92%.
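
A minimal sketch of the sequence-to-point setup in PyTorch follows: a 1-D convolutional network maps one sliding window of the mains to a single appliance reading. The window length, filter counts, and depth here are assumptions for illustration; the published architecture differs in its exact layer configuration.

```python
import torch
import torch.nn as nn

class Seq2Point(nn.Module):
    """Map a window of the mains signal to one point of the target appliance."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 30, kernel_size=10), nn.ReLU(),   # 599 -> 590
            nn.Conv1d(30, 40, kernel_size=8), nn.ReLU(),   # 590 -> 583
            nn.Conv1d(40, 50, kernel_size=6), nn.ReLU(),   # 583 -> 578
            nn.Flatten(),
            nn.Linear(50 * 578, 1024), nn.ReLU(),
            nn.Linear(1024, 1),   # one value: the appliance reading for this window
        )

    def forward(self, mains_window):
        # mains_window: (batch, 1, 599)
        return self.net(mains_window)

model = Seq2Point()
mains = torch.randn(16, 1, 599)           # 16 sliding windows of the mains
prediction = model(mains)                 # (16, 1), one point per window
loss = nn.functional.mse_loss(prediction, torch.randn(16, 1))
loss.backward()
```

Because each window produces a single output point, no scheme for reconciling overlapping window predictions is needed at training time.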


VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning

Neural Information Processing Systems (also listed under arXiv.org Machine Learning)

Deep generative models provide powerful tools for modelling distributions over complicated manifolds, such as those of natural images. But many of these methods, including generative adversarial networks (GANs), can be difficult to train, in part because they are prone to mode collapse, which means that they characterize only a few modes of the true distribution. To address this, we introduce VEEGAN, which features a reconstructor network, reversing the action of the generator by mapping from data to noise. Our training objective retains the original asymptotic consistency guarantee of GANs, and can be interpreted as a novel autoencoder loss over the noise. In sharp contrast to a traditional autoencoder over data points, VEEGAN does not require specifying a loss function over the data, but rather only over the representations, which are standard normal by assumption. On an extensive set of synthetic and real-world image datasets, VEEGAN indeed resists mode collapse to a far greater extent than other recent GAN variants, and produces more realistic samples.
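
The sketch below shows the central piece described in the abstract: a reconstructor that maps generated data back to noise, trained with an autoencoder loss in noise space rather than data space. Network sizes and names are illustrative assumptions, and the adversarial term of the full objective is omitted.

```python
import torch
import torch.nn as nn

latent, data_dim = 32, 784

# Generator maps noise to data; the reconstructor reverses it, mapping data
# back to noise. Layer sizes are illustrative.
generator = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(), nn.Linear(256, data_dim))
reconstructor = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, latent))

z = torch.randn(64, latent)
x_fake = generator(z)

# Autoencoder loss in noise space: reconstruct z from the generated sample.
# Because z is standard normal by construction, no loss over data space is needed.
noise_recon_loss = ((z - reconstructor(x_fake)) ** 2).mean()

# The full VEEGAN objective adds a discriminator-based term that pushes
# reconstructor(real data) toward the same standard normal distribution;
# that adversarial piece is not shown in this sketch.
noise_recon_loss.backward()
```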




Learning Continuous Semantic Representations of Symbolic Expressions

arXiv.org Artificial Intelligence

Combining abstract, symbolic reasoning with continuous neural reasoning is a grand challenge of representation learning. As a step in this direction, we propose a new architecture, called neural equivalence networks, for the problem of learning continuous semantic representations of algebraic and logical expressions. These networks are trained to represent semantic equivalence, even of expressions that are syntactically very different. The challenge is that semantic representations must be computed in a syntax-directed manner, because semantics is compositional, but at the same time small changes in syntax can lead to very large changes in semantics, which can be difficult for continuous neural architectures. We perform an exhaustive evaluation on the task of checking equivalence over a highly diverse class of symbolic algebraic and Boolean expression types, showing that our model significantly outperforms existing architectures.
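
To make "syntax-directed" concrete, here is a plain recursive (TreeNN-style) sketch that computes an expression's representation bottom-up over its syntax tree, with one embedding per leaf symbol and one combiner per operator. This is a simplified stand-in, not the exact neural equivalence network architecture, which adds further components such as normalization; all names here are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

DIM = 64

leaf_embed = nn.Embedding(num_embeddings=3, embedding_dim=DIM)   # symbols: a, b, c
combiners = nn.ModuleDict({
    "and": nn.Linear(2 * DIM, DIM),
    "or":  nn.Linear(2 * DIM, DIM),
    "not": nn.Linear(DIM, DIM),
})
SYMBOLS = {"a": 0, "b": 1, "c": 2}

def encode(expr):
    """Compute a representation bottom-up over the syntax tree."""
    if isinstance(expr, str):                        # leaf: a variable
        return leaf_embed(torch.tensor(SYMBOLS[expr]))
    op, *args = expr                                 # internal node: (op, child, ...)
    children = torch.cat([encode(a) for a in args], dim=-1)
    return torch.tanh(combiners[op](children))

# Syntactically different but semantically equivalent Boolean expressions:
e1 = ("or", "a", "b")
e2 = ("not", ("and", ("not", "a"), ("not", "b")))    # De Morgan's law

# Training would push representations of equivalent expressions together
# (e.g. high cosine similarity) and non-equivalent ones apart.
similarity = F.cosine_similarity(encode(e1), encode(e2), dim=0)
```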


Autoencoding Variational Inference For Topic Models

arXiv.org Machine Learning

Topic models are one of the most popular methods for learning representations of text, but a major challenge is that any change to the topic model requires mathematically deriving a new inference algorithm. A promising approach to address this problem is autoencoding variational Bayes (AEVB), but it has proven difficult to apply to topic models in practice. We present what is to our knowledge the first effective AEVB-based inference method for latent Dirichlet allocation (LDA), which we call Autoencoded Variational Inference For Topic Models (AVITM). This model tackles the problems caused for AEVB by the Dirichlet prior and by component collapsing. We find that AVITM matches traditional methods in accuracy with much better inference time. Indeed, because of the inference network, we find that it is unnecessary to pay the computational cost of running variational optimization on test data. Because AVITM is a black-box method, it is readily applied to new topic models. As a dramatic illustration of this, we present a new topic model called ProdLDA, which replaces the mixture model in LDA with a product of experts. By changing only one line of code from LDA, we find that ProdLDA yields much more interpretable topics, even if LDA is trained via collapsed Gibbs sampling.
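
The decoder-side contrast between LDA and ProdLDA can be written in a couple of lines; the schematic sketch below is an illustration of that contrast under assumed variable names and dimensions, not the released implementation.

```python
import torch
import torch.nn.functional as F

vocab, topics, batch = 2000, 50, 8

theta = F.softmax(torch.randn(batch, topics), dim=1)   # document-topic proportions
beta = torch.randn(topics, vocab)                      # unnormalized topic-word weights

# LDA-style decoder: mix normalized topic-word distributions.
p_lda = theta @ F.softmax(beta, dim=1)                 # (batch, vocab), rows sum to 1

# ProdLDA decoder: mix in logit space, then normalize -- a product of experts.
# This is the kind of "one line" change the abstract refers to.
p_prodlda = F.softmax(theta @ beta, dim=1)             # (batch, vocab)

# Reconstruction term of the ELBO for a bag-of-words batch x (word counts):
x = torch.randint(0, 5, (batch, vocab)).float()
rec_ll = (x * torch.log(p_prodlda + 1e-10)).sum(dim=1)
```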


A Subsequence Interleaving Model for Sequential Pattern Mining

arXiv.org Machine Learning

Recent sequential pattern mining methods have used the minimum description length (MDL) principle to define an encoding scheme which describes an algorithm for mining the most compressing patterns in a database. We present a novel subsequence interleaving model based on a probabilistic model of the sequence database, which allows us to search for the most compressing set of patterns without designing a specific encoding scheme. Our proposed algorithm is able to efficiently mine the most relevant sequential patterns and rank them using an associated measure of interestingness. The efficient inference in our model is a direct result of our use of a structural expectation-maximization framework, in which the expectation step takes the form of a submodular optimization problem subject to a coverage constraint. We show on both synthetic and real-world datasets that our model mines a set of sequential patterns with low spuriousness and redundancy, high interpretability, and usefulness in real-world applications. Furthermore, we demonstrate that the quality of the patterns from our approach is comparable to, if not better than, that of existing state-of-the-art sequential pattern mining algorithms.
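
The flavor of a coverage-constrained submodular step can be conveyed with a standard greedy selection routine, sketched below. This is a generic illustration: the paper's expectation step scores patterns by how much of the database they explain under the probabilistic model, not by the raw position counts used in this toy.

```python
def greedy_cover(sequence_positions, candidate_patterns):
    """Greedily choose patterns for a coverage-constrained objective (toy version)."""
    uncovered = set(sequence_positions)
    chosen = []
    while uncovered:
        # Pick the pattern that covers the most still-uncovered positions.
        best = max(candidate_patterns, key=lambda p: len(uncovered & p["covers"]))
        if not uncovered & best["covers"]:   # nothing left that any pattern can cover
            break
        chosen.append(best["name"])
        uncovered -= best["covers"]
    return chosen

# Toy example: positions 0..9 of a sequence and three candidate patterns.
patterns = [
    {"name": "A B C", "covers": {0, 1, 2, 5, 6, 7}},
    {"name": "B C",   "covers": {1, 2, 6, 7}},
    {"name": "D",     "covers": {3, 8}},
]
print(greedy_cover(range(10), patterns))   # ['A B C', 'D']; positions 4 and 9 stay uncovered
```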