AITopics | Coates, Adam

Collaborating Authors

Coates, Adam

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Principled Hybrids of Generative and Discriminative Domain Adaptation

Zhao, Han, Zhu, Zhenyao, Hu, Junjie, Coates, Adam, Gordon, Geoff

arXiv.org Artificial IntelligenceOct-27-2017

We propose a probabilistic framework for domain adaptation that blends both generative and discriminative modeling in a principled way. Under this framework, generative and discriminative models correspond to specific choices of the prior over parameters. This provides us a very general way to interpolate between generative and discriminative extremes through different choices of priors. By maximizing both the marginal and the conditional log-likelihoods, models derived from this framework can use both labeled instances from the source domain as well as unlabeled instances from both source and target domains. Under this framework, we show that the popular reconstruction loss of autoencoder corresponds to an upper bound of the negative marginal log-likelihoods of unlabeled instances, where marginal distributions are given by proper kernel density estimations. This provides a way to interpret the empirical success of autoencoders in domain adaptation and semi-supervised learning. We instantiate our framework using neural networks, and build a concrete model, DAuto. Empirically, we demonstrate the effectiveness of DAuto on text, image and speech datasets, showing that it outperforms related competitors when domain adaptation is possible.

artificial intelligence, deep learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

1705.09011

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California (0.14)

Industry: Government (0.95)

Add feedback

Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting

Arik, Sercan O., Kliegl, Markus, Child, Rewon, Hestness, Joel, Gibiansky, Andrew, Fougner, Chris, Prenger, Ryan, Coates, Adam

arXiv.org Artificial IntelligenceJul-4-2017

Keyword spotting (KWS) constitutes a major component of human-technology interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate, while minimizing the footprint size, latency and complexity are the goals for KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks (CRNNs). Inspired by large-scale state-of-the-art speech recognition systems, we combine the strengths of convolutional layers and recurrent layers to exploit local structure and long-range context. We analyze the effect of architecture parameters, and propose training strategies to improve performance. With only ~230k parameters, our CRNN model yields acceptably low latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise ratio.

architecture, deep learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

1703.0539

Country: North America > United States > California (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Active Learning for Speech Recognition: the Power of Gradients

Huang, Jiaji, Child, Rewon, Rao, Vinay, Liu, Hairong, Satheesh, Sanjeev, Coates, Adam

arXiv.org Machine LearningDec-9-2016

In training speech recognition systems, labeling audio clips can be expensive, and not all data is equally valuable. Active learning aims to label only the most informative samples to reduce cost. For speech recognition, confidence scores and other likelihood-based active learning methods have been shown to be effective. Gradient-based active learning methods, however, are still not well-understood. This work investigates the Expected Gradient Length (EGL) approach in active learning for end-to-end speech recognition. We justify EGL from a variance reduction perspective, and observe that EGL's measure of informativeness picks novel samples uncorrelated with confidence scores. Experimentally, we show that EGL can reduce word errors by 11\%, or alternatively, reduce the number of samples to label by 50\%, when compared to random sampling.

artificial intelligence, informativeness, speech recognition, (15 more...)

arXiv.org Machine Learning

1612.03226

Country:

North America > United States > Wisconsin (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Emergence of Object-Selective Features in Unsupervised Feature Learning

Coates, Adam, Karpathy, Andrej, Ng, Andrew Y.

Neural Information Processing SystemsDec-31-2012

Recent work in unsupervised feature learning has focused on the goal of discovering high-level features from unlabeled images. Much progress has been made in this direction, but in most cases it is still standard to use a large amount of labeled data in order to construct detectors sensitive to object classes or other complex patterns in the data. In this paper, we aim to test the hypothesis that unsupervised feature learning methods, provided with only unlabeled data, can learn high-level, invariant features that are sensitive to commonly-occurring objects. Though a handful of prior results suggest that this is possible when each object class accounts for a large fraction of the data (as in many labeled datasets), it is unclear whether something similar can be accomplished when dealing with completely unlabeled data. A major obstacle to this test, however, is scale: we cannot expect to succeed with small datasets or with small numbers of learned features. Here, we propose a large-scale feature learning system that enables us to carry out this experiment, learning 150,000 features from tens of millions of unlabeled images. Based on two scalable clustering algorithms (K-means and agglomerative clustering), we find that our simple system can discover features sensitive to a commonly occurring object class (human faces) and can also combine these into detectors invariant to significant global distortions like large translations and scale.

deep learning, neural network, simple cell, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California > Santa Clara County (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)

Add feedback

Selecting Receptive Fields in Deep Networks

Coates, Adam, Ng, Andrew Y.

Neural Information Processing SystemsDec-31-2011

Recent deep learning and unsupervised feature learning systems that learn from unlabeled data have achieved high performance in benchmarks by using extremely large architectures with many features (hidden units) at each layer. Unfortunately, for such large architectures the number of parameters usually grows quadratically in the width of the network, thus necessitating hand-coded "local receptive fields" that limit the number of connections from lower level features to higher ones (e.g., based on spatial locality). In this paper we propose a fast method to choose these connections that may be incorporated into a wide variety of unsupervised training methods. Specifically, we choose local receptive fields that group together those low-level features that are most similar to each other according to a pairwise similarity metric. This approach allows us to harness the advantages of local receptive fields (such as improved scalability, and reduced data requirements) when we do not know how to specify such receptive fields by hand or where our unsupervised training algorithm has no obvious generalization to a topographic setting. We produce results showing how this method allows us to use even simple unsupervised training algorithms to train successful multi-layered etworks that achieve state-of-the-art results on CIFAR and STL datasets: 82.0% and 60.1% accuracy, respectively.

deep learning, neural network, receptive field, (20 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

An Application of Reinforcement Learning to Aerobatic Helicopter Flight

Abbeel, Pieter, Coates, Adam, Quigley, Morgan, Ng, Andrew Y.

Neural Information Processing SystemsDec-31-2007

Autonomous helicopter flight is widely regarded to be a highly challenging control problem. This paper presents the first successful autonomous completion on a real RC helicopter of the following four aerobatic maneuvers: forward flip and sideways roll at low speed, tail-in funnel, and nose-in funnel. Our experimental results significantly extend the state of the art in autonomous helicopter flight. We used the following approach: First we had a pilot fly the helicopter to help us find a helicopter dynamics model and a reward (cost) function. Then we used a reinforcement learning (optimal control) algorithm to find a controller that is optimized for the resulting model and reward function. More specifically, we used differential dynamic programming (DDP), an extension of the linear quadratic regulator (LQR).

air transportation, artificial intelligence, helicopter, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Clara County (0.14)

Industry:

Transportation > Air (1.00)
Aerospace & Defense > Aircraft (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

An Application of Reinforcement Learning to Aerobatic Helicopter Flight

Abbeel, Pieter, Coates, Adam, Quigley, Morgan, Ng, Andrew Y.

Neural Information Processing SystemsDec-31-2007

The control of autonomous helicopters thus provides a Challenging and important testbed for learning and control algorithms. In the "upIight flight regime" there has recently been considerable progress in autonomous helicopter

air transportation, artificial intelligence, helicopter, (20 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Clara County (0.14)

Industry:

Transportation > Air (1.00)
Aerospace & Defense > Aircraft (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)

Add feedback