AITopics | Deep Learning

Collaborating Authors

Deep Learning

New computational algorithms make it possible to build neural networks with many input nodes and many layers, and distinguish "deep learning" of these networks from previous work on artificial neural nets.

News Overviews Instructional Materials AI-Alerts Classics

Deep Deconvolutional Networks for Scene Parsing

Mohan, Rahul

arXiv.org Machine LearningNov-14-2014

Scene parsing is an important and challenging prob- lem in computer vision. It requires labeling each pixel in an image with the category it belongs to. Tradition- ally, it has been approached with hand-engineered features from color information in images. Recently convolutional neural networks (CNNs), which automatically learn hierar- chies of features, have achieved record performance on the task. These approaches typically include a post-processing technique, such as superpixels, to produce the final label- ing. In this paper, we propose a novel network architecture that combines deep deconvolutional neural networks with CNNs. Our experiments show that deconvolutional neu- ral networks are capable of learning higher order image structure beyond edge primitives in comparison to CNNs. The new network architecture is employed for multi-patch training, introduced as part of this work. Multi-patch train- ing makes it possible to effectively learn spatial priors from scenes. The proposed approach yields state-of-the-art per- formance on four scene parsing datasets, namely Stanford Background, SIFT Flow, CamVid, and KITTI. In addition, our system has the added advantage of having a training system that can be completely automated end-to-end with- out requiring any post-processing.

deconvolutional layer, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

1411.4101

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Convolutional Kernel Networks

Mairal, Julien, Koniusz, Piotr, Harchaoui, Zaid, Schmid, Cordelia

arXiv.org Machine LearningNov-14-2014

An important goal in visual recognition is to devise image representations that are invariant to particular transformations. In this paper, we address this goal with a new type of convolutional neural network (CNN) whose invariance is encoded by a reproducing kernel. Unlike traditional approaches where neural networks are learned either to represent data or for solving a classification task, our network learns to approximate the kernel feature map on training data. Such an approach enjoys several benefits over classical ones. First, by teaching CNNs to be invariant, we obtain simple network architectures that achieve a similar accuracy to more complex ones, while being easy to train and robust to overfitting. Second, we bridge a gap between the neural network literature and kernels, which are natural tools to model invariance. We evaluate our methodology on visual recognition tasks where CNNs have proven to perform well, e.g., digit recognition with the MNIST dataset, and the more challenging CIFAR-10 and STL-10 datasets, where our accuracy is competitive with the state of the art.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Machine Learning

1406.3332

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Deep Exponential Families

Ranganath, Rajesh, Tang, Linpeng, Charlin, Laurent, Blei, David M.

arXiv.org Machine LearningNov-10-2014

We describe \textit{deep exponential families} (DEFs), a class of latent variable models that are inspired by the hidden structures used in deep neural networks. DEFs capture a hierarchy of dependencies between latent variables, and are easily generalized to many settings through exponential families. We perform inference using recent "black box" variational inference techniques. We then evaluate various DEFs on text and combine multiple DEFs into a model for pairwise recommendation data. In an extensive study, we show that going beyond one layer improves predictions for DEFs. We demonstrate that DEFs find interesting exploratory structure in large data sets, and give better predictive performance than state-of-the-art models.

artificial intelligence, exponential family, machine learning, (19 more...)

arXiv.org Machine Learning

1411.2581

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Leisure & Entertainment (0.94)
Media > Television (0.46)
Education > Educational Setting (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Semi-Supervised Learning with Deep Generative Models

Kingma, Diederik P., Rezende, Danilo J., Mohamed, Shakir, Welling, Max

arXiv.org Machine LearningOct-31-2014

The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unlabelled ones. Generative approaches have thus far been either inflexible, inefficient or non-scalable. We show that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.

artificial intelligence, inductive learning, machine learning, (14 more...)

arXiv.org Machine Learning

1406.5298

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.61)

Add feedback

A Comparison of learning algorithms on the Arcade Learning Environment

Defazio, Aaron, Graepel, Thore

arXiv.org Artificial IntelligenceOct-30-2014

Reinforcement learning agents have traditionally been evaluated on small toy problems. With advances in computing power and the advent of the Arcade Learning Environment, it is now possible to evaluate algorithms on diverse and difficult problems within a consistent framework. We discuss some challenges posed by the arcade learning environment which do not manifest in simpler environments. We then provide a comparison of model-free, linear learning algorithms on this challenging problem set.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

1410.862

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Alberta (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.34)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education (1.00)
Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Learning deep dynamical models from image pixels

Wahlström, Niklas, Schön, Thomas B., Deisenroth, Marc Peter

arXiv.org Machine LearningOct-28-2014

Modeling dynamical systems is important in many disciplines, e.g., control, robotics, or neurotechnology. Commonly the state of these systems is not directly observed, but only available through noisy and potentially high-dimensional observations. In these cases, system identification, i.e., finding the measurement mapping and the transition mapping (system dynamics) in latent space can be challenging. For linear system dynamics and measurement mappings efficient solutions for system identification are available. However, in practical applications, the linearity assumptions does not hold, requiring non-linear system identification techniques. If additionally the observations are high-dimensional (e.g., images), non-linear system identification is inherently hard. To address the problem of non-linear system identification from high-dimensional observations, we combine recent advances in deep learning and system identification. In particular, we jointly learn a low-dimensional embedding of the observation by means of deep auto-encoders and a predictive transition model in this low-dimensional space. We demonstrate that our model enables learning good predictive models of dynamical systems from pixel information only.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Machine Learning

1410.755

Country: Europe > Sweden (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Predicting Parameters in Deep Learning

Denil, Misha, Shakibi, Babak, Dinh, Laurent, Ranzato, Marc'Aurelio, de Freitas, Nando

arXiv.org Machine LearningOct-27-2014

We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small number of weights and predicting the rest. In the best case we are able to predict more than 95% of the weights of a network without any drop in accuracy.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Machine Learning

1306.0543

Country:

North America > Canada (0.28)
North America > United States (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

On the Challenges of Physical Implementations of RBMs

Dumoulin, Vincent, Goodfellow, Ian J., Courville, Aaron, Bengio, Yoshua

arXiv.org Machine LearningOct-24-2014

Restricted Boltzmann machines (RBMs) are powerful machine learning models, but learning and some kinds of inference in the model require sampling-based approximations, which, in classical digital computers, are implemented using expensive MCMC. Physical computation offers the opportunity to reduce the cost of sampling by building physical systems whose natural dynamics correspond to drawing samples from the desired RBM distribution. Such a system avoids the burn-in and mixing cost of a Markov chain. However, hardware implementations of this variety usually entail limitations such as low-precision and limited range of the parameters and restrictions on the size and topology of the RBM. We conduct software simulations to determine how harmful each of these restrictions is. Our simulations are based on the D-Wave Two computer, but the issues we investigate arise in most forms of physical computation. Our findings suggest that designers of new physical computing hardware and algorithms for physical computers should focus their efforts on overcoming the limitations imposed by the topology restrictions of currently existing physical computers.

artificial intelligence, machine learning, noise, (18 more...)

arXiv.org Machine Learning

1312.5258

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Regularizing Recurrent Networks - On Injected Noise and Norm-based Methods

Ognawala, Saahil, Bayer, Justin

arXiv.org Machine LearningOct-21-2014

Advancements in parallel processing have lead to a surge in multilayer perceptrons' (MLP) applications and deep learning in the past decades. Recurrent Neural Networks (RNNs) give additional representational power to feedforward MLPs by providing a way to treat sequential data. However, RNNs are hard to train using conventional error backpropagation methods because of the difficulty in relating inputs over many time-steps. Regularization approaches from MLP sphere, like dropout and noisy weight training, have been insufficiently applied and tested on simple RNNs. Moreover, solutions have been proposed to improve convergence in RNNs but not enough to improve the long term dependency remembering capabilities thereof. In this study, we aim to empirically evaluate the remembering and generalization ability of RNNs on polyphonic musical datasets. The models are trained with injected noise, random dropout, norm-based regularizers and their respective performances compared to well-initialized plain RNNs and advanced regularization methods like fast-dropout. We conclude with evidence that training with noise does not improve performance as conjectured by a few works in RNN optimization before ours.

artificial intelligence, machine learning, noise, (17 more...)

arXiv.org Machine Learning

1410.5684

Genre: Research Report > New Finding (0.48)

Industry:

Media > Music (0.93)
Leisure & Entertainment (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An exact mapping between the Variational Renormalization Group and Deep Learning

Mehta, Pankaj, Schwab, David J.

arXiv.org Machine LearningOct-14-2014

Deep learning is a broad set of techniques that uses multiple layers of representation to automatically learn relevant features directly from structured data. Recently, such techniques have yielded record-breaking results on a diverse set of difficult machine learning tasks in computer vision, speech recognition, and natural language processing. Despite the enormous success of deep learning, relatively little is understood theoretically about why these techniques are so successful at feature learning and compression. Here, we show that deep learning is intimately related to one of the most important and successful techniques in theoretical physics, the renormalization group (RG). RG is an iterative coarse-graining scheme that allows for the extraction of relevant features (i.e. operators) as a physical system is examined at different length scales. We construct an exact mapping from the variational renormalization group, first introduced by Kadanoff, and deep learning architectures based on Restricted Boltzmann Machines (RBMs). We illustrate these ideas using the nearest-neighbor Ising Model in one and two-dimensions. Our results suggests that deep learning algorithms may be employing a generalized RG-like scheme to learn relevant features from data.

artificial intelligence, ising model, machine learning, (17 more...)

arXiv.org Machine Learning

1410.3831

Genre: Research Report > New Finding (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback