AITopics

The contractive auto-encoder learns a representation of the input data that captures the local manifold structure around each data point, through the leading singular vectors of the Jacobian of the transformation from input to representation. The corresponding singular values specify how much local variation is plausible in directions associated with the corresponding singular vectors, while remaining in a high-density region of the input space. This paper proposes a procedure for generating samples that are consistent with the local structure captured by a contractive auto-encoder. The associated stochastic process defines a distribution from which one can sample, and which experimentally appears to converge quickly and mix well between modes, compared to Restricted Boltzmann Machines and Deep Belief Networks. The intuitions behind this procedure can also be used to train the second layer of contraction that pools lower-level features and learns to be invariant to the local directions of variation discovered in the first layer. We show that this can help learn and represent invariances present in the data and improve classification error.

artificial intelligence, machine learning, manifold, (18 more...)

1206.6434

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Sohn, Kihyuk, Lee, Honglak

Learning Invariant Representations with Local Transformations

Learning invariant representations is an important problem in machine learning and pattern recognition. In this paper, we present a novel framework of transformation-invariant feature learning by incorporating linear transformations into the feature learning algorithms. For example, we present the transformation-invariant restricted Boltzmann machine that compactly represents data by its weights and their transformations, which achieves invariance of the feature representation via probabilistic max pooling. In addition, we show that our transformation-invariant feature learning framework can also be extended to other unsupervised learning methods, such as autoencoders or sparse coding. We evaluate our method on several image classification benchmark datasets, such as MNIST variations, CIFAR-10, and STL-10, and show competitive or superior classification performance when compared to the state-of-the-art. Furthermore, our method achieves state-of-the-art performance on phone classification tasks with the TIMIT dataset, which demonstrates wide applicability of our proposed algorithms to other domains.

artificial intelligence, deep learning, machine learning, (17 more...)

1206.6418

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Goodfellow, Ian, Courville, Aaron, Bengio, Yoshua

Large-Scale Feature Learning With Spike-and-Slab Sparse Coding

We consider the problem of object recognition with a large number of classes. In order to overcome the low amount of labeled examples available in this setting, we introduce a new feature learning and extraction procedure based on a factor model we call spike-and-slab sparse coding (S3C). Prior work on S3C has not prioritized the ability to exploit parallel architectures and scale S3C to the enormous problem sizes needed for object recognition. We present a novel inference procedure for appropriate for use with GPUs which allows us to dramatically increase both the training set size and the amount of latent factors that S3C may be trained with. We demonstrate that this approach improves upon the supervised learning capabilities of both sparse coding and the spike-and-slab Restricted Boltzmann Machine (ssRBM) on the CIFAR-10 dataset. We use the CIFAR-100 dataset to demonstrate that our method scales to large numbers of classes better than previous methods. Finally, we use our method to win the NIPS 2011 Workshop on Challenges In Learning Hierarchical Models? Transfer Learning Challenge.

artificial intelligence, machine learning, sparse, (16 more...)

1206.6407

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > Scotland (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Tang, Yichuan, Salakhutdinov, Ruslan, Hinton, Geoffrey

Deep Lambertian Networks

Visual perception is a challenging problem in part due to illumination variations. A possible solution is to first estimate an illumination invariant representation before using it for recognition. The object albedo and surface normals are examples of such representations. In this paper, we introduce a multilayer generative model where the latent variables include the albedo, surface normals, and the light source. Combining Deep Belief Nets with the Lambertian reflectance assumption, our model can learn good priors over the albedo from 2D images. Illumination variations can be explained by changing only the lighting latent variable in our model. By transferring learned knowledge from similar objects, albedo and surface normals estimation from a single image is possible in our model. Experiments demonstrate that our model is able to generalize as well as improve over standard baselines in one-shot face recognition.

artificial intelligence, machine learning, surface normal, (17 more...)

1206.6445

Country:

North America > Canada > Ontario > Toronto (0.15)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Boulanger-Lewandowski, Nicolas, Bengio, Yoshua, Vincent, Pascal

Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription

We investigate the problem of modeling symbolic sequences of polyphonic music in a completely general piano-roll representation. We introduce a probabilistic model based on distribution estimators conditioned on a recurrent neural network that is able to discover temporal dependencies in high-dimensional sequences. Our approach outperforms many traditional models of polyphonic music on a variety of realistic datasets. We show how our musical language model can serve as a symbolic prior to improve the accuracy of polyphonic transcription.

artificial intelligence, machine learning, sequence, (17 more...)

1206.6392

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)

Genre: Research Report (0.50)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Tang, Yichuan, Salakhutdinov, Ruslan, Hinton, Geoffrey

Deep Mixtures of Factor Analysers

arXiv.org Machine LearningJun-18-2012

An efficient way to learn deep density models that have many layers of latent variables is to learn one layer at a time using a model that has only one layer of latent variables. After learning each layer, samples from the posterior distributions for that layer are used as training data for learning the next layer. This approach is commonly used with Restricted Boltzmann Machines, which are undirected graphical models with a single hidden layer, but it can also be used with Mixtures of Factor Analysers (MFAs) which are directed graphical models. In this paper, we present a greedy layer-wise learning algorithm for Deep Mixtures of Factor Analysers (DMFAs). Even though a DMFA can be converted to an equivalent shallow MFA by multiplying together the factor loading matrices at different levels, learning and inference are much more efficient in a DMFA and the sharing of each lower-level factor loading matrix by many different higher level MFAs prevents overfitting. We demonstrate empirically that DM-FAs learn better density models than both MFAs and two types of Restricted Boltzmann Machine on a wide variety of datasets.

artificial intelligence, deep learning, machine learning, (19 more...)

1206.4635

Country:

North America > Canada > Ontario > Toronto (0.16)
North America > United States > California > Orange County > Irvine (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

arXiv.org Artificial IntelligenceApr-5-2012

Learning to relate images: Mapping units, complex cells and simultaneous eigenspaces

Memisevic, Roland

A fundamental operation in many vision tasks, including motion understanding, stereopsis, visual odometry, or invariant recognition, is establishing correspondences between images or between images and data from other modalities. We present an analysis of the role that multiplicative interactions play in learning such correspondences, and we show how learning and inferring relationships between images can be viewed as detecting rotations in the eigenspaces shared among a set of orthogonal matrices. We review a variety of recent multiplicative sparse coding methods in light of this observation. We also review how the squaring operation performed by energy models and by models of complex cells can be thought of as a way to implement multiplicative interactions. This suggests that the main utility of including complex cells in computational models of vision may be that they can encode relations not invariances.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1110.0107

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
(2 more...)

Goodfellow, Ian J., Courville, Aaron, Bengio, Yoshua

Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery

arXiv.org Machine LearningApr-3-2012

We consider the problem of using a factor model we call {\em spike-and-slab sparse coding} (S3C) to learn features for a classification task. The S3C model resembles both the spike-and-slab RBM and sparse coding. Since exact inference in this model is intractable, we derive a structured variational inference procedure and employ a variational EM training algorithm. Prior work on approximate inference for this model has not prioritized the ability to exploit parallel architectures and scale to enormous problem sizes. We present an inference procedure appropriate for use with GPUs which allows us to dramatically increase both the training set size and the amount of latent factors. We demonstrate that this approach improves upon the supervised learning capabilities of both sparse coding and the ssRBM on the CIFAR-10 dataset. We evaluate our approach's potential for semi-supervised learning on subsets of CIFAR-10. We demonstrate state-of-the art self-taught learning performance on the STL-10 dataset and use our method to win the NIPS 2011 Workshop on Challenges In Learning Hierarchical Models' Transfer Learning Challenge.

artificial intelligence, machine learning, sparse, (16 more...)

1201.3382

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Asia > Middle East > Jordan (0.05)
North America > United States > Colorado > El Paso County > Colorado Springs (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Montavon, Grégoire, Müller, Klaus-Robert

Learning Feature Hierarchies with Centered Deep Boltzmann Machines

arXiv.org Artificial IntelligenceMar-16-2012

Deep Boltzmann machines are in principle powerful models for extracting the hierarchical structure of data. Unfortunately, attempts to train layers jointly (without greedy layer-wise pretraining) have been largely unsuccessful. We propose a modification of the learning algorithm that initially recenters the output of the activation functions to zero. This modification leads to a better conditioned Hessian and thus makes learning easier. We test the algorithm on real data and demonstrate that our suggestion, the centered deep Boltzmann machine, learns a hierarchy of increasingly abstract representations and a better generative model of data.

artificial intelligence, boltzmann machine, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-642-35289-8_33

1203.3783

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Berlin (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

arXiv.org Machine LearningMar-15-2012

Robust LogitBoost and Adaptive Base Class (ABC) LogitBoost

Li, Ping

Logitboost is an influential boosting algorithm for classification. In this paper, we develop robust logitboost to provide an explicit formulation of tree-split criterion for building weak learners (regression trees) for logitboost. This formulation leads to a numerically stable implementation of logitboost. We then propose abc-logitboost for multi-class classification, by combining robust logitboost with the prior work of abc-boost. Previously, abc-boost was implemented as abc-mart using the mart algorithm. Our extensive experiments on multi-class classification compare four algorithms: mart, abcmart, (robust) logitboost, and abc-logitboost, and demonstrate the superiority of abc-logitboost. Comparisons with other learning methods including SVM and deep learning are also available through prior publications.

algorithm, artificial intelligence, machine learning, (18 more...)

1203.3491

Country: North America > United States > New York > Tompkins County > Ithaca (0.04)

Genre:

Research Report > New Finding (0.69)
Research Report > Experimental Study (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)