AITopics | Deep Learning

Deep Learning

New computational algorithms make it possible to build neural networks with many input nodes and many layers. The "deep learning" of these networks is distinguished from earlier work on artificial neural networks.

Deep Convolutional Neural Networks Based on Semi-Discrete Frames

Wiatowski, Thomas, Bölcskei, Helmut

arXiv.org Machine Learning, Apr-21-2015

Deep convolutional neural networks have led to breakthrough results in practical feature extraction applications. The mathematical analysis of these networks was pioneered by Mallat (2012). Specifically, Mallat considered so-called scattering networks based on identical semi-discrete wavelet frames in each network layer, and proved translation-invariance as well as deformation stability of the resulting feature extractor. The purpose of this paper is to develop Mallat's theory further by allowing for different and, most importantly, general semi-discrete frames (e.g., Gabor frames, wavelets, curvelets, shearlets, ridgelets) in distinct network layers. This makes it possible to extract wider classes of features than the point singularities resolved by the wavelet transform. Our generalized feature extractor is proven to be translation-invariant, and we develop deformation stability results for a larger class of deformations than those considered by Mallat. For Mallat's wavelet-based feature extractor, we remove a number of technical conditions. The mathematical engine behind our results is continuous frame theory, which allows us to completely detach the invariance and deformation stability proofs from the particular algebraic structure of the underlying frames.

artificial intelligence, feature extractor, machine learning, (17 more...)

arXiv.org Machine Learning

1504.05487

Genre: Research Report > New Finding (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Gap Analysis of Natural Language Processing Systems with respect to Linguistic Modality

Shukla, Vishal

arXiv.org Artificial Intelligence, Apr-18-2015

Modality is one of the important components of grammar in linguistics. It lets a speaker express an attitude towards, or give an assessment of the potentiality of, a state of affairs. It carries different senses and is therefore perceived differently depending on the context. This paper presents an account of the gap in the functionality of current state-of-the-art Natural Language Processing (NLP) systems. The contextual nature of linguistic modality is studied. The works and logical approaches employed by NLP systems dealing with modality are reviewed. The paper views human cognition and intelligence as a multi-layered approach that can be implemented by intelligent systems for learning. Lastly, the current direction of research in this field is discussed, together with an outlook on future work.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

1504.04716

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Compositional Distributional Semantics with Long Short Term Memory

Le, Phong, Zuidema, Willem

arXiv.org Artificial Intelligence, Apr-17-2015

We propose an extension of the recursive neural network that makes use of a variant of the long short-term memory architecture. The extension allows information low in parse trees to be stored in a memory register (the 'memory cell') and used much later higher up in the parse tree. This provides a solution to the vanishing gradient problem and allows the network to capture long-range dependencies. Experimental results show that our composition outperformed the traditional neural-network composition on the Stanford Sentiment Treebank.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

1503.0251

Country:

North America > United States (0.46)
Europe (0.46)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Random Forests Can Hash

Qiu, Qiang, Sapiro, Guillermo, Bronstein, Alex

arXiv.org Machine Learning, Apr-16-2015

Hash codes are a very efficient data representation, needed to cope with the ever-growing amounts of data. We introduce a random forest semantic hashing scheme with information-theoretic code aggregation, showing for the first time how random forest, a technique that together with deep learning has shown spectacular results in classification, can also be extended to large-scale retrieval. Traditional random forest fails to enforce the consistency of hashes generated from each tree for the same class data, i.e., to preserve the underlying similarity, and it also lacks a principled way for code aggregation across trees. We start with a simple hashing scheme, where independently trained random trees in a forest act as hashing functions. We then propose a subspace model as the splitting function, and show that it enforces the hash consistency in a tree for data from the same class. We also introduce an information-theoretic approach for aggregating codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. Experiments on large-scale public datasets are presented, showing that the proposed approach significantly outperforms state-of-the-art hashing methods for retrieval tasks.
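The "simple hashing scheme" in the abstract, where each tree in a forest acts as a hash function, can be illustrated with a toy sketch. This is not the authors' method: the trees below are untrained and use random axis-aligned splits instead of the proposed subspace splitting function, the per-tree codes are simply concatenated rather than aggregated information-theoretically, and the names `RandomTree` and `forest_hash` are hypothetical.

```python
import random


class RandomTree:
    """A complete binary tree of fixed depth with random axis-aligned splits.

    Hypothetical stand-in for a trained tree: every internal node compares
    one randomly chosen feature against a randomly chosen threshold.
    """

    def __init__(self, n_features, depth, rng):
        self.depth = depth
        # One (feature, threshold) pair per internal node, stored heap-style:
        # the children of node i are 2*i + 1 and 2*i + 2.
        self.nodes = [
            (rng.randrange(n_features), rng.uniform(0.0, 1.0))
            for _ in range(2 ** depth - 1)
        ]

    def hash_bits(self, x):
        """Return the left/right decisions along the root-to-leaf path."""
        bits = []
        node = 0
        for _ in range(self.depth):
            feature, threshold = self.nodes[node]
            bit = 1 if x[feature] > threshold else 0
            bits.append(bit)
            node = 2 * node + 1 + bit
        return bits


def forest_hash(trees, x):
    """Concatenate the per-tree bit codes into one binary hash code."""
    code = []
    for tree in trees:
        code.extend(tree.hash_bits(x))
    return code


rng = random.Random(0)
trees = [RandomTree(n_features=4, depth=3, rng=rng) for _ in range(5)]
code = forest_hash(trees, [0.1, 0.9, 0.4, 0.6])
```

With 5 trees of depth 3, every input maps to a 15-bit code. Nothing in this sketch encourages samples of the same class to share a code, which is the consistency problem the abstract says the subspace splitting function is meant to address.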

artificial intelligence, machine learning, random forest, (16 more...)

arXiv.org Machine Learning

1412.5083

Country: North America > Canada (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

High-performance Kernel Machines with Implicit Distributed Optimization and Randomization

Sindhwani, Vikas, Avron, Haim

arXiv.org Machine Learning, Apr-16-2015

In order to fully utilize "big data", it is often required to use "big models". Such models tend to grow with the complexity and size of the training data, and do not make strong parametric assumptions upfront on the nature of the underlying statistical dependencies. Kernel methods fit this need well, as they constitute a versatile and principled statistical methodology for solving a wide range of non-parametric modelling problems. However, their high computational costs (in storage and time) pose a significant barrier to their widespread adoption in big data applications. We propose an algorithmic framework and high-performance implementation for massive-scale training of kernel-based statistical models, based on combining two key technical ingredients: (i) distributed general purpose convex optimization, and (ii) the use of randomization to improve the scalability of kernel methods. Our approach is based on a block-splitting variant of the Alternating Directions Method of Multipliers, carefully reconfigured to handle very large random feature matrices, while exploiting hybrid parallelism typically found in modern clusters of multicore machines. Our implementation supports a variety of statistical learning tasks by enabling several loss functions, regularization schemes, kernels, and layers of randomized approximations for both dense and sparse datasets, in a highly extensible framework. We evaluate the ability of our framework to learn models on data from applications, and provide a comparison against existing sequential and parallel libraries.
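To make the idea of using randomization to scale kernel methods concrete, here is a minimal sketch of random Fourier features for the Gaussian kernel, one widely used randomized approximation. That the paper's "random feature matrices" take exactly this form is an assumption, and the function names, the bandwidth `sigma`, and the feature count are illustrative choices. The distributed ADMM solver described in the abstract is not shown.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian (RBF) kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sample_frequencies(n_freqs, dim, sigma=1.0, seed=0):
    """Draw n_freqs frequency vectors with i.i.d. N(0, 1/sigma^2) entries."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
        for _ in range(n_freqs)
    ]


def random_fourier_features(x, freqs):
    """Map x to 2 * len(freqs) features (a cos and a sin per frequency)."""
    scale = 1.0 / math.sqrt(len(freqs))
    feats = []
    for w in freqs:
        proj = sum(wi * xi for wi, xi in zip(w, x))
        feats.append(scale * math.cos(proj))
        feats.append(scale * math.sin(proj))
    return feats


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


x = [0.2, -0.5, 1.0]
y = [0.0, 0.3, 0.7]
freqs = sample_frequencies(n_freqs=2000, dim=3, sigma=1.0, seed=0)

exact = gaussian_kernel(x, y)
approx = dot(random_fourier_features(x, freqs),
             random_fourier_features(y, freqs))
```

The inner product of the feature vectors is an unbiased estimate of the kernel value, so a linear model trained on these features approximates the kernel machine without forming the full kernel matrix. The approximation error shrinks roughly as one over the square root of the number of sampled frequencies.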

algorithm, artificial intelligence, machine learning, (20 more...)

arXiv.org Machine Learning

1409.094

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning

Neyshabur, Behnam, Tomioka, Ryota, Srebro, Nathan

arXiv.org Artificial Intelligence, Apr-16-2015

We present experiments demonstrating that some other form of capacity control, different from network size, plays a central role in learning multi-layer feedforward networks. We argue, partially through analogy to matrix factorization, that this is an inductive bias that can help shed light on deep learning.

artificial intelligence, machine learning, regularization, (18 more...)

arXiv.org Artificial Intelligence

1412.6614

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

A Generative Model for Deep Convolutional Learning

Pu, Yunchen, Yuan, Xin, Carin, Lawrence

arXiv.org Machine Learning, Apr-15-2015

A generative model is developed for deep (multi-layered) convolutional dictionary learning. A novel probabilistic pooling operation is integrated into the deep model, yielding efficient bottom-up (pretraining) and top-down (refinement) probabilistic learning. Experimental results demonstrate powerful capabilities of the model to learn multi-layer features from images, and excellent classification results are obtained on the MNIST and Caltech 101 datasets.

machine learning, natural language, pixel, (15 more...)

arXiv.org Machine Learning

1504.04054

Country: North America > United States (0.15)

Genre: Research Report > New Finding (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Deep Narrow Boltzmann Machines are Universal Approximators

Montufar, Guido

arXiv.org Machine Learning, Apr-10-2015

We show that deep narrow Boltzmann machines are universal approximators of probability distributions on the activities of their visible units, provided they have sufficiently many hidden layers, each containing the same number of units as the visible layer. We show that, within certain parameter domains, deep Boltzmann machines can be studied as feedforward networks. We provide upper and lower bounds on the sufficient depth and width of universal approximators. These results settle various intuitions regarding undirected networks and, in particular, they show that deep narrow Boltzmann machines are at least as compact universal approximators as narrow sigmoid belief networks and restricted Boltzmann machines, with respect to the currently available bounds for those models.

artificial intelligence, machine learning, probability distribution, (17 more...)

arXiv.org Machine Learning

1411.3784

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Scheduled denoising autoencoders

Geras, Krzysztof J., Sutton, Charles

arXiv.org Machine Learning, Apr-10-2015

We present a representation learning method that learns features at multiple different levels of scale. Working within the unsupervised framework of denoising autoencoders, we observe that when the input is heavily corrupted during training, the network tends to learn coarse-grained features, whereas when the input is only slightly corrupted, the network tends to learn fine-grained features. This motivates the scheduled denoising autoencoder, which starts with a high level of noise that lowers as training progresses. We find that the resulting representation yields a significant boost on a later supervised task compared to the original input, or to a standard denoising autoencoder trained at a single noise level. After supervised fine-tuning our best model achieves the lowest ever reported error on the CIFAR-10 data set among permutation-invariant methods.
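The core idea of a noise level that starts high and lowers as training progresses can be sketched as follows. The abstract does not specify the shape of the schedule or the type of corruption, so the linear annealing, the masking noise, the default levels of 0.7 and 0.1, and the function names are all assumptions made for illustration. The autoencoder and its training loop are omitted.

```python
import random


def noise_schedule(epoch, n_epochs, start=0.7, end=0.1):
    """Linearly anneal the corruption level from `start` to `end`.

    Assumed schedule shape; `epoch` counts from 0 to n_epochs - 1.
    """
    if n_epochs <= 1:
        return end
    frac = min(max(epoch / (n_epochs - 1), 0.0), 1.0)
    return start + frac * (end - start)


def corrupt(x, noise_level, rng):
    """Masking noise: zero each entry independently with prob. noise_level."""
    return [0.0 if rng.random() < noise_level else v for v in x]


rng = random.Random(0)
n_epochs = 5
levels = [noise_schedule(e, n_epochs) for e in range(n_epochs)]

# In a real training loop, each epoch would corrupt the inputs at the
# current level and train the autoencoder to reconstruct the clean input.
noisy = corrupt([1.0] * 8, levels[0], rng)
```

Early epochs see heavily corrupted inputs, which according to the abstract pushes the network towards coarse-grained features, while later epochs with light corruption favour fine-grained ones.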

artificial intelligence, machine learning, noise level, (18 more...)

arXiv.org Machine Learning

1406.3269

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zero-bias autoencoders and the benefits of co-adapting features

Konda, Kishore, Memisevic, Roland, Krueger, David

arXiv.org Machine Learning, Apr-8-2015

Regularized training of an autoencoder typically results in hidden unit biases that take on large negative values. We show that negative biases are a natural result of using a hidden layer whose responsibility is to both represent the input data and act as a selection mechanism that ensures sparsity of the representation. We then show that negative biases impede the learning of data distributions whose intrinsic dimensionality is high. We also propose a new activation function that decouples the two roles of the hidden layer and that allows us to learn representations on data with very high intrinsic dimensionality, where standard autoencoders typically fail. Since the decoupled activation function acts like an implicit regularizer, the model can be trained by minimizing the reconstruction error of training data, without requiring any additional regularization.

artificial intelligence, autoencoder, machine learning, (16 more...)

arXiv.org Machine Learning

1402.3337

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
