AITopics | Deep Learning

Deep Learning

New computational algorithms make it possible to build neural networks with many input nodes and many layers, and distinguish "deep learning" of these networks from previous work on artificial neural nets.

Importance Sampling for Minibatches

Csiba, Dominik, Richtárik, Peter

arXiv.org Machine Learning, Feb-6-2016

Supervised learning is a widely adopted learning paradigm with important applications such as regression, classification and prediction. The most popular approach to training supervised learning models is via empirical risk minimization (ERM). In ERM, the practitioner collects data composed of example-label pairs, and seeks to identify the best predictor by minimizing the empirical risk, i.e., the average risk associated with the predictor over the training data. With ever increasing demand for accuracy of the predictors, largely due to successful industrial applications, and with ever more sophisticated models that need to be trained, such as deep neural networks [8, 14], or multiclass classification [9], increasing volumes of data are used in the training phase. This leads to huge and hence extremely computationally intensive ERM problems. Batch algorithms (methods that need to look at all the data before taking a single step to update the predictor) have long been known to be impractical for problems of this size. Typical examples of batch methods are gradient descent and classical quasi-Newton methods.
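
As a rough illustration of the terms in this abstract, the sketch below computes an empirical risk and takes batch gradient steps on it. It assumes a one-parameter linear predictor with squared loss and a three-point toy dataset; the function names and the data are hypothetical and are not taken from the paper. The point it shows is that every batch step sums over all example-label pairs, which is what becomes expensive as the data grows.

```python
def empirical_risk(w, data):
    """Average squared loss of the predictor x -> w * x over all pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def batch_gradient_step(w, data, step_size):
    """One batch step: the gradient is averaged over every training pair."""
    grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
    return w - step_size * grad


# Toy example-label pairs generated by y = 2x (hypothetical data).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0
for _ in range(50):
    w = batch_gradient_step(w, data, step_size=0.1)
```

On this toy problem the iterate `w` approaches 2, the slope that generated the data. A minibatch method would replace the full sums with sums over a small sampled subset of `data`.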

artificial intelligence, machine learning, minibatch, (19 more...)

arXiv.org Machine Learning

1602.02283

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

A Deep Learning Approach to Unsupervised Ensemble Learning

Shaham, Uri, Cheng, Xiuyuan, Dror, Omer, Jaffe, Ariel, Nadler, Boaz, Chang, Joseph, Kluger, Yuval

arXiv.org Machine Learning, Feb-6-2016

We show how deep learning methods can be applied in the context of crowdsourcing and unsupervised ensemble learning. First, we prove that the popular model of Dawid and Skene, which assumes that all classifiers are conditionally independent, is equivalent to a Restricted Boltzmann Machine (RBM) with a single hidden node. Hence, under this model, the posterior probabilities of the true labels can instead be estimated via a trained RBM. Next, to address the more general case, where classifiers may strongly violate the conditional independence assumption, we propose to apply an RBM-based Deep Neural Net (DNN). Experimental results on various simulated and real-world datasets demonstrate that our proposed DNN approach outperforms other state-of-the-art methods, in particular when the data violates the conditional independence assumption.
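
The connection between conditionally independent classifiers and a single hidden unit can be checked numerically on a small case. The sketch below assumes binary labels and binary votes, with each classifier described by a sensitivity and a specificity; the parameter names and values are hypothetical. It computes the posterior of the true label once by multiplying per-classifier likelihoods and once as a logistic function of a weighted sum of the votes, which is the form of a hidden-unit posterior in an RBM. It illustrates only that the two expressions agree, not the paper's proof.

```python
import math


def posterior_direct(votes, prior, sens, spec):
    """P(y = 1 | votes) by multiplying conditionally independent likelihoods."""
    p1 = prior
    p0 = 1.0 - prior
    for x, psi, eta in zip(votes, sens, spec):
        p1 *= psi if x == 1 else 1.0 - psi   # P(x | y = 1)
        p0 *= 1.0 - eta if x == 1 else eta   # P(x | y = 0)
    return p1 / (p1 + p0)


def posterior_sigmoid(votes, prior, sens, spec):
    """The same posterior written as sigmoid(bias + sum_i weight_i * x_i)."""
    total = math.log(prior / (1.0 - prior))
    for x, psi, eta in zip(votes, sens, spec):
        weight = math.log(psi / (1.0 - eta)) - math.log((1.0 - psi) / eta)
        total += math.log((1.0 - psi) / eta) + weight * x
    return 1.0 / (1.0 + math.exp(-total))


# Hypothetical ensemble of three classifiers.
sens = [0.9, 0.7, 0.6]   # P(vote = 1 | y = 1)
spec = [0.8, 0.6, 0.7]   # P(vote = 0 | y = 0)
votes = [1, 0, 1]

direct = posterior_direct(votes, 0.5, sens, spec)
logistic = posterior_sigmoid(votes, 0.5, sens, spec)
```

Here `direct` and `logistic` agree up to floating-point error for any vote pattern.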

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Machine Learning

1602.02285

Country: North America (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems

Lowe, Ryan, Pow, Nissan, Serban, Iulian, Pineau, Joelle

arXiv.org Artificial Intelligence, Feb-3-2016

This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from microblog services such as Twitter. We also describe two neural learning architectures suitable for analyzing this dataset, and provide benchmark performance on the task of selecting the best next response.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

1506.08909

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

Greedy Deep Dictionary Learning

Tariyal, Snigdha, Majumdar, Angshul, Singh, Richa, Vatsa, Mayank

arXiv.org Machine Learning, Jan-31-2016

In this work we propose a new deep learning tool called deep dictionary learning. Multi-level dictionaries are learnt in a greedy fashion, one layer at a time. This requires solving a simple (shallow) dictionary learning problem, the solution to which is well known. We apply the proposed technique on some benchmark deep learning datasets. We compare our results with other deep learning tools like stacked autoencoder and deep belief network, and with state-of-the-art supervised dictionary learning tools like discriminative KSVD and label consistent KSVD. Our method yields better results than all of them.

artificial intelligence, dictionary learning, machine learning, (10 more...)

arXiv.org Machine Learning

1602.00203

Country: Europe > Austria (0.28)

Genre: Research Report (0.70)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Bitwise Neural Networks

Kim, Minje, Smaragdis, Paris

arXiv.org Artificial Intelligence, Jan-22-2016

Based on the assumption that there exists a neural network that efficiently represents a set of Boolean functions between all binary inputs and outputs, we propose a process for developing and deploying neural networks whose weight parameters, bias terms, input, and intermediate hidden layer output signals are all binary-valued, and require only basic bit logic for the feedforward pass. The proposed Bitwise Neural Network (BNN) is especially suitable for resource-constrained environments, since it replaces either floating-point or fixed-point arithmetic with significantly more efficient bitwise operations. Hence, the BNN requires less spatial complexity, less memory bandwidth, and less power consumption in hardware. In order to design such networks, we propose to add a few training schemes, such as weight compression and noisy backpropagation, which result in a bitwise network that performs almost as well as its corresponding real-valued network. We test the proposed network on the MNIST dataset, represented using binary features, and show that BNNs result in competitive performance while offering dramatic computational savings.
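
To make "basic bit logic for the feedforward pass" concrete, the sketch below shows one common way to evaluate a binary neuron with XNOR and a bit count. It assumes that a set bit stands for +1 and a cleared bit for -1, and that the activation is a sign threshold; these encodings and all function names are assumptions for illustration and may differ from the scheme used in the paper.

```python
def bipolar_dot(a_bits, w_bits, n):
    """Dot product of two n-element +1/-1 vectors packed into integers."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask        # XNOR: 1 where the signs match
    matches = bin(agree).count("1")          # population count
    return 2 * matches - n                   # matches minus mismatches


def bitwise_neuron(a_bits, w_bits, n, bias=0):
    """Sign activation, returned as a single bit (1 for +1, 0 for -1)."""
    return 1 if bipolar_dot(a_bits, w_bits, n) + bias >= 0 else 0


def to_bipolar(bits, n):
    """Unpack an integer into a list of +1/-1 values, used only as a check."""
    return [1 if (bits >> i) & 1 else -1 for i in range(n)]


a, w, n = 0b1011, 0b1001, 4
fast = bipolar_dot(a, w, n)
slow = sum(x * y for x, y in zip(to_bipolar(a, n), to_bipolar(w, n)))
```

Both `fast` and `slow` evaluate to 2 for these inputs: three of the four positions agree and one disagrees.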

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1601.06071

Country: North America > United States > Illinois (0.15)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition

Dighe, Pranay, Luyet, Gil, Asaei, Afsaneh, Bourlard, Herve

arXiv.org Machine Learning, Jan-22-2016

Two major emerging trends, namely deep neural networks (DNN) and exemplar-based sparse modeling, are different approaches to exploiting sparsity in speech representations to achieve invariance, discrimination and noise separation [5, 4, 6]. On the other hand, speech utterances are formed as a union of words, which in turn consist of phonetic components and subphonetic attributes. Each linguistic component is produced through activation of a few highly constrained articulatory mechanisms, leading to generation of speech data in a union of low-dimensional subspaces [7, 8, 9]. However, most existing speech classification and acoustic modeling methods do not explicitly take into account the multi-subspace structure of the data. The present study focuses on exploiting the multi-subspace low-dimensional structure of speech learned from the training data to enhance DNN based acoustic modeling of unseen test data. Hence, this also has the potential to enable domain adaptation and handling mismatch in the framework of DNN based acoustic modeling.

artificial intelligence, machine learning, posterior, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/ICASSP.2016.7472767

1601.05936

Country: Europe > Switzerland (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Speech > Acoustic Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Using Deep Learning for Detecting Spoofing Attacks on Speech Signals

Godoy, Alan, Simões, Flávio, Stuchi, José Augusto, Angeloni, Marcus de Assis, Uliani, Mário, Violato, Ricardo

arXiv.org Machine Learning, Jan-19-2016

It is well known that speaker verification systems are subject to spoofing attacks. The Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVSpoof2015) provides a standard spoofing database, containing attacks based on synthetic speech, along with a protocol for experiments. This paper describes CPqD's systems submitted to the ASVSpoof2015 Challenge, based on deep neural networks, working both as a classifier and as a feature extraction module for a GMM and an SVM classifier. Results show the validity of this approach, achieving less than 0.5% EER for known attacks.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Machine Learning

1508.01746

Country:

Europe (0.28)
South America > Brazil (0.14)

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Understanding Deep Convolutional Networks

Mallat, Stéphane

arXiv.org Machine Learning, Jan-19-2016

Deep convolutional networks provide state-of-the-art classification and regression results over many high-dimensional problems. We review their architecture, which scatters data with a cascade of linear filter weights and non-linearities. A mathematical framework is introduced to analyze their properties. Computations of invariants involve multiscale contractions, the linearization of hierarchical symmetries, and sparse separations. Applications are discussed.

artificial intelligence, coefficient, machine learning, (20 more...)

arXiv.org Machine Learning

doi: 10.1098/rsta.2015.0203

1601.0492

Genre: Research Report (0.40)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.69)
Information Technology > Artificial Intelligence > Vision (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

SimpleDS: A Simple Deep Reinforcement Learning Dialogue System

Cuayáhuitl, Heriberto

arXiv.org Artificial Intelligence, Jan-18-2016

Almost two decades ago, the (spoken) dialogue systems community adopted the Reinforcement Learning (RL) paradigm since it offered the possibility to treat dialogue design as an optimisation problem, and because RL-based systems can improve their performance over time with experience. Although a large number of methods have been proposed for training (spoken) dialogue systems using RL, the question of "How to train dialogue policies in an efficient, scalable and effective way across domains?" remains an open problem. One limitation of current approaches is that RL-based dialogue systems still require high levels of human intervention (from system developers), as opposed to automating the dialogue design. Training a system of this kind requires a system developer to provide a set of features to describe the dialogue state, a set of actions to control the interaction, and a performance function to reward or penalise the action-selection process. All of these elements have to be carefully engineered in order to learn a good dialogue policy (or policies). This suggests that one way of advancing the state of the art in this field is by reducing the amount of human intervention in the dialogue design process through higher degrees of automation, i.e. by moving towards truly autonomous learning.

dialogue system, reinforcement learning, simpleds, (11 more...)

arXiv.org Artificial Intelligence

1601.04574

Country: Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference

Gal, Yarin, Ghahramani, Zoubin

arXiv.org Machine Learning, Jan-18-2016

Convolutional neural networks (CNNs) work well on large datasets. But labelled data is hard to collect, and in some applications larger amounts of data are not available. The problem then is how to use CNNs with small data, as CNNs overfit quickly. We present an efficient Bayesian CNN, offering better robustness to over-fitting on small data than traditional approaches. This is achieved by placing a probability distribution over the CNN's kernels. We approximate our model's intractable posterior with Bernoulli variational distributions, requiring no additional model parameters. On the theoretical side, we cast dropout network training as approximate inference in Bayesian neural networks. This allows us to implement our model using existing tools in deep learning with no increase in time complexity, while highlighting a negative result in the field. We show a considerable improvement in classification accuracy compared to standard techniques and improve on published state-of-the-art results for CIFAR-10.
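
One way to picture a Bernoulli distribution over weights is to keep the random masks switched on at prediction time and average several stochastic forward passes. The sketch below does this for a single linear unit rather than a CNN, which is a deliberate simplification; the function names, the keep probability and the toy weights are all hypothetical, and the averaging shown is the generic Monte Carlo estimate rather than the paper's implementation.

```python
import random


def stochastic_forward(weights, inputs, keep_prob, rng):
    """One forward pass with an independent Bernoulli mask on each weight."""
    return sum(
        w * x
        for w, x in zip(weights, inputs)
        if rng.random() < keep_prob
    )


def mc_predict(weights, inputs, keep_prob, num_samples, seed=0):
    """Mean and variance of the output over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [
        stochastic_forward(weights, inputs, keep_prob, rng)
        for _ in range(num_samples)
    ]
    mean = sum(samples) / num_samples
    var = sum((s - mean) ** 2 for s in samples) / num_samples
    return mean, var


# Toy unit: the unmasked output would be 1*1 + 2*1 + 0.5*2 = 4.
mean, var = mc_predict(
    [1.0, 2.0, 0.5], [1.0, 1.0, 2.0], keep_prob=0.5, num_samples=20000
)
```

With a keep probability of 0.5 the sample mean settles near half of the unmasked output, and the sample variance gives a rough measure of how much the prediction depends on which weights were kept.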

artificial intelligence, dropout, machine learning, (19 more...)

arXiv.org Machine Learning

1506.02158

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
