 Deep Learning


Joint Stochastic Approximation learning of Helmholtz Machines

arXiv.org Machine Learning

Despite recent progress, model learning and posterior inference remain a common challenge in using deep generative models, especially those with discrete hidden variables. This paper is mainly concerned with algorithms for learning Helmholtz machines, which are characterized by pairing the generative model with an auxiliary inference model. A common drawback of previous learning algorithms is that they only indirectly optimize the targeted marginal log-likelihood, through bounds on it. In contrast, we develop a new class of algorithms, based on stochastic approximation (SA) theory of the Robbins-Monro type, that directly optimize the marginal log-likelihood and simultaneously minimize the inclusive KL-divergence. The resulting learning algorithm is thus called joint SA (JSA). Moreover, we construct an effective MCMC operator for JSA. Our results on the MNIST datasets demonstrate that JSA's performance is consistently superior to that of competing algorithms such as RWS for learning a range of difficult models.
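As background for the Robbins-Monro reference above, here is a toy sketch of a stochastic approximation iteration. It is not the JSA algorithm: it only solves the scalar root-finding problem E[X - theta] = 0 (that is, it estimates a mean) from noisy samples. The function name and the step-size choice 1/t are assumptions made for illustration.

```python
def robbins_monro_mean(samples, theta0=0.0):
    """Robbins-Monro iteration for the root of E[X - theta] = 0."""
    theta = theta0
    for t, x in enumerate(samples, start=1):
        # Decreasing step sizes: their sum diverges while the sum of
        # their squares converges, the classical Robbins-Monro condition.
        gamma = 1.0 / t
        # Move theta along the noisy observation of the function whose
        # root is sought, here (x - theta).
        theta = theta + gamma * (x - theta)
    return theta


estimate = robbins_monro_mean([1.0, 2.0, 3.0])
```

With step size 1/t this particular iteration reduces to the running average of the samples, which makes the behaviour easy to check by hand.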


A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition

arXiv.org Machine Learning

We study large-scale kernel methods for acoustic modeling and compare them to DNNs on performance metrics related to both acoustic modeling and recognition. Measured by perplexity and frame-level classification accuracy, kernel-based acoustic models are as effective as their DNN counterparts. However, on token error rates DNN models can be significantly better. We have discovered that this might be attributed to the DNNs' unique strength in reducing both the perplexity and the entropy of the predicted posterior probabilities. Motivated by our findings, we propose a new technique, entropy regularized perplexity, for model selection. This technique can noticeably improve the recognition performance of both types of models and reduces the gap between them. While effective on Broadcast News, this technique could also be applicable to other tasks.
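The two quantities named in this abstract, perplexity and the entropy of the predicted posteriors, can be computed as in the sketch below. The abstract does not state how the two are combined, so `regularized_score` and its weight `lam` are hypothetical, shown only to illustrate the idea of a model-selection score that penalizes high-entropy predictions.

```python
import math


def perplexity(posteriors, labels):
    """exp of the average negative log-probability of the true labels."""
    nll = -sum(math.log(p[y]) for p, y in zip(posteriors, labels))
    return math.exp(nll / len(labels))


def mean_entropy(posteriors):
    """Average Shannon entropy (in nats) of the predicted distributions."""
    total = 0.0
    for p in posteriors:
        total += -sum(q * math.log(q) for q in p if q > 0.0)
    return total / len(posteriors)


def regularized_score(posteriors, labels, lam=1.0):
    # Assumption: log-perplexity plus a weighted entropy term.
    # The actual formula is not given in the abstract.
    return math.log(perplexity(posteriors, labels)) + lam * mean_entropy(posteriors)
```

Under this assumed form, a lower score favours models that are both accurate on the true labels and confident in their predictions.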


Mean-Field Inference in Gaussian Restricted Boltzmann Machine

arXiv.org Machine Learning

A Gaussian restricted Boltzmann machine (GRBM) is a Boltzmann machine defined on a bipartite graph and is an extension of the usual restricted Boltzmann machine. A GRBM consists of two different layers: a visible layer composed of continuous visible variables and a hidden layer composed of discrete hidden variables. In this paper, we derive two different inference algorithms for GRBMs based on the naive mean-field approximation (NMFA). One is an inference algorithm for whole variables in a GRBM, and the other is an inference algorithm for partial variables in a GRBM. We compare the two methods analytically and numerically and show that the latter method is better.


Machine Learning and Personal Genome Informatics Contribute to Happiness Sciences and Wellbeing Computing

AAAI Conferences

Two big recent revolutions, machine learning technologies such as “deep learning” in Artificial Intelligence (AI) and personal genome informatics in biomedical science, provide us with new opportunities for understanding human happiness. Our ongoing important challenges are to discover our own truly meaningful personal happiness with the aid of AI and personal genome technologies. We have been developing a personal genome information agent entitled MyFinder, which supports searching for our inherited talents and maximizes our potential for a meaningful life. In the MyFinder project, we have provided a crowd-sourced DIY (do it yourself) genomics research platform and conducted various “citizen science” projects in health and wellness. In this paper, we discuss how machine learning technologies and personal genome informatics might contribute to happiness sciences. We introduce the “Social Intelligence Genomics and Empathy-Building Study” and report the preliminary results of applying deep learning and six other machine learning algorithms to predict social intelligence levels from genetic profiles of nine SNPs. We discuss the possibilities and limitations of applying machine learning technologies for personal happiness trait prediction. We also discuss future AI challenges in the context of wellbeing computing.


Ensemble of Deep Convolutional Neural Networks for Learning to Detect Retinal Vessels in Fundus Images

arXiv.org Machine Learning

Regular screening for pathological conditions of the retina [1], [2] can greatly assist in preventing blindness. Fundus imaging is the most widely used modality for early screening and detection of blindness-causing diseases such as diabetic retinopathy, glaucoma, age-related macular degeneration [3], and hypertension- and stroke-induced changes [4]. Imaging of the fundus has largely improved with progress from film-based photography to electronic imaging sensors, as well as red-free imaging, stereo photography, hyperspectral imaging, angiography, etc. [5], thereby reducing inter- and intra-observer reporting variability. Retinal image analysis has also significantly contributed to this technological development [5], [6]. Since fundus imaging is predominantly used for the first level of abnormality screening, research focus includes: (i) detection and segmentation of retinal structures (vessels, fovea, optic disc), (ii) segmentation of abnormalities, and (iii) quality quantification of acquired images to assess reporting fitness [5]. Related Work: The process of clinical reporting of retinal abnormalities is systematic, and lesions are reported with respect to their location relative to vessels or the optic disc. Computer-assisted diagnosis systems are accordingly being developed to improve the clinical workflow [5].


Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks

arXiv.org Machine Learning

Deep learning algorithms have been shown to perform extremely well on many classical machine learning problems. However, recent studies have shown that deep learning, like other machine learning techniques, is vulnerable to adversarial samples: inputs crafted to force a deep neural network (DNN) to provide adversary-selected outputs. Such attacks can seriously undermine the security of the system supported by the DNN, sometimes with devastating consequences. For example, autonomous vehicles can be crashed, illicit or illegal content can bypass content filters, or biometric authentication systems can be manipulated to allow improper access. In this work, we introduce a defensive mechanism called defensive distillation to reduce the effectiveness of adversarial samples on DNNs. We analytically investigate the generalizability and robustness properties granted by the use of defensive distillation when training DNNs. We also empirically study the effectiveness of our defense mechanisms on two DNNs placed in adversarial settings. The study shows that defensive distillation can reduce the effectiveness of sample creation from 95% to less than 0.5% on a studied DNN. Such dramatic gains can be explained by the fact that distillation reduces the gradients used in adversarial sample creation by a factor of 10^30. We also find that distillation increases the average minimum number of features that need to be modified to create adversarial samples by about 800% on one of the DNNs we tested.
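The abstract does not describe the mechanism itself. Distillation is commonly implemented by training a second network on soft labels produced by a softmax evaluated at a raised temperature, and the sketch below shows only that ingredient. The function name, the example logits, and the temperature value 20 are illustrative assumptions, not values taken from the paper.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [10.0, 2.0, 1.0]  # hypothetical network outputs
hard = softmax_with_temperature(logits, temperature=1.0)
soft = softmax_with_temperature(logits, temperature=20.0)  # smoother distribution
```

Raising the temperature flattens the output distribution, so the soft labels carry information about the relative similarity of classes rather than a near one-hot vector.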


Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?

arXiv.org Machine Learning

Three important properties of a classification machinery are: (i) the system preserves the core information of the input data; (ii) the training examples convey information about unseen data; and (iii) the system is able to treat points from different classes differently. In this work we show that these fundamental properties are satisfied by the architecture of deep neural networks. We formally prove that these networks with random Gaussian weights perform a distance-preserving embedding of the data, with a special treatment for in-class and out-of-class data. Similar points at the input of the network are likely to have a similar output. The theoretical analysis of deep networks presented here exploits tools used in the compressed sensing and dictionary learning literature, thereby making a formal connection between these important topics. The derived results allow drawing conclusions on the metric learning properties of the network and their relation to its structure, as well as providing bounds on the required size of the training set such that the training examples faithfully represent the unseen data. The results are validated with state-of-the-art trained networks.


End-to-End Attention-based Large Vocabulary Speech Recognition

arXiv.org Artificial Intelligence

ABSTRACT Many of the current state-of-the-art Large Vocabulary Continuous Speech Recognition Systems (LVCSR) are hybrids of neural networks and Hidden Markov Models (HMMs). Most of these systems contain separate components that deal with the acoustic modelling, language modelling and sequence decoding. We investigate a more direct approach in which the HMM is replaced with a Recurrent Neural Network (RNN) that performs sequence prediction directly at the character level. Alignment between the input features and the desired character sequence is learned automatically by an attention mechanism built into the RNN. For each predicted character, the attention mechanism scans the input sequence and chooses relevant frames. We propose two methods to speed up this operation: limiting the scan to a subset of most promising frames and pooling over time the information contained in neighboring frames, thereby reducing source sequence length.

Index Terms: neural networks, LVCSR, attention, speech recognition, ASR

1. INTRODUCTION Deep neural networks have become popular acoustic models for state-of-the-art large vocabulary speech recognition systems (Hinton et al., 2012a). However, in these systems most of the other components, such as Hidden Markov Models (HMMs), Gaussian Mixture Models (GMMs) and n-gram language models, are the same as in their predecessors. These combinations of neural networks and statistical models are often referred to as hybrid systems.
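The two speed-ups mentioned in the abstract, restricting the attention scan to a subset of frames and pooling neighboring frames over time, can be sketched roughly as follows. This is an illustration under assumptions, not the paper's implementation: the window is taken to be a fixed span around a given center frame, pooling is taken to be a plain average, and all function and parameter names are hypothetical.

```python
import math


def windowed_attention(scores, center, half_width):
    """Softmax over scores inside [center - half_width, center + half_width].

    Frames outside the window receive zero weight and need not be scored.
    """
    lo = max(0, center - half_width)
    hi = min(len(scores), center + half_width + 1)
    m = max(scores[lo:hi])
    exps = [math.exp(s - m) for s in scores[lo:hi]]
    total = sum(exps)
    weights = [0.0] * len(scores)
    for i, e in enumerate(exps):
        weights[lo + i] = e / total
    return weights


def pool_over_time(frames, stride=2):
    """Average each run of `stride` consecutive frames to shorten the sequence."""
    pooled = []
    for start in range(0, len(frames), stride):
        group = frames[start:start + stride]
        dim = len(group[0])
        pooled.append([sum(f[d] for f in group) / len(group) for d in range(dim)])
    return pooled
```

Both steps reduce the number of frames the attention mechanism has to consider for each predicted character: the window bounds the scan, and pooling shortens the source sequence itself.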


Sequential Short-Text Classification with Recurrent and Convolutional Neural Networks

arXiv.org Machine Learning

Recent approaches based on artificial neural networks (ANNs) have shown promising results for short-text classification. However, many short texts occur in sequences (e.g., sentences in a document or utterances in a dialog), and most existing ANN-based systems do not leverage the preceding short texts when classifying a subsequent one. In this work, we present a model based on recurrent neural networks and convolutional neural networks that incorporates the preceding short texts. Our model achieves state-of-the-art results on three different datasets for dialog act prediction.