Goto

Collaborating Authors

 Deep Learning



Rate-coded Restricted Boltzmann Machines for Face Recognition

Neural Information Processing Systems

We describe a neurally-inspired, unsupervised learning algorithm that builds a nonlinear generative model for pairs of face images from the same individual. Individuals are then recognized by finding the highest relative probability pair among all pairs that consist of a test image and an image whose identity is known. Our method compares favorably with other methods in the literature. The generative model consists of a single layer of rate-coded, nonlinear feature detectors and it has the property that, given a data vector, the true posterior probability distribution over the feature detector activities can be inferred rapidly without iteration or approximation. The weights of the feature detectors are learned by comparing the correlations of pixel intensities and feature activations in two phases: When the network is observing real data and when it is observing reconstructions of real data generated from the feature activations.


Rate-coded Restricted Boltzmann Machines for Face Recognition

Neural Information Processing Systems

We describe a neurally-inspired, unsupervised learning algorithm that builds a nonlinear generative model for pairs of face images from the same individual. Individuals are then recognized by finding the highest relative probability pair among all pairs that consist of a test image and an image whose identity is known. Our method compares favorably with other methods in the literature. The generative model consists of a single layer of rate-coded, nonlinear feature detectors and it has the property that, given a data vector, the true posterior probability distribution over the feature detector activities can be inferred rapidly without iteration or approximation. The weights of the feature detectors are learned by comparing the correlations of pixel intensities and feature activations in two phases: When the network is observing real data and when it is observing reconstructions of real data generated from the feature activations.


Rate-coded Restricted Boltzmann Machines for Face Recognition

Neural Information Processing Systems

We describe a neurally-inspired, unsupervised learning algorithm that builds a nonlinear generative model for pairs of face images from the same individual. Individuals are then recognized by finding the highest relative probability pair among all pairs that consist of a test image and an image whose identity is known. Our method compares favorably with other methods in the literature. The generative model consists of a single layer of rate-coded, nonlinear feature detectors and it has the property that, given a data vector, the true posterior probability distribution over the feature detector activities can be inferred rapidly without iteration or approximation. The weights of the feature detectors are learned by comparing thecorrelations of pixel intensities and feature activations in two phases: When the network is observing real data and when it is observing reconstructions of real data generated from the feature activations.


Convergence of the Wake-Sleep Algorithm

Neural Information Processing Systems

The W-S (Wake-Sleep) algorithm is a simple learning rule for the models with hidden variables. It is shown that this algorithm can be applied to a factor analysis model which is a linear version of the Helmholtz machine. But even for a factor analysis model, the general convergence is not proved theoretically. In this article, we describe the geometrical understanding of the W-S algorithm in contrast with the EM (Expectation Maximization) algorithm and the em algorithm. As the result, we prove the convergence of the W-S algorithm for the factor analysis model. We also show the condition for the convergence in general models.


Convergence of the Wake-Sleep Algorithm

Neural Information Processing Systems

The W-S (Wake-Sleep) algorithm is a simple learning rule for the models with hidden variables. It is shown that this algorithm can be applied to a factor analysis model which is a linear version of the Helmholtz machine. But even for a factor analysis model, the general convergence is not proved theoretically. In this article, we describe the geometrical understanding of the W-S algorithm in contrast with the EM (Expectation Maximization) algorithm and the em algorithm. As the result, we prove the convergence of the W-S algorithm for the factor analysis model. We also show the condition for the convergence in general models.


Convergence of the Wake-Sleep Algorithm

Neural Information Processing Systems

The WS (Wake-Sleep) algorithm is a simple learning rule for the models with hidden variables. It is shown that this algorithm can be applied to a factor analysis model which is a linear version of the Helmholtz machine. Buteven for a factor analysis model, the general convergence is not proved theoretically. In this article, we describe the geometrical understanding ofthe WS algorithm in contrast with the EM (Expectation Maximization) algorithm and the em algorithm. As the result, we prove the convergence of the WS algorithm for the factor analysis model. We also show the condition for the convergence in general models.


A Solution for Missing Data in Recurrent Neural Networks with an Application to Blood Glucose Prediction

Neural Information Processing Systems

We consider neural network models for stochastic nonlinear dynamical systems where measurements of the variable of interest are only available at irregular intervals i.e. most realizations are missing. Difficulties arise since the solutions for prediction and maximum likelihood learning with missing data lead to complex integrals, which even for simple cases cannot be solved analytically. In this paper we propose a specific combination of a nonlinear recurrent neural predictive model and a linear error model which leads to tractable prediction and maximum likelihood adaptation rules. In particular, the recurrent neural network can be trained using the real-time recurrent learning rule and the linear error model can be trained by an EM adaptation rule, implemented using forward-backward Kalman filter equations. The model is applied to predict the glucose/insulin metabolism of a diabetic patient where blood glucose measurements are only available a few times a day at irregular intervals.


Recurrent Neural Networks Can Learn to Implement Symbol-Sensitive Counting

Neural Information Processing Systems

Recently researchers have derived formal complexity analysis of analog computation in the setting of discrete-time dynamical systems. As an empirical constrast, training recurrent neural networks (RNNs) produces self -organized systems that are realizations of analog mechanisms. Previous work showed that a RNN can learn to process a simple context-free language (CFL) by counting. Herein, we extend that work to show that a RNN can learn a harder CFL, a simple palindrome, by organizing its resources into a symbol-sensitive counting solution, and we provide a dynamical systems analysis which demonstrates how the network: can not only count, but also copy and store counting infonnation. 1 INTRODUCTION Several researchers have recently derived results in analog computation theory in the setting of discrete-time dynamical systems(Siegelmann, 1994; Maass & Opren, 1997; Moore, 1996; Casey, 1996). For example, a dynamical recognizer (DR) is a discrete-time continuous dynamical system with a given initial starting point and a finite set of Boolean output decision functions(pollack.


A Solution for Missing Data in Recurrent Neural Networks with an Application to Blood Glucose Prediction

Neural Information Processing Systems

We consider neural network models for stochastic nonlinear dynamical systems where measurements of the variable of interest are only available at irregular intervals i.e. most realizations are missing. Difficulties arise since the solutions for prediction and maximum likelihood learning with missing data lead to complex integrals, which even for simple cases cannot be solved analytically. In this paper we propose a specific combination of a nonlinear recurrent neural predictive model and a linear error model which leads to tractable prediction and maximum likelihood adaptation rules. In particular, the recurrent neural network can be trained using the real-time recurrent learning rule and the linear error model can be trained by an EM adaptation rule, implemented using forward-backward Kalman filter equations. The model is applied to predict the glucose/insulin metabolism of a diabetic patient where blood glucose measurements are only available a few times a day at irregular intervals.