Deep Learning
Grammar Transfer in a Second Order Recurrent Neural Network
Negishi, Michiro, Hanson, Stephen J.
Furthermore, this effect persists even when the new strings violate the syntactic rule slightly as long as they are similar to the old strings [1]. It has been shown in the past studies that recurrent neural networks also have the ability to generalize previously acquired knowledge to novel inputs. For instance, Dienes et al. ([2]) showed that a neural network can generalize abstract knowledge acquired in one domain to a new domain. They trained the network to predict the next input symbol in grammatical sequences in the first domain, and showed that the network was able to learn to predict grammatical sequences in the second domain more effectively than it would have learned them without the prior learning. During the training in the second domain, they had to freeze the weights of a part of the network to prevent catastrophic forgetting. They used this simulation paradigm to emulate and analyze domain transfer, effect of similarity between training and test sequences, and the effect of n-gram information in human data. Hanson et al. ([5]) also showed that a prior learning of a grammar facilitates the learning of a new grammar in the cases where either the syntax or the vocabulary was kept constant. In this study we investigate grammar transfer by a neural network, where both syntax and vocabularies are different from the source grammar to the target grammar. Unlike Dienes et al.'s network, all weights in the network are allowed to change dur- ing the learning of the target grammar, which allows us to investigate interference as well as transfer from the source grammar to the target grammar.
Reinforcement Learning with Long Short-Term Memory
This paper presents reinforcement learning with a Long Short Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(,x) learning and directed exploration can solve non-Markovian tasks with long-term dependencies between relevant events. This is demonstrated in a T-maze task, as well as in a difficult variation of the pole balancing task. 1 Introduction Reinforcement learning (RL) is a way of learning how to behave based on delayed reward signals [12]. Among the more important challenges for RL are tasks where part of the state of the environment is hidden from the agent. Such tasks are called non-Markovian tasks or Partially Observable Markov Decision Processes. Many real world tasks have this problem of hidden state. For instance, in a navigation task different positions in the environment may look the same, but one and the same action may lead to different next states or rewards. Thus, hidden state makes RL more realistic.
Grammar Transfer in a Second Order Recurrent Neural Network
Negishi, Michiro, Hanson, Stephen J.
Furthermore, this effect persists even when the new strings violate the syntactic rule slightly as long as they are similar to the old strings [1]. It has been shown in the past studies that recurrent neural networks also have the ability to generalize previously acquired knowledge to novel inputs. For instance, Dienes et al. ([2]) showed that a neural network can generalize abstract knowledge acquired in one domain to a new domain. They trained the network to predict the next input symbol in grammatical sequences in the first domain, and showed that the network was able to learn to predict grammatical sequences in the second domain more effectively than it would have learned them without the prior learning. During the training in the second domain, they had to freeze the weights of a part of the network to prevent catastrophic forgetting. They used this simulation paradigm to emulate and analyze domain transfer, effect of similarity between training and test sequences, and the effect of n-gram information in human data. Hanson et al. ([5]) also showed that a prior learning of a grammar facilitates the learning of a new grammar in the cases where either the syntax or the vocabulary was kept constant. In this study we investigate grammar transfer by a neural network, where both syntax and vocabularies are different from the source grammar to the target grammar. Unlike Dienes et al.'s network, all weights in the network are allowed to change dur- ing the learning of the target grammar, which allows us to investigate interference as well as transfer from the source grammar to the target grammar.
Reinforcement Learning with Long Short-Term Memory
This paper presents reinforcement learning with a Long Short Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(,x) learning and directed exploration can solve non-Markovian tasks with long-term dependencies between relevantevents. This is demonstrated in a T-maze task, as well as in a difficult variation of the pole balancing task. 1 Introduction Reinforcement learning (RL) is a way of learning how to behave based on delayed reward signals [12]. Among the more important challenges for RL are tasks where part of the state of the environment is hidden from the agent. Such tasks are called non-Markovian tasks or Partially Observable Markov Decision Processes. Many real world tasks have this problem of hidden state. For instance, in a navigation task different positions in the environment may look the same, but one and the same action may lead to different next states or rewards. Thus, hidden state makes RL more realistic.
Rate-coded Restricted Boltzmann Machines for Face Recognition
Teh, Yee Whye, Hinton, Geoffrey E.
We describe a neurally-inspired, unsupervised learning algorithm that builds a nonlinear generative model for pairs of face images from the same individual. Individuals are then recognized by finding the highest relative probability pair among all pairs that consist of a test image and an image whose identity is known. Our method compares favorably with other methods in the literature. The generative model consists of a single layer of rate-coded, nonlinear feature detectors and it has the property that, given a data vector, the true posterior probability distribution over the feature detector activities can be inferred rapidly without iteration or approximation. The weights of the feature detectors are learned by comparing the correlations of pixel intensities and feature activations in two phases: When the network is observing real data and when it is observing reconstructions of real data generated from the feature activations.
Rate-coded Restricted Boltzmann Machines for Face Recognition
Teh, Yee Whye, Hinton, Geoffrey E.
We describe a neurally-inspired, unsupervised learning algorithm that builds a nonlinear generative model for pairs of face images from the same individual. Individuals are then recognized by finding the highest relative probability pair among all pairs that consist of a test image and an image whose identity is known. Our method compares favorably with other methods in the literature. The generative model consists of a single layer of rate-coded, nonlinear feature detectors and it has the property that, given a data vector, the true posterior probability distribution over the feature detector activities can be inferred rapidly without iteration or approximation. The weights of the feature detectors are learned by comparing the correlations of pixel intensities and feature activations in two phases: When the network is observing real data and when it is observing reconstructions of real data generated from the feature activations.
Rate-coded Restricted Boltzmann Machines for Face Recognition
Teh, Yee Whye, Hinton, Geoffrey E.
We describe a neurally-inspired, unsupervised learning algorithm that builds a nonlinear generative model for pairs of face images from the same individual. Individuals are then recognized by finding the highest relative probability pair among all pairs that consist of a test image and an image whose identity is known. Our method compares favorably with other methods in the literature. The generative model consists of a single layer of rate-coded, nonlinear feature detectors and it has the property that, given a data vector, the true posterior probability distribution over the feature detector activities can be inferred rapidly without iteration or approximation. The weights of the feature detectors are learned by comparing thecorrelations of pixel intensities and feature activations in two phases: When the network is observing real data and when it is observing reconstructions of real data generated from the feature activations.
Convergence of the Wake-Sleep Algorithm
Ikeda, Shiro, Amari, Shun-ichi, Nakahara, Hiroyuki
The W-S (Wake-Sleep) algorithm is a simple learning rule for the models with hidden variables. It is shown that this algorithm can be applied to a factor analysis model which is a linear version of the Helmholtz machine. But even for a factor analysis model, the general convergence is not proved theoretically. In this article, we describe the geometrical understanding of the W-S algorithm in contrast with the EM (Expectation Maximization) algorithm and the em algorithm. As the result, we prove the convergence of the W-S algorithm for the factor analysis model. We also show the condition for the convergence in general models.
Convergence of the Wake-Sleep Algorithm
Ikeda, Shiro, Amari, Shun-ichi, Nakahara, Hiroyuki
The W-S (Wake-Sleep) algorithm is a simple learning rule for the models with hidden variables. It is shown that this algorithm can be applied to a factor analysis model which is a linear version of the Helmholtz machine. But even for a factor analysis model, the general convergence is not proved theoretically. In this article, we describe the geometrical understanding of the W-S algorithm in contrast with the EM (Expectation Maximization) algorithm and the em algorithm. As the result, we prove the convergence of the W-S algorithm for the factor analysis model. We also show the condition for the convergence in general models.