Phase-Space Learning
Tsung, Fu-Sheng, Cottrell, Garrison W.
Existing recurrent net learning algorithms are inadequate. We introduce the conceptual framework of viewing recurrent training as matching vector fields of dynamical systems in phase space. Phase-space reconstruction techniques make the hidden states explicit, reducing temporal learning to a feedforward problem. In short, we propose viewing iterated prediction [LF88] as the best way of training recurrent networks on deterministic signals. Using this framework, we can train multiple trajectories, ensure their stability, and design arbitrary dynamical systems.
1 INTRODUCTION
Existing general-purpose recurrent algorithms are capable of rich dynamical behavior. Unfortunately, straightforward applications of these algorithms to training fully recurrent networks on complex temporal tasks have had much less success than their feedforward counterparts. For example, to train a recurrent network to oscillate like a sine wave (the "hydrogen atom" of recurrent learning), existing techniques such as Real-Time Recurrent Learning (RTRL) [WZ89] perform suboptimally. Williams & Zipser trained a two-unit network with RTRL, with one teacher signal. One unit of the resulting network showed a distorted waveform, the other only half the desired amplitude.
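As an illustration of the framework described above, the following Python fragment is a minimal sketch (not code from the paper): delay-coordinate reconstruction turns one-step prediction of a scalar signal into a feedforward regression problem, and iterated prediction feeds the model's outputs back to generate a trajectory. A linear least-squares predictor stands in for the feedforward network, and the embedding dimension and sampling step are illustrative choices.
```python
import numpy as np

def delay_embed(x, dim, lag=1):
    """Stack delayed copies of x into phase-space points (one per row)."""
    n = len(x) - (dim - 1) * lag
    return np.column_stack([x[i * lag : i * lag + n] for i in range(dim)])

t = np.arange(0, 40, 0.1)
x = np.sin(t)                              # the "hydrogen atom" training signal

dim = 2
states = delay_embed(x, dim)               # rows are (x[k], x[k+1])
inputs, targets = states[:-1], x[dim:]     # map current point -> next sample

# Stand-in for the trained feedforward net: least squares with a bias term.
A = np.column_stack([inputs, np.ones(len(inputs))])
w, *_ = np.linalg.lstsq(A, targets, rcond=None)

# Iterated prediction: the model's own output becomes part of the next input.
state, rollout = inputs[0].copy(), []
for _ in range(300):
    nxt = np.append(state, 1.0) @ w
    rollout.append(nxt)
    state = np.append(state[1:], nxt)      # slide the delay window forward

print("max rollout error vs. true sine:", np.max(np.abs(rollout - x[dim:dim + 300])))
```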
Non-Linear Dimensionality Reduction
DeMers, David, Cottrell, Garrison W.
A method for creating a nonlinear encoder-decoder for multidimensional data with compact representations is presented. The commonly used technique of autoassociation is extended to allow nonlinear representations, and an objective function which penalizes activations of individual hidden units is shown to result in minimum-dimensional encodings with respect to allowable error in reconstruction.
1 INTRODUCTION
Reducing the dimensionality of data with minimal information loss is important for feature extraction, compact coding, and computational efficiency. The data can be transformed into "good" representations for further processing, constraints among feature variables may be identified, and redundancy eliminated. Many algorithms are exponential in the dimensionality of the input, so even a reduction by a single dimension may provide valuable computational savings. Autoassociating feedforward networks with one hidden layer have been shown to extract the principal components of the data (Baldi & Hornik, 1988). Such networks have been used to extract features and develop compact encodings of the data (Cottrell, Munro & Zipser, 1989). Principal Components Analysis projects the data into a linear subspace.
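One plausible form of the penalized objective described above, sketched in Python (an assumption for illustration, not the paper's exact formulation): reconstruction error plus a term that penalizes the activation spread of each bottleneck unit, so that units driven toward constant activation carry no information and can be pruned, leaving a minimum-dimensional encoding within the allowed reconstruction error. The `encode`/`decode` callables and the trade-off weight `lam` are placeholders for the nonlinear halves of a trained network.
```python
import numpy as np

def penalized_loss(X, encode, decode, lam=0.1):
    """Reconstruction error plus a per-unit activation penalty on the bottleneck."""
    H = encode(X)                               # bottleneck activations, shape (n, k)
    X_hat = decode(H)
    recon = np.mean(np.sum((X - X_hat) ** 2, axis=1))
    unit_penalty = np.sum(np.var(H, axis=0))    # spread of each individual hidden unit
    return recon + lam * unit_penalty

# Trivial usage check with identity encode/decode (no compression, zero recon error):
X = np.random.default_rng(0).normal(size=(100, 5))
print(penalized_loss(X, encode=lambda x: x, decode=lambda h: h))
```
Units whose activation variance is driven near zero by the penalty can be removed without increasing reconstruction error, which is one way such an objective yields a minimal encoding dimension.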
EMPATH: Face, Emotion, and Gender Recognition Using Holons
Cottrell, Garrison W., Metcalfe, Janet
The network is trained to simply reproduce its input, and so can be seen as a nonlinear version of Kohonen's (1977) auto-associator. However, it must do so through a narrow channel of hidden units, so it must extract regularities from the images during learning. Empirical analysis of the trained network showed that the hidden units span the principal subspace of the image vectors, with some noise on the first principal component due to network nonlinearity (Cottrell & Munro, 1988).
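A small numerical sketch of the cited principal-subspace result (Baldi & Hornik, 1988; Cottrell & Munro, 1988), using a linear autoassociator on synthetic data rather than the face images of the paper; the data dimensions, learning rate, and iteration count below are arbitrary choices for illustration.
```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 10, 3
X = rng.normal(size=(n, k)) @ rng.normal(size=(k, d)) + 0.01 * rng.normal(size=(n, d))
X -= X.mean(axis=0)                       # work with centered data

W_enc = 0.1 * rng.normal(size=(d, k))     # input -> hidden
W_dec = 0.1 * rng.normal(size=(k, d))     # hidden -> output
lr = 0.01
for _ in range(3000):                     # gradient descent on ||X W_enc W_dec - X||^2
    H = X @ W_enc
    R = H @ W_dec - X                     # reconstruction residual
    grad_dec = H.T @ R / n
    grad_enc = X.T @ (R @ W_dec.T) / n
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

# Principal subspace of the data from its SVD.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:k]                                # top-k principal directions (rows)

# Compare subspaces: principal angles near 0 mean the spans coincide.
Q_net, _ = np.linalg.qr(W_enc)            # orthonormal basis for span of hidden weights
Q_pca, _ = np.linalg.qr(P.T)
angles = np.arccos(np.clip(np.linalg.svd(Q_net.T @ Q_pca, compute_uv=False), -1, 1))
print("principal angles (radians):", np.round(angles, 4))
```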