Goto

Collaborating Authors

 Technology


Dynamically-Adaptive Winner-Take-All Networks

Neural Information Processing Systems

Unfortunately, convergence of normal WT A networks is extremely sensitive to the magnitudes of their weights, which must be hand-tuned and which generally only provide the right amount of inhibition across a relatively small range of initial conditions. This paper presents Dynamjcally Adaptive Winner-Telke-All (DA WTA) netw rls, which use a regulatory unit to provide the competitive inhibition to the units in the network. The DA WT A regulatory unit dynamically adjusts its level of activation during competition to provide the right amount of inhibition to differentiate between competitors and drive a single winner. This dynamic adaptation allows DA WT A networks to perform the winner-lake-all function for nearly any network size or initial condition.


Green's Function Method for Fast On-Line Learning Algorithm of Recurrent Neural Networks

Neural Information Processing Systems

The two well known learning algorithms of recurrent neural networks are the back-propagation (Rumelhart & el al., Werbos) and the forward propagation (Williams and Zipser). The main drawback of back-propagation is its off-line backward path in time for error cumulation. This violates the online requirement in many practical applications. Although the forward propagation algorithm can be used in an online manner, the annoying drawback is the heavy computation load required to update the high dimensional sensitivity matrix (0( fir) operations for each time step). Therefore, to develop a fast forward algorithm is a challenging task.


Operators and curried functions: Training and analysis of simple recurrent networks

Neural Information Processing Systems

We present a framework for programming tbe bidden unit representations of simple recurrent networks based on the use of hint units (additional targets at the output layer). We present two ways of analysing a network trained within this framework: Input patterns act as operators on the information encoded by the context units; symmetrically, patterns of activation over tbe context units act as curried functions of the input sequences. Simulations demonstrate that a network can learn to represent three different functions simultaneously and canonical discriminant analysis is used to investigate bow operators and curried functions are represented in the space of bidden unit activations.


Extracting and Learning an Unknown Grammar with Recurrent Neural Networks

Neural Information Processing Systems

We show that similar methods are appropriate for learning unknown grammars from examples of their strings. TIle training algorithm is an incremental real-time, recurrent learning (RTRL) method that computes the complete gradient and updates the weights at the end of each string.


Induction of Finite-State Automata Using Second-Order Recurrent Networks

Neural Information Processing Systems

By a method of heuristic search over the space of finite state automata with up to eight states, he was able to induce a recognizer for each of these languages (Tomita, 1982). Recognizers of finite-state languages have also been induced using first-order recurrent connectionist networks (Elman, 1990; Williams and Zipser, 1988; Cleeremans, Servan-Schreiber and McClelland, 1989). Generally speaking, these results were obtained by training the network to predict the next symbol (Cleeremans, Servan-Schreiber and McClelland, 1989; Williams and Zipser, 1988), rather than by training the network to accept or reject strings of different.lengths. Several training algorithms used an approximation to the gradient (Elman, 1990; Cleeremans, Servan-Schreiber and McClelland, 1989) by truncating the computation of the backward recurrence. The problem of inducing languages from examples has also been approached using second-order recurrent networks (Pollack, 1990; Giles et al., 1990). Using a truncated approximation to the gradient, and Tomita's training sets, Pollack reported that "none of the ideal languages were induced" (Pollack, 1990). On the other hand, a Tomita language has been induced using the complete gradient (Giles et al., 1991). This paper reports the induction of several Tomita languages and the extraction of the corresponding automata with certain differences in method from (Giles et al., 1991).


Recurrent Networks and NARMA Modeling

Neural Information Processing Systems

There exist large classes of time series, such as those with nonlinear moving average components, that are not well modeled by feedforward networks or linear models, but can be modeled by recurrent networks. We show that recurrent neural networks are a type of nonlinear autoregressive-moving average (N ARMA) model. Practical ability will be shown in the results of a competition sponsored by the Puget Sound Power and Light Company, where the recurrent networks gave the best performance on electric load forecasting. 1 Introduction This paper will concentrate on identifying types of time series for which a recurrent network provides a significantly better model, and corresponding prediction, than a feedforward network. Our main interest is in discrete time series that are parsimoniously modeled by a simple recurrent network, but for which, a feedforward neural network is highly non-parsimonious by virtue of requiring an infinite amount of past observations as input to achieve the same accuracy in prediction. Our approach is to consider predictive neural networks as stochastic models.


Network Model of State-Dependent Sequencing

Neural Information Processing Systems

A network model with temporal sequencing and state-dependent modulatory features is described. The model is motivated by neurocognitive data characterizing different states of waking and sleeping. Computer studies demonstrate how unique states of sequencing can exist within the same network under different aminergic and cholinergic modulatory influences. Relationships between state-dependent modulation, memory, sequencing and learning are discussed.


Induction of Multiscale Temporal Structure

Neural Information Processing Systems

Learning structure in temporally-extended sequences is a difficult computational problem because only a fraction of the relevant information is available at any instant. Although variants of back propagation can in principle be used to find structure in sequences, in practice they are not sufficiently powerful to discover arbitrary contingencies, especially those spanning long temporal intervals or involving high order statistics. For example, in designing a connectionist network for music composition, we have encountered the problem that the net is able to learn musical structure that occurs locally in time-e.g., relations among notes within a musical phrase-but not structure that occurs over longer time periods--e.g., relations among phrases. To address this problem, we require a means of constructing a reduced deacription of the sequence that makes global aspects more explicit or more readily detectable. I propose to achieve this using hidden units that operate with different time constants.


HARMONET: A Neural Net for Harmonizing Chorales in the Style of J. S. Bach

Neural Information Processing Systems

The chord skeleton is obtained if eighth and sixteenth notes are viewed as omitable ornamentations. Furthermore, if the chords are conceived as harmonies with certain attributes such as "inversion" or "characteristic dissonances", the chorale is reducible to its harmonic skeleton, a thoroughbass-like representation (Figure 2).