Modeling Time Varying Systems Using Hidden Control Neural Architecture

Neural Information Processing Systems

This paper introduces a generalization of the layered neural network that can implement a time-varying nonlinear mapping between its observable input and output. The variation of the network's mapping is due to an additional, hidden control input, while the network parameters themselves remain unchanged. We propose an algorithm for finding the network parameters and the hidden control sequence from a training set of examples of observable input and output. This algorithm implements an approximate maximum likelihood estimation of the parameters of an equivalent statistical model, when only the dominant control sequence is taken into account. The conceptual difference between the proposed model and the HMM is as follows: in the HMM approach, the observable data in each state are modeled as though they were produced by a memoryless source, and a parametric description of this source is obtained during training; in the proposed model, the observations in each state are produced by a nonlinear dynamical system driven by noise, and both the parametric form of the dynamics and the noise are estimated. The performance of the model is illustrated on the tasks of nonlinear time-varying system modeling and continuously spoken digit recognition. The reported results show the potential of this model for providing high-performance speech recognition capability.
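A minimal numpy sketch of the idea may help: a one-hidden-layer network whose mapping is modulated by a discrete hidden control value appended to the observable input as a one-hot vector, trained by alternating between (a) picking the control value that best explains each output and (b) a gradient step on the weights. The paper estimates the dominant control sequence over whole segments, so this greedy per-frame choice is a simplification; all names and sizes below are illustrative assumptions.

```python
import numpy as np

N_CTRL, D_IN, D_HID, D_OUT = 4, 3, 16, 1
rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.1, (D_HID, D_IN + N_CTRL))
W2 = rng.normal(0, 0.1, (D_OUT, D_HID))

def forward(x, c):
    # the hidden control enters as an extra one-hot input; weights are shared
    z = np.concatenate([x, np.eye(N_CTRL)[c]])
    h = np.tanh(W1 @ z)
    return W2 @ h, h, z

def train_step(x, y, lr=0.01):
    global W1, W2
    # (a) choose the control value with the smallest squared output error
    errs = [np.sum((forward(x, c)[0] - y) ** 2) for c in range(N_CTRL)]
    c = int(np.argmin(errs))
    # (b) one gradient-descent step on the weights under that control
    y_hat, h, z = forward(x, c)
    e = y_hat - y
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer((W2.T @ e) * (1 - h ** 2), z)
    return c, float(np.sum(e ** 2))
```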


Associative Memory in a Network of `Biological' Neurons

Neural Information Processing Systems

The Hopfield network (Hopfield, 1982, 1984) provides a simple model of an associative memory in a neuronal structure. This model, however, rests on highly artificial assumptions, most notably the use of formal two-state neurons (Hopfield, 1982) or graded-response neurons (Hopfield, 1984).


Spherical Units as Dynamic Consequential Regions: Implications for Attention, Competition and Categorization

Neural Information Processing Systems

Spherical units can be used to construct dynamic, reconfigurable consequential regions, the geometric basis for Shepard's (1987) theory of stimulus generalization in animals and humans. From Shepard's (1987) generalization theory we derive a particular multi-layer network with dynamic (centers and radii) spherical regions that possesses a specific mass function (Cauchy). This learning model generalizes the configural-cue network model (Gluck & Bower, 1988): (1) configural cues can be learned and do not require pre-wiring the power set of cues, (2) consequential regions are continuous rather than discrete, and (3) competition amongst receptive fields is shown to be increased by the global extent of this particular mass function (Cauchy). We compare it with other common mass functions (Gaussian, as used in the models of Moody & Darken, 1989, and Kruschke, 1990) and with standard backpropagation networks with hyperplane/logistic hidden units, showing that neither fares as well as a model of human generalization and learning.
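The contrast between the two mass functions is easy to see numerically. Below is a small sketch (function names are illustrative, not the paper's notation) of a spherical unit with a dynamic center and radius under each form: the Cauchy response decays polynomially with distance, so every unit retains some global influence and hence competes everywhere, while the Gaussian response decays exponentially and is effectively local.

```python
import numpy as np

def cauchy_unit(x, center, radius):
    d2 = np.sum((x - center) ** 2) / radius ** 2
    return 1.0 / (1.0 + d2)          # heavy polynomial tail: global extent

def gaussian_unit(x, center, radius):
    d2 = np.sum((x - center) ** 2) / radius ** 2
    return np.exp(-d2)               # exponential tail: effectively local

x, c = np.array([3.0, 0.0]), np.zeros(2)
print(cauchy_unit(x, c, 1.0))    # ~0.1   -- still responsive far away
print(gaussian_unit(x, c, 1.0))  # ~1e-4  -- nearly silent far away
```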


Kohonen Networks and Clustering: Comparative Performance in Color Clustering

Neural Information Processing Systems

"vector quantization", and "unsupervised learning" are all words which descn'be the same process: assigning a few exemplars to represent a large set of samples. Perfonning that process is the subject of a substantial body of literature. In this paper, we are concerned with the comparison of various clustering techniques to a particular, practical application: color clustering. The color clustering problem is as follows: an image is recorded in full color -- that is, three components, RED, GREEN, and BLUE, each of which has been measured to 8 bits of precision. Thus, each pixel is a 24 bit quantity. We must find a representation in which 2563 possible colors are represented by only 8 bits per pixel. That is, for a problem with 256000 variables (512 x 512) variables, assign each variable to one of only 256 classes. The color clustering problem is currently of major economic interest since millions of display systems are sold each year which can only store 8 bits per pixel, but on which users would like to be able to display "true" color (or at least as near true color as possible). In this study, we have approached the problem using the standard techniques from the literature (including k-means -- ISODATA clustering[1,3,61, LBG[4]), competitive learning (referred to as CL herein) [2], and Kohonen feature maps [5,7,9].



CAM Storage of Analog Patterns and Continuous Sequences with 3N² Weights

Neural Information Processing Systems

A simple architecture and algorithm for analytically guaranteed associative memory storage of analog patterns, continuous sequences, and chaotic attractors in the same network is described. A matrix inversion determines network weights, given prototype patterns to be stored.
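To make the "matrix inversion determines network weights" step concrete, here is the generic pseudoinverse route to associative weights: with prototype analog patterns as the columns of X, the weight matrix W = X X⁺ makes every prototype an exact fixed point of the map x → W x, so weights are solved for directly rather than learned iteratively. This sketch illustrates that general idea only; it is not the paper's specific 3N²-weight normal-form architecture.

```python
import numpy as np

N, P = 8, 3                                  # N units, P stored patterns
rng = np.random.default_rng(0)
X = rng.normal(size=(N, P))                  # analog prototype patterns
W = X @ np.linalg.pinv(X)                    # one matrix (pseudo)inversion

assert np.allclose(W @ X, X)                 # each prototype is a fixed point

# a noisy cue is projected back toward the stored-pattern subspace
cue = X[:, 0] + 0.1 * rng.normal(size=N)
recalled = W @ cue
```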


Note on Learning Rate Schedules for Stochastic Optimization

Neural Information Processing Systems

We present and compare learning rate schedules for stochastic gradient descent, a general algorithm that includes LMS, online backpropagation, and k-means clustering as special cases. We introduce "search-then-converge" type schedules which outperform the classical constant and "running average" (1/t) schedules in both speed of convergence and quality of solution.
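The three schedule families can be sketched in a few lines. The search-then-converge form below is the canonical η(t) = η₀ / (1 + t/τ): roughly constant for t ≪ τ (the "search" phase), then decaying like τη₀/t for t ≫ τ (the "converge" phase, matching the asymptotic 1/t rate). The particular values of η₀ and τ are tunable assumptions.

```python
def constant(t, eta0=0.1):
    return eta0

def running_average(t, eta0=0.1):
    return eta0 / (t + 1)                # the classical 1/t schedule

def search_then_converge(t, eta0=0.1, tau=100.0):
    return eta0 / (1.0 + t / tau)        # flat early, ~tau*eta0/t late

for t in (0, 10, 100, 1000):
    print(t, constant(t), running_average(t), search_then_converge(t))
```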


Learning by Combining Memorization and Gradient Descent

Neural Information Processing Systems

We have created a radial basis function network that allocates a new computational unit whenever an unusual pattern is presented to the network. The network learns by allocating new units and adjusting the parameters of existing units. If the network performs poorly on a presented pattern, a new unit is allocated which memorizes the response to that pattern. If the network performs well on a presented pattern, the network parameters are updated using standard LMS gradient descent. For predicting the Mackey-Glass chaotic time series, our network learns much faster than networks using back-propagation, while using a comparable number of synapses.
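The allocate-or-descend rule described above can be sketched as follows: allocate a unit that memorizes the current pattern when the network is both wrong (large error) and far from all existing units (novel input); otherwise nudge the existing parameters with an LMS-style gradient step. The Gaussian unit form, the width heuristic, and both thresholds here are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

centers, widths, heights, bias = [], [], [], 0.0

def predict(x):
    y = bias
    for c, w, h in zip(centers, widths, heights):
        y += h * np.exp(-np.sum((x - c) ** 2) / w ** 2)
    return y

def train_step(x, y, err_thresh=0.1, dist_thresh=0.5, lr=0.05):
    global bias
    err = y - predict(x)
    dists = [np.linalg.norm(x - c) for c in centers]
    if abs(err) > err_thresh and (not dists or min(dists) > dist_thresh):
        # memorize: a new unit centered on x whose height cancels the error,
        # sized by the distance to the nearest existing unit
        centers.append(x.copy())
        widths.append(min(dists) if dists else dist_thresh)
        heights.append(err)
    else:
        # LMS gradient step on the existing units' heights and the bias
        for i, (c, w) in enumerate(zip(centers, widths)):
            heights[i] += lr * err * np.exp(-np.sum((x - c) ** 2) / w ** 2)
        bias += lr * err
```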


The Tempo 2 Algorithm: Adjusting Time-Delays By Supervised Learning

Neural Information Processing Systems

In this work we describe a new method that automatically adjusts time delays and the widths of time windows in artificial neural networks. The inputs to the units are weighted by a Gaussian input window over time, which allows the learning rules for the delays and widths to be derived in the same way as for the weights. Our results on a phoneme classification task compare well with results obtained with the TDNN by Waibel et al., which was manually optimized for the same task.
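A small sketch of the Gaussian input window: a connection sees a weighted sum of the input's recent history, with the weighting Gaussian centered at a learnable delay d and having a learnable width σ. Because the window is differentiable in d and σ, both receive gradient updates of the same form as ordinary weight updates. The variable names and the discrete history buffer are assumptions for illustration.

```python
import numpy as np

def windowed_input(x_hist, d, sigma):
    """x_hist[t] is the input t steps in the past (t = 0..T-1)."""
    t = np.arange(len(x_hist))
    g = np.exp(-((t - d) ** 2) / (2 * sigma ** 2))  # Gaussian time window
    return np.sum(g * x_hist), g, t

def delay_width_grads(x_hist, d, sigma, upstream):
    """Loss gradients wrt d and sigma, derived exactly as for a weight,
    given the upstream error signal at this connection."""
    _, g, t = windowed_input(x_hist, d, sigma)
    dg_dd = g * (t - d) / sigma ** 2
    dg_dsigma = g * (t - d) ** 2 / sigma ** 3
    return (upstream * np.sum(dg_dd * x_hist),
            upstream * np.sum(dg_dsigma * x_hist))
```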


Translating Locative Prepositions

Neural Information Processing Systems

The features used in the spatial representations were abstracted from Herskovits (1986). The network was trained using the generalized delta rule (Rumelhart, Hinton, and Williams, 1986) on a set of patterns with four components, three syntactic and one semantic. The syntactic components are a pair of nouns separated by a locative preposition [N1-LP-N2], and the semantic component is a representation of the spatial relationship [SR].
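One plausible rendering of this training setup, with all encodings, vocabulary sizes, and dimensions being illustrative assumptions: each pattern pairs a one-hot-encoded [N1-LP-N2] triple with a target vector of spatial-relation features [SR], and a one-hidden-layer network is trained with the generalized delta rule (plain backpropagation).

```python
import numpy as np

N_NOUN, N_PREP, N_SR, N_HID = 10, 5, 6, 12    # illustrative sizes
rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.1, (N_HID, 2 * N_NOUN + N_PREP))
W2 = rng.normal(0, 0.1, (N_SR, N_HID))

def encode(n1, lp, n2):
    # one-hot slots for noun 1, the locative preposition, and noun 2
    v = np.zeros(2 * N_NOUN + N_PREP)
    v[n1] = v[N_NOUN + lp] = v[N_NOUN + N_PREP + n2] = 1.0
    return v

def delta_rule_step(x, sr_target, lr=0.1):
    global W1, W2
    h = np.tanh(W1 @ x)
    y = 1 / (1 + np.exp(-(W2 @ h)))           # sigmoid SR feature outputs
    d_out = (y - sr_target) * y * (1 - y)     # generalized delta rule
    W2 -= lr * np.outer(d_out, h)
    W1 -= lr * np.outer((W2.T @ d_out) * (1 - h ** 2), x)
```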