Multi-Layer Perceptrons with B-Spline Receptive Field Functions

Neural Information Processing Systems

Multi-layer perceptrons are often slow to learn nonlinear functions with complex local structure due to the global nature of their function approximations. It is shown that standard multi-layer perceptrons are actually a special case of a more general network formulation that incorporates B-splines into the node computations. This allows novel spline network architectures to be developed that can combine the generalization capabilities and scaling properties of global multi-layer feedforward networks with the computational efficiency and learning speed of local computational paradigms. Simulation results are presented for the well-known spiral problem of Wieland and of Lang and Witbrock to show the effectiveness of the Spline Net approach.


Connectionist Implementation of a Theory of Generalization

Neural Information Processing Systems

Empirically, generalization between a training and a test stimulus falls off in close approximation to an exponential decay function of distance between the two stimuli in the "stimulus space" obtained by multidimensional scaling. Mathematically, this result is derivable from the assumption that an individual takes the training stimulus to belong to a "consequential" region that includes that stimulus but is otherwise of unknown location, size, and shape in the stimulus space (Shepard, 1987). As the individual gains additional information about the consequential region (by finding other stimuli to be consequential or not), the theory predicts the shape of the generalization function to change toward the function relating actual probability of the consequence to location in the stimulus space. This paper describes a natural connectionist implementation of the theory, and illustrates how implications of the theory for generalization, discrimination, and classification learning can be explored by connectionist simulation.

1 THE THEORY OF GENERALIZATION

Because we never confront exactly the same situation twice, anything we have learned in any previous situation can guide us in deciding which action to take in the present situation only to the extent that the similarity between the two situations is sufficient to justify generalization of our previous learning to the present situation. Accordingly, principles of generalization must be foundational for any theory of behavior. In Shepard (1987) nonarbitrary principles of generalization were sought that would be optimum in any world in which an object, however distinct from other objects, is generally a member of some class or natural kind sharing some dispositional property of potential consequence for the individual.
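
The link between an unknown consequential region and exponential decay can be checked numerically in a one-dimensional toy setting. The sketch below is illustrative only and is not taken from the paper: it assumes the region is an interval whose position is uniform among intervals containing the training stimulus, and whose size follows an Erlang density with shape 2 (chosen here because the resulting integral has a closed form). The function name `generalization` and the rate parameter `k` are hypothetical.

```python
import numpy as np

def generalization(distance, k=1.0, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P(test stimulus is in the region).

    Assumptions (not from the source): 1-D stimulus space, interval-shaped
    region containing the training stimulus, Erlang(shape=2, rate=k) sizes.
    """
    rng = np.random.default_rng(seed)
    sizes = rng.gamma(shape=2.0, scale=1.0 / k, size=n_samples)
    # For an interval of length s placed uniformly around the training
    # stimulus, a point at the given distance is covered with
    # probability max(0, 1 - distance / s).
    covered = np.clip(1.0 - distance / sizes, 0.0, None)
    return covered.mean()

for d in (0.0, 0.5, 1.0, 2.0):
    print(f"d={d:.1f}  simulated={generalization(d):.3f}  exp(-d)={np.exp(-d):.3f}")
```

Under these assumptions the average over region sizes works out analytically to exp(-k d), so the simulated column should track the exponential column closely.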


Spherical Units as Dynamic Consequential Regions: Implications for Attention, Competition and Categorization

Neural Information Processing Systems

Spherical Units can be used to construct dynamic reconfigurable consequential regions, the geometric bases for Shepard's (1987) theory of stimulus generalization in animals and humans. We derive from Shepard's (1987) generalization theory a particular multi-layer network with dynamic (centers and radii) spherical regions which possesses a specific mass function (Cauchy). This learning model generalizes the configural-cue network model (Gluck & Bower, 1988): (1) configural cues can be learned and do not require pre-wiring the power-set of cues, (2) consequential regions are continuous rather than discrete, and (3) competition amongst receptive fields is shown to be increased by the global extent of a particular mass function (Cauchy). We compare other common mass functions (Gaussian, used in the models of Moody & Darken, 1989, and Kruschke, 1990) or just standard backpropagation networks with hyperplane/logistic hidden units, showing that neither fares as well as models of human generalization and learning.
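
To make the phrase "global extent" concrete, the sketch below compares how quickly a Gaussian and a Cauchy receptive field fall off with distance from the unit's center. The exact functional forms and parameterization used in the paper are not given in the abstract, so the textbook forms below, and the names `gaussian_mass` and `cauchy_mass`, are assumptions.

```python
import numpy as np

def gaussian_mass(distance, radius):
    # Assumed form: unnormalized Gaussian, peak value 1 at the center.
    return np.exp(-0.5 * (distance / radius) ** 2)

def cauchy_mass(distance, radius):
    # Assumed form: unnormalized Cauchy (Lorentzian), peak value 1 at the center.
    return 1.0 / (1.0 + (distance / radius) ** 2)

distances = np.array([0.0, 1.0, 2.0, 5.0])
gauss = gaussian_mass(distances, 1.0)
cauchy = cauchy_mass(distances, 1.0)
for d, g, c in zip(distances, gauss, cauchy):
    print(f"d={d:.0f}  gaussian={g:.6f}  cauchy={c:.6f}")
```

At five radii from the center the Cauchy unit still responds at roughly 4% of its peak, while the Gaussian response is effectively zero, so distant Cauchy units continue to overlap and compete.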



An Attractor Neural Network Model of Recall and Recognition

Neural Information Processing Systems

This work presents an Attractor Neural Network (ANN) model of Recall and Recognition. It is shown that an ANN model can qualitatively account for a wide range of experimental psychological data pertaining to these two main aspects of memory access. Certain psychological phenomena are accounted for, including the effects of list-length, word frequency, presentation time, context shift, and aging. Thereafter, the probabilities of successful Recall and Recognition are estimated, in order to possibly enable further quantitative examination of the model.

1 Motivation

The goal of this paper is to demonstrate that a Hopfield-based [Hop82] ANN model can qualitatively account for a wide range of experimental psychological data pertaining to the two main aspects of memory access, Recall and Recognition. Recall is defined as the ability to retrieve an item from a list of items (words) originally presented during a previous learning phase, given an appropriate cue (cued Recall), or spontaneously (free Recall). Recognition is defined as the ability to successfully acknowledge that a certain item has or has not appeared in the tutorial list learned before. The main prospect of ANN modeling is that some parameter values, that in former, 'classical' models of memory retrieval (see e.g.
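
As background for the Hopfield-based setting, the following is a minimal sketch of cued recall in a plain Hopfield network: patterns are stored with a Hebbian outer-product rule, and a corrupted cue is iterated until it settles. It is a generic textbook construction, not the paper's model; the network size, number of patterns, noise level, and the helper names `store` and `recall` are all assumptions.

```python
import numpy as np

def store(patterns):
    """Hebbian outer-product weights for +/-1 patterns, no self-connections."""
    n_units = patterns.shape[1]
    weights = patterns.T @ patterns / n_units
    np.fill_diagonal(weights, 0.0)
    return weights

def recall(weights, cue, n_steps=5):
    """Synchronous sign updates starting from the cue."""
    state = cue.copy()
    for _ in range(n_steps):
        state = np.where(weights @ state >= 0, 1, -1)
    return state

rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(3, 200))   # three stored "items"
weights = store(patterns)

cue = patterns[0].copy()
flipped = rng.choice(200, size=20, replace=False)
cue[flipped] *= -1                              # corrupt 10% of the cue

recalled = recall(weights, cue)
overlap = float(recalled @ patterns[0]) / 200   # 1.0 means perfect recall
print(f"overlap with stored item: {overlap:.2f}")
```

With so few patterns relative to the number of units, the corrupted cue lies well inside the basin of attraction of the stored item.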


Direct memory access using two cues: Finding the intersection of sets in a connectionist model

Neural Information Processing Systems

For lack of alternative models, search and decision processes have provided the dominant paradigm for human memory access using two or more cues, despite evidence against search as an access process (Humphreys, Wiles & Bain, 1990). We present an alternative process to search, based on calculating the intersection of sets of targets activated by two or more cues. Two methods of computing the intersection are presented, one using information about the possible targets, the other constraining the cue-target strengths in the memory matrix. Analysis using orthogonal vectors to represent the cues and targets demonstrates the competence of both processes, and simulations using sparse distributed representations demonstrate the performance of the latter process for tasks involving 2 and 3 cues.
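
The idea of intersecting the target sets activated by several cues can be illustrated with orthogonal (one-hot) vectors and a simple associative memory matrix. The sketch below is not either of the two methods presented in the paper, whose details are not given in the abstract. It assumes unit cue-target strengths and a threshold equal to the number of active cues, and the cue and target vocabulary is invented for the example.

```python
import numpy as np

cues = ["red", "fruit", "round"]
targets = ["apple", "cherry", "banana", "ball"]
associations = {  # hypothetical cue -> target associations
    "red": ["apple", "cherry", "ball"],
    "fruit": ["apple", "cherry", "banana"],
    "round": ["apple", "ball"],
}

cue_vectors = np.eye(len(cues))        # orthogonal cue codes
target_vectors = np.eye(len(targets))  # orthogonal target codes

# Memory matrix: sum of outer products target (x) cue, each with strength 1.
memory = np.zeros((len(targets), len(cues)))
for cue, linked in associations.items():
    for target in linked:
        memory += np.outer(target_vectors[targets.index(target)],
                           cue_vectors[cues.index(cue)])

def intersect(active_cues):
    """Targets whose summed activation reaches the number of active cues."""
    probe = sum(cue_vectors[cues.index(c)] for c in active_cues)
    activation = memory @ probe
    return [t for t, a in zip(targets, activation) if a >= len(active_cues) - 0.5]

print(intersect(["red", "fruit"]))           # ['apple', 'cherry']
print(intersect(["red", "fruit", "round"]))  # ['apple']
```

Because every cue contributes at most one unit of activation to a target, only targets linked to all active cues reach the threshold, so a single matrix-vector product stands in for a serial search.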


Discovering Discrete Distributed Representations with Iterative Competitive Learning

Neural Information Processing Systems

Competitive learning is an unsupervised algorithm that classifies input patterns into mutually exclusive clusters. In a neural net framework, each cluster is represented by a processing unit that competes with others in a winner-take-all pool for an input pattern. I present a simple extension to the algorithm that allows it to construct discrete, distributed representations. Discrete representations are useful because they are relatively easy to analyze and their information content can readily be measured. Distributed representations are useful because they explicitly encode similarity. The basic idea is to apply competitive learning iteratively to an input pattern, and after each stage to subtract from the input pattern the component that was captured in the representation at that stage. This component is simply the weight vector of the winning unit of the competitive pool. The subtraction procedure forces competitive pools at different stages to encode different aspects of the input. The algorithm is essentially the same as a traditional data compression technique known as multistep vector quantization, although the neural net perspective suggests potentially powerful extensions to that approach.
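
The iterate-and-subtract procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the author's implementation: the winner is taken to be the unit with the smallest Euclidean distance to the input, weights are initialised from data points, each pool is trained to completion before the next one starts, and the learning rate, epoch count, and function names (`train_pool`, `train_stages`, `encode`) are hypothetical.

```python
import numpy as np

def nearest(weights, x):
    """Index of the winning unit (smallest squared distance to x)."""
    return int(np.argmin(np.sum((weights - x) ** 2, axis=1)))

def train_pool(data, n_units, n_epochs=20, lr=0.1, seed=0):
    """Standard winner-take-all competitive learning for one pool."""
    rng = np.random.default_rng(seed)
    weights = data[rng.choice(len(data), size=n_units, replace=False)].copy()
    for _ in range(n_epochs):
        for x in data[rng.permutation(len(data))]:
            winner = nearest(weights, x)
            weights[winner] += lr * (x - weights[winner])  # move winner toward x
    return weights

def train_stages(data, n_stages, n_units):
    """Train one pool per stage on what earlier stages left unexplained."""
    pools, residuals = [], data.astype(float).copy()
    for stage in range(n_stages):
        weights = train_pool(residuals, n_units, seed=stage)
        pools.append(weights)
        winners = [nearest(weights, r) for r in residuals]
        residuals = residuals - weights[winners]  # subtract captured component
    return pools

def encode(x, pools):
    """Discrete distributed code: one winner index per stage."""
    residual, code = x.astype(float).copy(), []
    for weights in pools:
        winner = nearest(weights, residual)
        code.append(winner)
        residual = residual - weights[winner]
    return code, residual

rng = np.random.default_rng(1)
data = rng.normal(size=(50, 4))
pools = train_stages(data, n_stages=3, n_units=4)
code, residual = encode(data[0], pools)
print(code, np.linalg.norm(residual))
```

The code for an input is the tuple of winner indices, and the input equals the sum of the selected weight vectors plus the final residual, which is the multistep vector quantization view mentioned in the abstract.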


Language Induction by Phase Transition in Dynamical Recognizers

Neural Information Processing Systems

A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained" on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries: First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: Induction by phase transition. A small weight adjustment causes a "bifurcation" in the limit behavior of the network.


Exploiting Syllable Structure in a Connectionist Phonology Model

Neural Information Processing Systems

In a previous paper (Touretzky & Wheeler, 1990a) we showed how adding a clustering operation to a connectionist phonology model produced a parallel processing account of certain "iterative" phenomena. In this paper we show how the addition of a second structuring primitive, syllabification, greatly increases the power of the model. We present examples from a non-Indo-European language that appear to require rule ordering to at least a depth of four. By adding syllabification circuitry to structure the model's perception of the input string, we are able to handle these examples with only two derivational steps. We conclude that in phonology, derivation can be largely replaced by structuring.


A Short-Term Memory Architecture for the Learning of Morphophonemic Rules

Neural Information Processing Systems

In the debate over the power of connectionist models to handle linguistic phenomena, considerable attention has been focused on the learning of simple morphological rules. It is a straightforward matter in a symbolic system to specify how the meanings of a stem and a bound morpheme combine to yield the meaning of a whole word and how the form of the bound morpheme depends on the shape of the stem. In a distributed connectionist system, however, where there may be no explicit morphemes, words, or rules, things are not so simple. The most important work in this area has been that of Rumelhart and McClelland (1986), together with later extensions by Marchman and Plunkett (1989). The networks involved were trained to associate English verb stems with the corresponding past-tense forms, successfully generating both regular and irregular forms and generalizing to novel inputs.