A competitive modular connectionist architecture
Jacobs, Robert A., Jordan, Michael I.
We describe a multi-network, or modular, connectionist architecture that captures the fact that many tasks have structure at a level of granularity intermediate to that assumed by local and global function approximation schemes. The main innovation of the architecture is that it combines associative and competitive learning in order to learn task decompositions. A task decomposition is discovered by forcing the networks comprising the architecture to compete to learn the training patterns. As a result of the competition, different networks learn different training patterns and thus learn to partition the input space. The performance of the architecture on a "what" and "where" vision task and on a multi-payload robotics task is presented.
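To make the competitive mechanism concrete, here is a minimal sketch in which linear experts compete on squared error and only the winner is updated; the linear experts, the toy piecewise task, and the winner-take-all update are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task with intermediate structure: a different linear rule on each
# half of the input space.
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where(X[:, 0] < 0, X[:, 1], -X[:, 1])

K = 2                                    # number of competing expert networks
W = rng.normal(0, 0.1, size=(K, 2))      # one linear expert per row
lr = 0.1

for epoch in range(100):
    for x, t in zip(X, y):
        preds = W @ x                             # every expert predicts
        k = int(np.argmin((preds - t) ** 2))      # the experts compete on error
        W[k] -= lr * 2 * (preds[k] - t) * x       # only the winner learns

# Each expert ends up responsible for one half of the input space: the
# competition has discovered a task decomposition.
```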
Self-organization of Hebbian Synapses in Hippocampal Neurons
Brown, Thomas H., Mainen, Zachary F., Zador, Anthony M., Claiborne, Brenda J.
We are exploring the significance of biological complexity for neuronal computation. Here we demonstrate that Hebbian synapses in realistically modeled hippocampal pyramidal cells may give rise to two novel forms of self-organization in response to structured synaptic input. First, on the basis of the electrotonic relationships between synaptic contacts, a cell may become tuned to a small subset of its input space. Second, the same mechanisms may produce clusters of potentiated synapses across the space of the dendrites. The latter type of self-organization may be functionally significant in the presence of nonlinear dendritic conductances.
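A rate-based caricature of the first effect, assuming a single point neuron with a plain Hebbian rule and weight normalization; the compartmental electrotonic structure central to the paper is deliberately omitted here.

```python
import numpy as np

rng = np.random.default_rng(1)

n_syn = 50                        # synapses onto one model cell
w = np.full(n_syn, 0.5)           # initial synaptic weights

def sample_input():
    """Structured input: synapses 0-9 fire together more often than the rest."""
    x = (rng.random(n_syn) < 0.05).astype(float)   # sparse background activity
    if rng.random() < 0.3:
        x[:10] = 1.0                               # correlated input cluster
    return x

lr = 0.01
for _ in range(5000):
    x = sample_input()
    post = w @ x                   # postsynaptic activity of the point neuron
    w += lr * post * x             # Hebbian potentiation of active synapses
    w *= np.sqrt(n_syn) / np.linalg.norm(w)   # normalization bounds total weight

# The weights of the correlated cluster come to dominate: the cell has become
# tuned to a small subset of its input space.
```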
A Connectionist Learning Control Architecture for Navigation
A novel learning control architecture is used for navigation. A sophisticated test-bed is used to simulate a cylindrical robot with a sonar belt in a planar environment. The task is short-range homing in the presence of obstacles. The robot receives no global information and assumes no comprehensive world model. Instead, the robot receives only sensory information, which is inherently limited. A connectionist architecture is presented which incorporates a large amount of a priori knowledge in the form of hard-wired networks, architectural constraints, and initial weights. Instead of hard-wiring static potential fields from object models, my architecture learns sensor-based potential fields, automatically adjusting them to avoid local minima and to produce efficient homing trajectories. It does this without object models, using only sensory information. This research demonstrates the use of a large modular architecture on a difficult task.
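For reference, a minimal hand-coded potential-field homing controller of the kind this architecture replaces with learned, sensor-based fields; the quadratic attractive term, the repulsive term, and all gains are illustrative assumptions.

```python
import numpy as np

def potential(pos, goal, obstacles, k_att=1.0, k_rep=0.5, influence=1.0):
    """Attractive well at the goal plus a repulsive hill around each obstacle."""
    u = k_att * np.linalg.norm(pos - goal) ** 2
    for ob in obstacles:
        d = np.linalg.norm(pos - ob)
        if d < influence:
            u += k_rep * (1.0 / d - 1.0 / influence) ** 2
    return u

def step(pos, goal, obstacles, eps=1e-3, gain=0.05):
    """Move downhill along a numerical gradient of the potential."""
    grad = np.zeros(2)
    for i in range(2):
        dp = np.zeros(2); dp[i] = eps
        grad[i] = (potential(pos + dp, goal, obstacles)
                   - potential(pos - dp, goal, obstacles)) / (2 * eps)
    return pos - gain * grad

pos, goal = np.array([0.0, 0.0]), np.array([2.0, 2.0])
obstacles = [np.array([1.0, 0.7])]           # slightly off the direct path
for _ in range(500):
    pos = step(pos, goal, obstacles)
# An obstacle placed symmetrically on the path would trap this static field in
# a local minimum: the failure mode the learned fields are adjusted to avoid.
```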
Constructing Hidden Units using Examples and Queries
While the network loading problem for 2-layer threshold nets is NP-hard when learning from examples alone (as with backpropagation), Baum (1991) has now proved that a learner can employ queries to evade the hidden unit credit assignment problem and PAC-load nets with up to four hidden units in polynomial time. Empirical tests show that the method can also learn far more complicated functions, such as randomly generated networks with 200 hidden units. The algorithm easily approximates Wieland's 2-spirals function using a single layer of 50 hidden units, and requires only 30 minutes of CPU time to learn 200-bit parity to 99.7% accuracy.
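The core use of a membership query in this setting can be sketched as a binary search along the segment between a positive and a negative example, which locates a point on some hidden unit's hyperplane; the linear oracle below stands in for the target net and is an assumption of this sketch.

```python
import numpy as np

# Hidden target concept the learner can only probe with membership queries.
w_true, b_true = np.array([1.0, -2.0]), 0.5

def oracle(x):
    """Membership query: does the target net classify x as positive?"""
    return float(w_true @ x + b_true > 0)

def boundary_point(x_pos, x_neg, tol=1e-8):
    """Binary-search the segment from a positive to a negative example,
    using queries, until a point on the decision boundary is bracketed."""
    lo, hi = 0.0, 1.0                 # lo end is positive, hi end negative
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if oracle(x_pos + mid * (x_neg - x_pos)) == 1.0:
            lo = mid
        else:
            hi = mid
    return x_pos + lo * (x_neg - x_pos)

p = boundary_point(np.array([2.0, 0.0]), np.array([-2.0, 0.0]))
# p lies (to tolerance) on the hyperplane w_true . x + b_true = 0; repeating
# this from several directions recovers the hyperplane itself.
```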
Modeling Time Varying Systems Using Hidden Control Neural Architecture
This paper introduces a generalization of the layered neural network that can implement a time-varying nonlinear mapping between its observable input and output. The variation of the network's mapping is due to an additional, hidden control input, while the network parameters remain unchanged. We propose an algorithm for finding the network parameters and the hidden control sequence from a training set of examples of observable input and output. This algorithm implements an approximate maximum likelihood estimation of the parameters of an equivalent statistical model, when only the dominant control sequence is taken into account. The conceptual difference between the proposed model and the HMM is that in the HMM approach the observable data in each state are modeled as though produced by a memoryless source, whose parametric description is obtained during training, while in the proposed model the observations in each state are produced by a nonlinear dynamical system driven by noise, and both the parametric form of the dynamics and the noise are estimated. The performance of the model is illustrated on the tasks of nonlinear time-varying system modeling and continuously spoken digit recognition. The reported results show the potential of this model for providing high-performance speech recognition capability.
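A minimal sketch of the estimation scheme, assuming a discrete hidden control that selects among linear maps and alternating between decoding the dominant control sequence and updating parameters; the paper's nonlinear networks and any transition structure over controls (a Viterbi-style decoding would add transition costs) are omitted.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy time-varying system: the input-output gain switches between two regimes.
T = 200
x = rng.normal(size=T)
regime = (np.arange(T) // 50) % 2            # the unobserved control signal
y = np.where(regime == 0, 2.0 * x, -1.0 * x)

C = 2                                        # number of hidden control values
w = rng.normal(size=C)                       # one linear map per control value
lr = 0.05

for _ in range(50):
    # Step 1: given the parameters, decode the control value that best
    # explains each frame (the dominant control sequence).
    errs = (w[:, None] * x[None, :] - y[None, :]) ** 2
    c = errs.argmin(axis=0)
    # Step 2: given the control sequence, update the parameters.
    for t in range(T):
        w[c[t]] -= lr * 2.0 * (w[c[t]] * x[t] - y[t]) * x[t]

# Ideally w approaches (2, -1): each control value captures one regime.
```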
Translating Locative Prepositions
The features used in the spatial representations were abstracted from Herskovits (1986). The network was trained using the generalized delta rule (Rumelhart, Hinton, and Williams, 1986) on a set of patterns with four components, three syntactic and one semantic. The syntactic components are a pair of nouns separated by a locative preposition [N1-LP-N2], and the semantic component is a representation of the spatial relationship [SR].
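A toy version of this training setup, with an invented vocabulary and fabricated spatial-relation codes standing in for the Herskovits-derived features; the network is a small sigmoid MLP trained with the generalized delta rule, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical toy vocabulary and spatial-relation codes (illustrative only).
nouns = ["cup", "table", "ball", "box"]
preps = ["on", "in", "under"]
n_feat = 4                                    # size of the semantic code [SR]

def one_hot(i, n):
    v = np.zeros(n); v[i] = 1.0; return v

def encode(n1, lp, n2):
    """Concatenate one-hot codes for the syntactic triple [N1-LP-N2]."""
    return np.concatenate([one_hot(nouns.index(n1), len(nouns)),
                           one_hot(preps.index(lp), len(preps)),
                           one_hot(nouns.index(n2), len(nouns))])

# Fabricated training pairs: syntactic triple -> spatial-relation target.
data = [(("cup", "on", "table"), np.array([1.0, 0.0, 0.0, 1.0])),
        (("ball", "in", "box"), np.array([0.0, 1.0, 0.0, 1.0])),
        (("ball", "under", "table"), np.array([0.0, 0.0, 1.0, 0.0]))]

d_in, d_hid, lr = 2 * len(nouns) + len(preps), 8, 0.5
W1 = rng.normal(0, 0.5, (d_hid, d_in))
W2 = rng.normal(0, 0.5, (n_feat, d_hid))
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

for epoch in range(2000):                     # generalized delta rule (backprop)
    for (n1, lp, n2), target in data:
        x = encode(n1, lp, n2)
        h = sig(W1 @ x)
        out = sig(W2 @ h)
        delta_out = (out - target) * out * (1 - out)
        delta_hid = (W2.T @ delta_out) * h * (1 - h)
        W2 -= lr * np.outer(delta_out, h)
        W1 -= lr * np.outer(delta_hid, x)
```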
Stochastic Neurodynamics
The main point of this paper is that stochastic neural networks have a mathematical structure that corresponds quite closely with that of quantum field theory. Neural network Liouvillians and Lagrangians can be derived, just as spin Hamiltonians and Lagrangians can be in QFT. It remains to show the efficacy of such a description.
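As one concrete instance of the stochastic networks in question, a Glauber-style single-neuron-flip Markov chain, whose master equation is the kind of evolution a network Liouvillian would generate; the symmetric random couplings and binary {0, 1} states are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(4)

# A stochastic binary network: neurons switch on and off with probabilities
# set by their net input, defining a Markov process over network states.
N = 20
J = rng.normal(0, 1 / np.sqrt(N), (N, N))
J = (J + J.T) / 2                        # symmetric couplings
np.fill_diagonal(J, 0.0)
s = rng.integers(0, 2, N).astype(float)  # neuron states in {0, 1}

beta = 1.0
for _ in range(10000):                   # Glauber (single-flip) dynamics
    i = rng.integers(N)
    h = J[i] @ s                         # net input to neuron i
    p_on = 1.0 / (1.0 + np.exp(-beta * h))
    s[i] = float(rng.random() < p_on)    # stochastic update of one neuron
```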
Distributed Recursive Structure Processing
Legendre, Geraldine, Miyata, Yoshiro, Smolensky, Paul
Harmonic grammar (Legendre et al., 1990) is a connectionist theory of linguistic well-formedness based on the assumption that the well-formedness of a sentence can be measured by the harmony (negative energy) of the corresponding connectionist state. Assuming a lower-level connectionist network that obeys a few general connectionist principles but is otherwise unspecified, we construct a higher-level network with an equivalent harmony function that captures the most linguistically relevant global aspects of the lower-level network. In this paper, we extend the tensor product representation (Smolensky, 1990) to fully recursive representations of recursively structured objects like sentences in the lower-level network. We show theoretically and with an example the power of the new technique for parallel distributed structure processing.
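A minimal sketch of tensor product binding for a binary tree; the role vectors and one-hot fillers are illustrative, and the paper's fully recursive representation (a direct sum over depths) is simplified here to trees of uniform depth.

```python
import numpy as np

# Role vectors for the two positions in a binary branching structure.
r0 = np.array([1.0, 0.0])   # left child
r1 = np.array([0.0, 1.0])   # right child

def cons(left, right):
    """A node is the sum of each child bound (by tensor product) to its role."""
    return np.kron(left, r0) + np.kron(right, r1)

def unbind(v, role):
    """Recover a child by contracting with its role (roles are orthonormal)."""
    return v.reshape(-1, len(role)) @ role

# Atomic fillers, one-hot for readability.
A, B, C = np.eye(3)

tree = cons(cons(A, B), cons(C, A))     # represents the structure ((A B) (C A))
print(unbind(unbind(tree, r0), r1))     # recovers B exactly
```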
Remarks on Interpolation and Recognition Using Neural Nets
We consider different types of single-hidden-layer feedforward nets: with or without direct input-to-output connections, and using either threshold or sigmoidal activation functions. The main results show that direct connections in threshold nets double the recognition but not the interpolation power, while using sigmoids rather than thresholds allows (at least) doubling both. Various results are also given on VC dimension and other measures of recognition capabilities.
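To illustrate what interpolation power means here, a one-dimensional construction in which k threshold hidden units exactly interpolate k points as a staircase; the paper's counting results for general dimension, direct connections, and sigmoids are not reproduced by this sketch.

```python
import numpy as np

# Training points to interpolate, sorted by x.
xs = np.array([0.0, 1.0, 2.5, 4.0])
ys = np.array([1.0, -2.0, 0.5, 3.0])

thresholds = xs                                  # one hidden unit per point
jumps = np.diff(np.concatenate([[0.0], ys]))     # output weight = y_i - y_{i-1}

def net(x):
    h = (x >= thresholds).astype(float)          # threshold hidden layer
    return float(jumps @ h)                      # staircase built from jumps

assert all(net(x) == y for x, y in zip(xs, ys))  # exact interpolation
```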