Neural Information Processing Systems
Learning Representations by Recirculation
Hinton, Geoffrey E., McClelland, James L.
One criticism of back-propagation is that it requires a teacher to specify the desired output vectors. It is possible to dispense with the teacher in the case of "encoder" networks [2], in which the desired output vector is identical to the input vector (see Figure 1). The purpose of an encoder network is to learn good "codes" in the intermediate, hidden units. If, for example, there are fewer hidden units than input units, an encoder network will perform data compression [3].
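As a rough illustration of the encoder idea described above, the following sketch trains a 4-2-4 network whose target is its own input. It assumes linear units, one-hot input patterns, and plain gradient descent on squared error; these choices, the function name `train_encoder`, and all parameter values are illustrative assumptions, and the code is not the recirculation procedure of the paper.

```python
import random


def train_encoder(n_in=4, n_hidden=2, lr=0.05, epochs=500, seed=0):
    """Train a linear encoder network whose desired output equals its input.

    Returns the summed squared reconstruction error recorded at each epoch.
    """
    rng = random.Random(seed)
    # W1 maps input -> hidden code, W2 maps hidden code -> reconstruction.
    W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    W2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
    # Assumed training set: the n_in one-hot vectors.
    patterns = [[1.0 if i == j else 0.0 for j in range(n_in)] for i in range(n_in)]

    history = []
    for _ in range(epochs):
        total = 0.0
        for x in patterns:
            h = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
            y = [sum(w * hi for w, hi in zip(row, h)) for row in W2]
            # No external teacher: the "desired output" is the input itself.
            err = [yi - xi for yi, xi in zip(y, x)]
            total += 0.5 * sum(e * e for e in err)
            # Error signal propagated back to the hidden units.
            back = [sum(W2[i][j] * err[i] for i in range(n_in))
                    for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    W2[i][j] -= lr * err[i] * h[j]
            for j in range(n_hidden):
                for k in range(n_in):
                    W1[j][k] -= lr * back[j] * x[k]
        history.append(total)
    return history


errors = train_encoder()
print("first epoch error:", errors[0])
print("last epoch error:", errors[-1])
```

Because there are only two hidden units for four input units, the hidden layer is forced to carry a compressed code, so a linear network of this kind cannot drive the reconstruction error all the way to zero.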
A Dynamical Approach to Temporal Pattern Processing
Stornetta, W. Scott, Hogg, Tad, Huberman, Bernardo A.
W. Scott Stornetta, Stanford University, Physics Department, Stanford, CA 94305. Tad Hogg and B. A. Huberman, Xerox Palo Alto Research Center, Palo Alto, CA 94304. ABSTRACT Recognizing patterns with temporal context is important for such tasks as speech recognition, motion detection and signature verification. We propose an architecture in which time serves as its own representation, and temporal context is encoded in the state of the nodes. We contrast this with the approach of replicating portions of the architecture to represent time. As one example of these ideas, we demonstrate an architecture with capacitive inputs serving as temporal feature detectors in an otherwise standard back-propagation model. Experiments involving motion detection and word discrimination serve to illustrate novel features of the system.
Probabilistic Characterization of Neural Model Computations
Learning algorithms for the neural network which search for the "most probable" member of P can then be designed. Statistical tests which decide if the "true" or environmental probability distribution is in P can also be developed. Example applications of the theory to the highly nonlinear back-propagation learning algorithm, and to the networks of Hopfield and Anderson, are discussed. INTRODUCTION A connectionist system is a network of simple neuron-like computing elements which can store and retrieve information and, most importantly, make generalizations. Using terminology suggested by Rumelhart & McClelland [1], the computing elements of a connectionist system are called units, and each unit is associated with a real number indicating its activity level. The activity level of a given unit in the system can also influence the activity level of another unit. The degree of influence between two such units is often characterized by a parameter of the system known as a connection strength. During the information retrieval process some subset of the units in the system is activated, and these units in turn activate neighboring units via the inter-unit connection strengths.
An Optimization Network for Matrix Inversion
Jang, Ju-Seog, Lee, Soo-Young, Shin, Sang-Yung
Box 150, Cheongryang, Seoul, Korea ABSTRACT Matrix inversion can be treated as an optimization problem. We have demonstrated that this problem can be rapidly solved by highly interconnected simple neuron-like analog processors. A network for matrix inversion based on the concept of Hopfield's neural network was designed and implemented with electronic hardware. With slight modifications, the network is readily applicable to solving a system of simultaneous linear equations efficiently. Notable features of this circuit are potential speed due to parallel processing, and robustness against variations of device parameters.
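To make the "inversion as optimization" view concrete, here is a minimal software sketch that finds the inverse of A by gradient descent on the energy E(V) = 1/2 ||AV - I||^2. It is a discrete-time stand-in chosen for illustration, not the analog Hopfield-style circuit of the paper; the energy function, the step size `lr`, the step count, and the helper names are all assumptions.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def invert_by_descent(A, lr=0.1, steps=500):
    """Approximate the inverse of A by minimizing 1/2 * ||A V - I||^2.

    The gradient of this energy with respect to V is A^T (A V - I).
    lr must be below 2 / (largest eigenvalue of A^T A) for convergence.
    """
    n = len(A)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    V = [[0.0] * n for _ in range(n)]
    At = transpose(A)
    for _ in range(steps):
        AV = matmul(A, V)
        residual = [[AV[i][j] - identity[i][j] for j in range(n)]
                    for i in range(n)]
        grad = matmul(At, residual)
        V = [[V[i][j] - lr * grad[i][j] for j in range(n)] for i in range(n)]
    return V


A = [[2.0, 1.0], [1.0, 3.0]]
V = invert_by_descent(A)
print(V)             # close to [[0.6, -0.2], [-0.2, 0.4]]
print(matmul(A, V))  # close to the 2 x 2 identity
```

Replacing the identity target with a right-hand-side vector b turns the same descent into a solver for A x = b, which mirrors the "slight modifications" the abstract mentions for simultaneous linear equations.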
Scaling Properties of Coarse-Coded Symbol Memories
Rosenfeld, Ronald, Touretzky, David S.
DCPS' memory scheme is a modified version of the Random Receptors method [5]. The symbol space is the set of all triples over a 25-letter alphabet. Units have fixed-size receptive fields organized as 6 x 6 x 6 subspaces. Patterns are manipulated to minimize the variance in pattern size across symbols.
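The receptive-field structure described above can be sketched as follows. Each unit is given three 6-letter subsets, one per position of the triple, so its field covers 6 x 6 x 6 = 216 of the 25^3 = 15625 possible symbols. Drawing the subsets uniformly at random, using the letters A to Y as the alphabet, and the names `make_unit`, `responds`, and `pattern` are assumptions for illustration; the variance-minimizing adjustment of patterns is not shown.

```python
import random
import string

# Assumed 25-letter alphabet: A through Y.
ALPHABET = list(string.ascii_uppercase[:25])


def make_unit(rng):
    """Build one unit: a 6-letter subset for each of the three positions."""
    return tuple(frozenset(rng.sample(ALPHABET, 6)) for _ in range(3))


def responds(unit, symbol):
    """A unit responds to a triple if every letter falls in its subset."""
    return all(letter in subset for letter, subset in zip(symbol, unit))


def pattern(units, symbol):
    """The coarse-coded pattern of a symbol: indices of responding units."""
    return {i for i, unit in enumerate(units) if responds(unit, symbol)}


rng = random.Random(0)
units = [make_unit(rng) for _ in range(2000)]
active = pattern(units, ("A", "B", "C"))
# The expected pattern size is len(units) * 216 / 15625, about 27.6 here.
print(len(active))
```

Since each symbol is represented by the set of units whose fields contain it, two symbols that share letters in the same positions tend to share units, which is the source of both the economy and the interference in a coarse-coded memory.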
Analysis of Distributed Representation of Constituent Structure in Connectionist Systems
The method allows the fully distributed representation of symbolic structures: the roles in the structures, as well as the fillers for those roles, can be arbitrarily non-local. Fully and partially localized special cases reduce to existing cases of connectionist representations of structured data; the tensor product representation generalizes these and the few existing examples of fully distributed representations of structures. The representation saturates gracefully as larger structures are represented; it permits recursive construction of complex representations from simpler ones; it respects the independence of the capacities to generate and maintain multiple bindings in parallel; it extends naturally to continuous structures and continuous representational patterns; it permits values to also serve as variables; it enables analysis of the interference of symbolic structures stored in associative memories; and it leads to characterization of optimal distributed representations of roles and a recirculation algorithm for learning them. Introduction Any model of complex information processing in networks of simple processors must solve the problem of representing complex structures over network elements. Connectionist models of realistic natural language processing, for example, must employ computationally adequate representations of complex sentences. Many connectionists feel that to develop connectionist systems with the computational power required by complex tasks, distributed representations must be used: an individual processing unit must participate in the representation of multiple items, and each item must be represented as a pattern of activity of multiple processors.
Connectionist models have used more or less distributed representations of more or less complex structures, but little if any general analysis of the problem of distributed representation of complex information has been carried out. This paper reports results of an analysis of a general method called the tensor product representation.
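A small numerical sketch may help fix the idea of a tensor product representation: each filler vector is bound to its role vector by an outer product, and a structure is the sum of its bindings. The example below assumes orthonormal role vectors so that a filler can be recovered exactly by multiplying the structure by its role; the particular vectors and the helper names `bind`, `superimpose`, and `unbind` are illustrative assumptions, not taken from the paper.

```python
def bind(filler, role):
    """Bind a filler to a role: the outer product filler (x) role."""
    return [[f * r for r in role] for f in filler]


def superimpose(T1, T2):
    """Represent a structure as the element-wise sum of its bindings."""
    return [[a + b for a, b in zip(row1, row2)] for row1, row2 in zip(T1, T2)]


def unbind(T, role):
    """Recover the filler bound to `role` (assumes orthonormal roles)."""
    return [sum(t * r for t, r in zip(row, role)) for row in T]


# Two distributed (non-local) role vectors, chosen to be orthonormal.
role_a = [0.6, 0.8]
role_b = [-0.8, 0.6]
# Two filler patterns over three units.
filler_x = [1.0, 0.0, 2.0]
filler_y = [0.0, 3.0, 1.0]

structure = superimpose(bind(filler_x, role_a), bind(filler_y, role_b))
print(unbind(structure, role_a))  # approximately filler_x
print(unbind(structure, role_b))  # approximately filler_y
```

Every entry of `structure` mixes contributions from both bindings, so neither the roles nor the fillers are localized to particular units, yet each filler remains recoverable.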
Connectivity Versus Entropy
Yaser S. Abu-Mostafa, California Institute of Technology, Pasadena, CA 91125. ABSTRACT How does the connectivity of a neural network (number of synapses per neuron) relate to the complexity of the problems it can handle (measured by the entropy)? Switching theory would suggest no relation at all, since all Boolean functions can be implemented using a circuit with very low connectivity (e.g., using two-input NAND gates). However, for a network that learns a problem from examples using a local learning rule, we prove that the entropy of the problem becomes a lower bound for the connectivity of the network. INTRODUCTION The most distinguishing feature of neural networks is their ability to spontaneously learn the desired function from 'training' samples, i.e., their ability to program themselves. Clearly, a given neural network cannot learn just any function; there must be some restrictions on which networks can learn which functions.
High Order Neural Networks for Efficient Associative Memory Design
Dreyfus, Gérard, Guyon, Isabelle, Nadal, Jean-Pierre, Personnaz, Léon
The designed networks exhibit the desired associative memory function: perfect storage and retrieval of pieces of information and/or sequences of information of any complexity. INTRODUCTION In the field of information processing, an important class of potential applications of neural networks arises from their ability to perform as associative memories. Since the publication of J. Hopfield's seminal paper [1], investigations of the storage and retrieval properties of recurrent networks have led to a deep understanding of their properties. The basic limitations of these networks are the following: their storage capacity is of the order of the number of neurons; they are unable to handle structured problems; they are unable to classify non-linearly separable data. In order to circumvent these limitations, one has to introduce additional non-linearities. This can be done either by using "hidden", nonlinear units, or by considering multi-neuron interactions [2]. This paper presents learning rules for networks with multiple interactions, allowing the storage and retrieval either of static pieces of information (autoassociative memory) or of temporal sequences (associative memory), while preventing an explosive growth of the number of synaptic coefficients. AUTOASSOCIATIVE MEMORY The problem that will be addressed in this paragraph is how to design an autoassociative memory with a recurrent (or feedback) neural network when the number p of prototypes is large compared to the number n of neurons. We consider a network of n binary neurons, operating in a synchronous mode, with period t. (American Institute of Physics, 1988)
Phasor Neural Networks
ABSTRACT A novel network type is introduced which uses unit-length 2-vectors for local variables. As an example of its applications, associative memory nets are defined and their performance analyzed. Real systems corresponding to such 'phasor' models can be e.g. INTRODUCTION Most neural network models use either binary local variables or scalars combined with sigmoidal nonlinearities. Rather awkward coding schemes have to be invoked if one wants to maintain linear relations between the local signals being processed in e.g.