Towards an Organizing Principle for a Layered Perceptual Network

Neural Information Processing Systems

Ralph Linsker, IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598. Abstract: An information-theoretic optimization principle is proposed for the development of each processing stage of a multilayered perceptual network. This principle of "maximum information preservation" states that the signal transformation to be realized at each stage is the one that maximizes the information that the output signal values (from that stage) convey about the input signal values (to that stage), subject to certain constraints and in the presence of processing noise. The quantity being maximized is a Shannon information rate. I provide motivation for this principle and, for some simple model cases, derive some of its consequences, discuss an algorithmic implementation, and show how the principle may lead to biologically relevant neural architectural features such as topographic maps, map distortions, orientation selectivity, and extraction of spatial and temporal signal correlations. A possible connection between this information-theoretic principle and a principle of minimum entropy production in nonequilibrium thermodynamics is suggested. Introduction: This paper describes some properties of a proposed information-theoretic organizing principle for the development of a layered perceptual network.
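To give a feel for the quantity being maximized, the sketch below evaluates the Shannon information that an output conveys about an input in the simplest textbook case: a scalar output Y = X + N with independent Gaussian signal X and Gaussian processing noise N. This is an illustrative special case chosen here, not the derivation in the paper; the function name `gaussian_information_rate` and its parameters are hypothetical.

```python
import math


def gaussian_information_rate(signal_var, noise_var):
    """Bits per sample that Y = X + N conveys about X.

    Assumes X and N are independent, zero-mean Gaussian variables with
    variances signal_var and noise_var (an assumption of this sketch).
    """
    if signal_var < 0 or noise_var <= 0:
        raise ValueError("signal_var must be >= 0 and noise_var must be > 0")
    # Standard result for the scalar Gaussian channel: I = 1/2 * log2(1 + S/N).
    return 0.5 * math.log2(1.0 + signal_var / noise_var)


# A transformation that passes more signal variance through the same
# processing noise preserves more information about the input.
low = gaussian_information_rate(signal_var=1.0, noise_var=1.0)
high = gaussian_information_rate(signal_var=3.0, noise_var=1.0)
print(low, high)
```

Under these assumptions, a stage that maximizes information preservation would favor the transformation with the higher rate, subject to whatever constraints limit the signal variance.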


Performance Measures for Associative Memories that Learn and Forget

Neural Information Processing Systems

Recently, many modifications to the McCulloch/Pitts model have been proposed where both learning and forgetting occur. Given that the network never saturates (ceases to function effectively due to an overload of information), the learning updates can continue indefinitely. For these networks, we need to introduce performance measures in addition to the information capacity to evaluate the different networks. We mathematically define quantities such as the plasticity of a network, the efficacy of an information vector, and the probability of network saturation. From these quantities we analytically compare different networks.



Capacity for Patterns and Sequences in Kanerva's SDM as Compared to Other Associative Memory Models

Neural Information Processing Systems

ABSTRACT The information capacity of Kanerva's Sparse, Distributed Memory (SDM) and Hopfield-type neural networks is investigated. Under the approximations used here, it is shown that the total information stored in these systems is proportional to the number of connections in the network. The proportionality constant is the same for the SDM and Hopfield-type models, independent of the particular model or the order of the model. The approximations are checked numerically. This same analysis can be used to show that the SDM can store sequences of spatiotemporal patterns, and the addition of time-delayed connections allows the retrieval of context-dependent temporal patterns. A minor modification of the SDM can be used to store correlated patterns. INTRODUCTION Many different models of memory and thought have been proposed by scientists over the years. The learning rule considered here uses the outer product of patterns of 1s and -1s.
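As a rough illustration of an outer-product learning rule over patterns of 1s and -1s, the sketch below stores two patterns in a small Hopfield-type network and recalls one of them from a corrupted probe. It is a minimal sketch of the generic Hopfield-type rule, not of the SDM and not of the paper's analysis; the function names `store` and `recall` and the example patterns are assumptions made here.

```python
def store(patterns):
    """Build a weight matrix as the sum of outer products of +1/-1 patterns."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections in this sketch
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, probe, sweeps=5):
    """Update units one at a time, thresholding the weighted input sum."""
    state = list(probe)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            state[i] = 1 if field >= 0 else -1
    return state


# Two orthogonal example patterns (hypothetical data).
p1 = [1, -1, 1, -1, 1, -1, 1, -1]
p2 = [1, 1, 1, 1, -1, -1, -1, -1]
w = store([p1, p2])

# Flip the first element of p1 and let the network clean it up.
noisy = [-1] + p1[1:]
print(recall(w, noisy) == p1)
```

The weight matrix has one entry per connection, which is the quantity the abstract relates to the total stored information.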


An Optimization Network for Matrix Inversion

Neural Information Processing Systems

Box 150, Cheongryang, Seoul, Korea ABSTRACT Inverse matrix calculation can be considered as an optimization problem. We have demonstrated that this problem can be rapidly solved by highly interconnected simple neuron-like analog processors. A network for matrix inversion based on the concept of Hopfield's neural network was designed and implemented with electronic hardware. With slight modifications, the network is readily applicable to solving systems of simultaneous linear equations efficiently. Notable features of this circuit are potential speed due to parallel processing and robustness against variations of device parameters.
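One way to read "matrix inversion as an optimization" is to minimize the squared residual of A V - I over the unknown matrix V. The sketch below does this with plain gradient descent in software. The energy function, step size, and iteration count are assumptions chosen for illustration; the abstract does not specify the paper's energy function, and the analog circuit it describes is not modeled here.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def invert_by_descent(a, step=0.1, iters=500):
    """Minimize E(V) = 1/2 * ||A V - I||^2 by gradient descent on V.

    The gradient is A^T (A V - I). The step must be small enough for the
    given matrix (below 2 / largest eigenvalue of A^T A) to converge.
    """
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    v = [[0.0] * n for _ in range(n)]
    a_t = transpose(a)
    for _ in range(iters):
        av = matmul(a, v)
        residual = [[x - y for x, y in zip(r_av, r_id)]
                    for r_av, r_id in zip(av, identity)]
        grad = matmul(a_t, residual)
        v = [[x - step * g for x, g in zip(r_v, r_g)]
             for r_v, r_g in zip(v, grad)]
    return v


# Example matrix (hypothetical); its exact inverse is [[0.6, -0.2], [-0.2, 0.4]].
a = [[2.0, 1.0], [1.0, 3.0]]
v = invert_by_descent(a)
print(v)
```

Replacing the identity target with a right-hand-side vector gives the same descent for a system of linear equations, which mirrors the "slight modifications" the abstract mentions.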


Neural Net and Traditional Classifiers

Neural Information Processing Systems

Previous work on nets with continuous-valued inputs led to generative procedures to construct convex decision regions with two-layer perceptrons (one hidden layer) and arbitrary decision regions with three-layer perceptrons (two hidden layers). Here we demonstrate that two-layer perceptron classifiers trained with back propagation can form both convex and disjoint decision regions. Such classifiers are robust, train rapidly, and provide good performance with simple decision regions. When complex decision regions are required, however, convergence time can be excessively long and performance is often no better than that of k-nearest neighbor classifiers. Three neural net classifiers are presented that provide more rapid training under such situations. Two use fixed weights in the first one or two layers and are similar to classifiers that estimate probability density functions using histograms. A third "feature map classifier" uses both unsupervised and supervised training. It provides good performance with little supervised training in situations such as speech recognition where much unlabeled training data is available. The architecture of this classifier can be used to implement a neural net k-nearest neighbor classifier.
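For reference, the conventional k-nearest neighbor classifier used as a baseline above can be sketched in a few lines. This is the standard non-neural algorithm, not the neural net implementation the abstract describes; the function name `knn_classify`, the squared Euclidean distance, and the toy data are assumptions of this sketch.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Label a query point by majority vote among its k nearest training points.

    train is a list of (features, label) pairs; distance is squared Euclidean.
    """
    def sq_dist(features):
        return sum((a - b) ** 2 for a, b in zip(features, query))

    nearest = sorted(train, key=lambda item: sq_dist(item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Two well-separated clusters (hypothetical data).
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_classify(train, (0.2, 0.1)))
print(knn_classify(train, (5.5, 5.2)))
```

No training phase is needed, which is why such classifiers are a natural comparison point when back propagation converges slowly on complex decision regions.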


Experimental Demonstrations of Optical Neural Computers

Neural Information Processing Systems

The high interconnectivity required by neural computers can be simply implemented in optics because channels for optical signals may be superimposed in three dimensions with little or no cross coupling. Since these channels may be formed holographically, optical neural systems can be designed to create and maintain interconnections very simply. Thus the optical system designer can to a large extent avoid the analytical and topological problems of determining individual interconnections for a given neural system and constructing physical paths for these interconnections. An archetypical design for a single layer of an optical neural computer is shown in Figure 1. Nonlinear thresholding elements, neurons, are arranged on two-dimensional planes which are interconnected via the third dimension by holographic elements. The key concerns in implementing this design involve the need for suitable nonlinearities for the neural planes and high-capacity, easily modifiable holographic elements. While it is possible to implement the neural function using entirely optical nonlinearities, for example using etalon arrays, optoelectronic two-dimensional spatial light modulators (2D SLMs) suitable for this purpose are more readily available.


Minkowski-r Back-Propagation: Learning in Connectionist Models with Non-Euclidian Error Signals

Neural Information Processing Systems

It can be shown that neural-like networks containing a single hidden layer of nonlinear activation units can learn to do a piece-wise linear partitioning of a feature space [2]. One result of such a partitioning is a complex gradient surface on which decisions about new input stimuli will be made. The generalization, categorization and clustering properties of the network are therefore determined by this mapping of input stimuli to this gradient surface in the output space. This gradient surface is a function of the conditional probability distributions of the output vectors given the input feature vectors as well as a function of the error relating the teacher signal and output.
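The excerpt does not state the error function, but the title suggests replacing the usual squared error with a Minkowski-r error. The sketch below assumes the form E = (1/r) * sum |y - t|^r, where y is the output and t the teacher signal, and shows the corresponding output-layer error signal. The function names and the restriction to r >= 1 are assumptions of this sketch.

```python
def minkowski_error(outputs, targets, r=2.0):
    """Assumed Minkowski-r error: (1/r) * sum of |y - t|**r."""
    if r < 1:
        raise ValueError("this sketch assumes r >= 1")
    return sum(abs(y - t) ** r for y, t in zip(outputs, targets)) / r


def minkowski_error_grad(outputs, targets, r=2.0):
    """Derivative of the error with respect to each output.

    dE/dy = sign(y - t) * |y - t|**(r - 1); for r = 2 this is simply y - t,
    the familiar back-propagation error signal.
    """
    if r < 1:
        raise ValueError("this sketch assumes r >= 1")
    grads = []
    for y, t in zip(outputs, targets):
        diff = y - t
        sign = (diff > 0) - (diff < 0)
        grads.append(sign * abs(diff) ** (r - 1))
    return grads


outputs, targets = [1.0, 3.0], [0.0, 1.0]
print(minkowski_error(outputs, targets, r=2.0), minkowski_error_grad(outputs, targets, r=2.0))
print(minkowski_error(outputs, targets, r=1.0), minkowski_error_grad(outputs, targets, r=1.0))
```

With r = 1 every nonzero deviation contributes an error signal of the same magnitude, so large deviations are weighted less heavily than under r = 2.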


The Connectivity Analysis of Simple Association

Neural Information Processing Systems

The Connectivity Analysis of Simple Association - or - How Many Connections Do You Need! Oregon Graduate Center, Beaverton, OR 97006. ABSTRACT The efficient realization, using current silicon technology, of Very Large Connection Networks (VLCN) with more than a billion connections requires that these networks exhibit a high degree of communication locality. Real neural networks exhibit significant locality, yet most connectionist/neural network models have little. In this paper, the connectivity requirements of a simple associative network are analyzed using communication theory. Several techniques based on communication theory are presented that improve the robustness of the network in the face of sparse, local interconnect structures. Also discussed are some potential problems when information is distributed too widely. INTRODUCTION Connectionist/neural network researchers are learning to program networks that exhibit a broad range of cognitive behavior.