VLSI Implementations of Learning and Memory Systems: A Review
A large number of VLSI implementations of neural network models have been reported, and the diversity of these implementations is noteworthy. This paper attempts to put a group of representative VLSI implementations in perspective by comparing and contrasting them. Design tradeoffs are discussed and some suggestions for the direction of future implementation efforts are made. Changing the way information is represented can be beneficial.
Interaction Among Ocularity, Retinotopy and On-center/Off-center Pathways During Development
The development of projections from the retinas to the cortex is mathematically analyzed according to the previously proposed thermodynamic formulation of the self-organization of neural networks. Two models, incorporating three types of submodality in the visual afferent pathways, are considered: model (A), in which ocularity and retinotopy are treated separately, and model (B), in which on-center/off-center pathways are considered in addition to ocularity and retinotopy. Model (A) produces striped ocular dominance spatial patterns and ocular dominance histograms with a dip in the binocular bin. Model (B) produces spatially modulated irregular patterns and single-peaked histograms. Comparison of the simulated and observed results shows that the ocular dominance spatial patterns and histograms of models (A) and (B) agree closely with those seen in monkeys and cats.
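The thermodynamic formulation itself is beyond a short example, but the qualitative mechanism behind striped patterns can be shown with a toy model. The sketch below is a minimal 1-D caricature, assuming a hand-chosen Mexican-hat lateral interaction; all parameters are illustrative and none are taken from the paper.

```python
# Toy 1-D ocular dominance illustration (not the paper's thermodynamic
# formulation): ocularity values o_i in [-1, 1] evolve under a
# Mexican-hat lateral interaction, so sites segregate into alternating
# left-eye / right-eye dominance bands, a 1-D analogue of stripes.
import numpy as np

rng = np.random.default_rng(0)
n = 200                              # cortical sites
o = 0.01 * rng.standard_normal(n)    # near-binocular initial state

# Mexican-hat kernel: short-range excitation, longer-range inhibition.
x = np.arange(-20, 21)
kernel = np.exp(-x**2 / 8.0) - 0.5 * np.exp(-x**2 / 50.0)

for step in range(500):
    drive = np.convolve(o, kernel, mode="same")
    o += 0.05 * drive
    np.clip(o, -1.0, 1.0, out=o)     # saturate at the monocular extremes

# Sites with o > 0 are right-eye dominated, o < 0 left-eye dominated;
# the printed string shows alternating bands of R's and L's.
print("".join("R" if v > 0 else "L" for v in o))
```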
Learning Theory and Experiments with Competitive Networks
Bilbro, Griff L., Bout, David E. van den
We apply the theory of Tishby, Levin, and Solla (TLS) to two problems. First we analyze an elementary problem, for which we find the predictions consistent with conventional statistical results. Second we numerically examine the more realistic problem of training a competitive net to learn a probability density from samples. We find TLS useful for predicting average training behavior. Recently a theory of learning has been constructed which describes the learning of a relation from examples (Tishby, Levin, and Solla, 1989), (Schwartz, Samalam, Solla, and Denker, 1990). The original derivation relies on a statistical mechanics treatment of the probability of independent events in a system with a specified average value of an additive error function. The resulting theory is not restricted to learning relations, nor is it essentially statistical mechanical.
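The elementary problem analyzed in the abstract is not specified here, so the sketch below only illustrates the kind of check being described: for a problem with a known closed-form learning curve (estimating a Gaussian mean from m examples, where the expected generalization error is sigma^2/m), the empirically averaged error over many training sets can be compared against the prediction. All choices are illustrative assumptions, not the TLS formalism itself.

```python
# Empirical average learning curve vs. a closed-form prediction, for a
# toy problem: learn the mean of a zero-mean Gaussian from m examples.
# The expected generalization error of the sample mean is sigma^2 / m.
import numpy as np

rng = np.random.default_rng(1)
sigma, trials = 1.0, 2000

for m in (1, 2, 4, 8, 16, 32):
    samples = rng.normal(0.0, sigma, size=(trials, m))
    estimates = samples.mean(axis=1)      # the "trained model" per trial
    avg_error = np.mean(estimates**2)     # error against the true mean 0
    print(f"m={m:3d}  empirical={avg_error:.4f}  predicted={sigma**2 / m:.4f}")
```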
Language Induction by Phase Transition in Dynamical Recognizers
A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained" on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries: First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: Induction by phase transition. A small weight adjustment causes a "bifurcation" in the limit behavior of the network.
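The architecture can be summarized compactly: in a higher-order (second-order) recurrent network, each input symbol selects its own weight matrix, which maps the current state vector to the next, so the network is an input-driven dynamical system. The sketch below is a minimal hand-wired illustration of that state update; the weights, alphabet, and accept rule are invented for the example and are not trained as in the paper.

```python
# Minimal dynamical recognizer skeleton: each symbol of a 2-letter
# alphabet gates its own state-transition matrix, and a threshold on
# one state unit accepts or rejects the string. Weights are hand-set.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W = {
    "a": np.array([[ 4.0, -2.0], [1.0, 1.0]]),
    "b": np.array([[-4.0,  2.0], [1.0, 1.0]]),
}
b = np.array([-0.5, -1.0])            # bias shared across symbols

def accepts(string, z0=(0.5, 0.5), threshold=0.5):
    z = np.array(z0)
    for sym in string:
        z = sigmoid(W[sym] @ z + b)   # the input symbol selects the map
    return bool(z[0] > threshold)     # accept/reject on state unit 0

for s in ("a", "b", "ab", "ba", "aab"):
    print(s, accepts(s))
```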
The Recurrent Cascade-Correlation Architecture
Recurrent Cascade-Correlation (RCC) is a recurrent version of the Cascade-Correlation learning architecture of Fahlman and Lebiere [Fahlman, 1990]. RCC can learn from examples to map a sequence of inputs into a desired sequence of outputs. New hidden units with recurrent connections are added to the network as needed during training. In effect, the network builds up a finite-state machine tailored specifically for the current problem. RCC retains the advantages of Cascade-Correlation: fast learning, good generalization, automatic construction of a near-minimal multi-layered network, and incremental training. Initially the network contains only inputs, output units, and the connections between them.
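The key ingredient is that each added hidden unit receives, besides its feedforward inputs, a single self-recurrent connection, so its activation carries state across time steps; that is what lets the grown network behave like a finite-state machine. The sketch below shows only that state update for one unit, with hand-picked weights; candidate training and the cascading of units are omitted.

```python
# One Recurrent Cascade-Correlation style hidden unit: a sigmoid unit
# whose previous activation is fed back through a single self-weight v.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rcc_unit(inputs, w, v, bias):
    """Run one self-recurrent hidden unit over an input sequence."""
    h = 0.0                                        # state starts at rest
    states = []
    for x in inputs:
        h = sigmoid(np.dot(w, x) + v * h + bias)   # v * h is the recurrence
        states.append(h)
    return states

# Toy 1-D input sequence; weights are illustrative, not trained.
seq = [np.array([1.0]), np.array([0.0]), np.array([1.0]), np.array([1.0])]
print(rcc_unit(seq, w=np.array([2.0]), v=3.0, bias=-1.5))
```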
Simulation of the Neocognitron on a CCD Parallel Processing Architecture
Chuang, Michael L., Chiang, Alice M.
The neocognitron is a neural network for pattern recognition and feature extraction. An analog CCD parallel processing architecture developed at Lincoln Laboratory is particularly well suited to the computational requirements of shared-weight networks such as the neocognitron, and an implementation of the neocognitron using the CCD architecture was simulated. A modification to the neocognitron training procedure, which improves network performance under the limited arithmetic precision that the CCD architecture would impose, is presented.
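The abstract does not spell out the modified training procedure, so the sketch below only illustrates the constraint it responds to: weights can be maintained in full precision during training but must be stored and used at the limited precision an analog CCD device would impose. The 8-level uniform quantizer here is a hypothetical stand-in, not the paper's scheme.

```python
# Hypothetical limited-precision weight storage: updates happen on
# full-precision shadow weights, but the device sees quantized copies.
import numpy as np

def quantize(w, levels=8, w_max=1.0):
    """Round weights to `levels` uniformly spaced values in [-w_max, w_max]."""
    step = 2.0 * w_max / (levels - 1)
    idx = np.clip(np.round((w + w_max) / step), 0, levels - 1)
    return idx * step - w_max

rng = np.random.default_rng(2)
w_full = rng.uniform(-1, 1, size=5)   # full-precision shadow weights
w_dev = quantize(w_full)              # what the analog hardware represents
print(w_full)
print(w_dev)
```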
Convergence of a Neural Network Classifier
Baras, John S., LaVigna, Anthony
In this paper, we prove that the codebook vectors in the Learning Vector Quantization (LVQ) algorithm converge. We do this by showing that the learning algorithm performs stochastic approximation. Convergence is then obtained by identifying the appropriate conditions on the learning rate and on the underlying statistics of the classification problem. We also present a modification to the learning algorithm which, we argue, results in convergence of the LVQ error to the Bayes optimal error as the appropriate parameters become large.
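For concreteness, the update being analyzed is the standard LVQ rule: the codebook vector nearest a training sample moves toward the sample when their class labels agree and away when they disagree, with a learning rate a_t satisfying the usual stochastic-approximation (Robbins-Monro) conditions, sum a_t = infinity and sum a_t^2 < infinity. The sketch below is a minimal 1-D instance with assumed class distributions, not the paper's setup.

```python
# Minimal LVQ1 run in 1-D with a Robbins-Monro learning rate a_t = a0/(1+t).
import numpy as np

rng = np.random.default_rng(3)
codebook = np.array([-0.5, 0.5])   # one codebook vector per class
labels = np.array([0, 1])

for t in range(5000):
    cls = rng.integers(2)                            # draw a class label
    x = rng.normal(-1.0 if cls == 0 else 1.0, 0.5)   # class-conditional sample
    k = int(np.argmin(np.abs(codebook - x)))         # nearest codebook vector
    a_t = 0.5 / (1.0 + t)                            # sum a_t diverges, sum a_t^2 converges
    sign = 1.0 if labels[k] == cls else -1.0         # attract on match, repel on mismatch
    codebook[k] += sign * a_t * (x - codebook[k])

print(codebook)   # settles near representative points of the two classes
```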
Discrete Affine Wavelet Transforms For Analysis And Synthesis Of Feedforward Neural Networks
Pati, Y. C., Krishnaprasad, P. S.
In this paper we show that discrete affine wavelet transforms can provide a tool for the analysis and synthesis of standard feedforward neural networks. It is shown that wavelet frames for L^2(R) can be constructed based upon sigmoids. The spatio-spectral localization property of wavelets can be exploited in defining the topology and determining the weights of a feedforward network. Training a network constructed using the synthesis procedure described here involves minimization of a convex cost functional and therefore avoids pitfalls inherent in standard backpropagation algorithms. Extension of these methods to L^2(R^N) is also discussed.
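One simple way to see that sigmoids can generate wavelet-like functions is sketched below, using a second difference of shifted sigmoids (one such combination; the paper's exact construction may differ). The resulting function is odd, localized, and integrates to zero, the qualitative properties a mother wavelet needs, and its dilations and translations are exactly what a sigmoidal feedforward network can realize.

```python
# A sigmoid-built wavelet candidate: psi(x) = s(x+d) - 2 s(x) + s(x-d).
# It is odd, decays to 0 at both ends, and integrates to zero.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def psi(x, d=2.0):
    return sigmoid(x + d) - 2.0 * sigmoid(x) + sigmoid(x - d)

x = np.linspace(-10.0, 10.0, 2001)
vals = psi(x)
dx = x[1] - x[0]
print("integral ~", vals.sum() * dx)      # ~0: zero-mean (admissibility-style) check
print("max |psi| =", np.abs(vals).max())  # localized bump on each side of 0
```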
Adaptive Range Coding
Rosen, Bruce E., Goodwin, James M., Vidal, Jacques J.
Determination of nearly optimal, or at least adequate, regions is left as an additional task that would require that the system dynamics be analyzed, which is not always possible. To address this problem, we move region boundaries adaptively, progressively altering the initial partitioning to a more appropriate representation with no need for a priori knowledge. Unlike previous work (Michie, 1968), (Barto, 1983), (Anderson, 1982), which used fixed coders, this approach produces adaptive coders that contract and expand regions/ranges. During adaptation, frequently active regions/ranges contract, reducing the number of situations in which they will be activated, and increasing the chances that neighboring regions will receive input instead. This class of self-organization is discussed in Kohonen (Kohonen, 1984), (Ritter, 1986, 1988).
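A minimal sketch of that contraction rule, under assumed details (a 1-D input range, a fixed number of bins, and a simple pull-boundaries-toward-the-active-bin-midpoint update; none of these specifics are from the paper), is:

```python
# Adaptive range coding toy: each time a bin fires, its two interior
# boundaries are pulled toward the bin midpoint, so frequently active
# bins contract and their neighbors absorb the freed range.
import numpy as np

rng = np.random.default_rng(4)
bounds = np.linspace(-1.0, 1.0, 9)   # 8 equal bins over [-1, 1]
rate = 0.02

for _ in range(20000):
    x = np.clip(rng.normal(0.3, 0.15), -1.0, 1.0)   # nonuniform input stream
    k = int(np.clip(np.searchsorted(bounds, x) - 1, 0, len(bounds) - 2))
    mid = 0.5 * (bounds[k] + bounds[k + 1])
    if k > 0:                                 # endpoints of the range stay fixed
        bounds[k] += rate * (mid - bounds[k])
    if k + 1 < len(bounds) - 1:
        bounds[k + 1] += rate * (mid - bounds[k + 1])

print(np.round(bounds, 3))   # bins around 0.3 end up markedly narrower
```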