Language Induction by Phase Transition in Dynamical Recognizers

Neural Information Processing Systems

A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained" on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries: First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: Induction by phase transition. A small weight adjustment causes a "bifurcation" in the limit behavior of the network.
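For a concrete picture of the architecture in question, the sketch below shows a second-order (higher-order) recurrent state update of the kind used in dynamical recognizers: the next state is a squashed bilinear function of the current state and the current input symbol. The class name, sizes, and random weights are illustrative assumptions, not the paper's trained network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SecondOrderRecognizer:
    """Minimal sketch of a higher-order (second-order) recurrent recognizer.
    All sizes and weights are illustrative, not taken from the paper."""

    def __init__(self, n_state=4, n_symbols=2, seed=0):
        rng = np.random.default_rng(seed)
        # W[i, j, k]: weight from (state j, input symbol k) to next-state unit i
        self.W = rng.normal(scale=0.5, size=(n_state, n_state, n_symbols))
        self.s0 = np.full(n_state, 0.5)          # initial state vector
        self.readout = rng.normal(size=n_state)  # accept/reject unit

    def run(self, symbols):
        """symbols: sequence of integer symbol indices."""
        s = self.s0
        for k in symbols:
            s = sigmoid(self.W[:, :, k] @ s)      # second-order state update
        return sigmoid(self.readout @ s)          # > 0.5 read as "accept"

# Example: score a few binary strings with random (untrained) weights.
net = SecondOrderRecognizer()
for string in ([0, 1, 0], [1, 1, 1, 1]):
    print(string, net.run(string))
```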


The Recurrent Cascade-Correlation Architecture

Neural Information Processing Systems

Recurrent Cascade-Correlation (RCC) is a recurrent version of the Cascade-Correlation learning architecture of Fahlman and Lebiere [Fahlman, 1990]. RCC can learn from examples to map a sequence of inputs into a desired sequence of outputs. New hidden units with recurrent connections are added to the network as needed during training. In effect, the network builds up a finite-state machine tailored specifically for the current problem. RCC retains the advantages of Cascade-Correlation: fast learning, good generalization, automatic construction of a near-minimal multi-layered network, and incremental training. Initially the network contains only inputs, output units, and the connections between them.
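To make the constructive scheme concrete, the following sketch shows the kind of network RCC grows: hidden units added one at a time, each fed by the external inputs, all earlier hidden units, and its own previous activation through a self-recurrent weight. The candidate-training and weight-freezing phases of Cascade-Correlation are omitted, and all names and weight values are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RCCSketch:
    """Structural sketch of a Recurrent Cascade-Correlation network:
    only the forward pass over a sequence is shown."""

    def __init__(self, n_in, n_out, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_in = n_in
        self.hidden = []                          # list of (input weights, self weight)
        self.W_out = self.rng.normal(size=(n_out, n_in))

    def add_hidden_unit(self):
        fan_in = self.n_in + len(self.hidden)     # inputs + earlier hidden units
        w_in = self.rng.normal(size=fan_in)
        w_self = self.rng.normal()                # recurrent self-connection
        self.hidden.append((w_in, w_self))
        # the output layer now also sees the new hidden unit
        extra = self.rng.normal(size=(self.W_out.shape[0], 1))
        self.W_out = np.hstack([self.W_out, extra])

    def run(self, sequence):
        states = np.zeros(len(self.hidden))       # previous hidden activations
        outputs = []
        for x in sequence:
            acts = []
            for i, (w_in, w_self) in enumerate(self.hidden):
                pre = np.concatenate([x, np.array(acts)])
                acts.append(sigmoid(w_in @ pre + w_self * states[i]))
            states = np.array(acts)
            outputs.append(sigmoid(self.W_out @ np.concatenate([x, states])))
        return outputs

net = RCCSketch(n_in=2, n_out=1)
net.add_hidden_unit()
net.add_hidden_unit()
print(net.run([np.array([1.0, 0.0]), np.array([0.0, 1.0])]))
```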


Simulation of the Neocognitron on a CCD Parallel Processing Architecture

Neural Information Processing Systems

The neocognitron is a neural network for pattern recognition and feature extraction. An analog CCD parallel processing architecture developed at Lincoln Laboratory is particularly well suited to the computational requirements of shared-weight networks such as the neocognitron, and implementation of the neocognitron using the CCD architecture was simulated. A modification to the neocognitron training procedure, which improves network performance under the limited arithmetic precision that would be imposed by the CCD architecture, is presented.
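One way to get a feel for the precision issue is to quantize weights to a small number of levels before they are used, as a crude stand-in for the limited arithmetic precision an analog CCD device would impose. The bit width and clipping range below are arbitrary assumptions, and the paper's actual modification to the training procedure is not reproduced here.

```python
import numpy as np

def quantize(w, n_bits=8, w_max=1.0):
    """Round weights to a fixed number of levels within [-w_max, w_max],
    mimicking limited arithmetic precision (illustrative only)."""
    levels = 2 ** n_bits - 1
    w_clipped = np.clip(w, -w_max, w_max)
    scaled = (w_clipped + w_max) / (2 * w_max)        # map to [0, 1]
    return np.round(scaled * levels) / levels * (2 * w_max) - w_max

w = np.random.default_rng(0).normal(scale=0.3, size=5)
print(w)
print(quantize(w, n_bits=4))
```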


Convergence of a Neural Network Classifier

Neural Information Processing Systems

In this paper, we prove that the vectors in the LVQ learning algorithm converge. We do this by showing that the learning algorithm performs stochastic approximation. Convergence is then obtained by identifying the appropriate conditions on the learning rate and on the underlying statistics of the classification problem. We also present a modification to the learning algorithm which we argue results in convergence of the LVQ error to the Bayesian optimal error as the appropriate parameters become large.
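For reference, the standard LVQ1 update that this kind of analysis treats as a stochastic-approximation step looks roughly as follows; the learning-rate schedule and the paper's precise convergence conditions are not reproduced here.

```python
import numpy as np

def lvq1_step(prototypes, proto_labels, x, y, lr):
    """One LVQ1-style update: move the nearest prototype toward the sample
    if the class labels match, away from it if they do not. Shown only to
    make the stochastic-approximation framing concrete."""
    i = np.argmin(np.linalg.norm(prototypes - x, axis=1))
    sign = 1.0 if proto_labels[i] == y else -1.0
    prototypes[i] += sign * lr * (x - prototypes[i])
    return prototypes

protos = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.array([0, 1])
protos = lvq1_step(protos, labels, x=np.array([0.2, 0.1]), y=0, lr=0.05)
print(protos)
```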


Discrete Affine Wavelet Transforms For Analysis And Synthesis Of Feedforward Neural Networks

Neural Information Processing Systems

In this paper we show that discrete affine wavelet transforms can provide a tool for the analysis and synthesis of standard feedforward neural networks. It is shown that wavelet frames for L²(ℝ) can be constructed based upon sigmoids. The spatio-spectral localization property of wavelets can be exploited in defining the topology and determining the weights of a feedforward network. Training a network constructed using the synthesis procedure described here involves minimization of a convex cost functional and therefore avoids pitfalls inherent in standard backpropagation algorithms. Extension of these methods to L²(ℝᴺ) is also discussed.
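As a rough illustration of the construction, a zero-mean, localized "mother wavelet" can be formed from a combination of shifted sigmoids and then dilated and translated on a discrete affine grid. The particular combination below is an assumption made for illustration, not necessarily the frame construction used in the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_wavelet(x, d=1.0):
    """A zero-mean, localized function built as a second difference of
    shifted sigmoids (an illustrative 'mother wavelet' candidate)."""
    return sigmoid(x + d) - 2.0 * sigmoid(x) + sigmoid(x - d)

def dilated_translated(x, j, k, a=2.0, b=1.0):
    """Discrete affine family psi_{j,k}(x) = a**(j/2) * psi(a**j * x - k*b),
    the usual form of a discrete affine wavelet system."""
    return a ** (j / 2.0) * sigmoid_wavelet(a ** j * x - k * b)

x = np.linspace(-4, 4, 9)
print(dilated_translated(x, j=1, k=0))
```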


Adaptive Range Coding

Neural Information Processing Systems

Determination of nearly optimal, or at least adequate, regions is left as an additional task that would require that the system dynamics be analyzed, which is not always possible. To address this problem, we move region boundaries adaptively, progressively altering the initial partitioning to a more appropriate representation with no need for a priori knowledge. Unlike previous work (Michie, 1968), (Barto, 1983), (Anderson, 1982), which used fixed coders, this approach produces adaptive coders that contract and expand regions/ranges. During adaptation, frequently active regions/ranges contract, reducing the number of situations in which they will be activated, and increasing the chances that neighboring regions will receive input instead. This class of self-organization is discussed in Kohonen (Kohonen, 1984), (Ritter, 1986, 1988).
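A minimal sketch of the adaptation idea, assuming a one-dimensional input and a simple boundary-contraction rule (both illustrative choices, not the paper's exact update), is given below: the interval containing the current input contracts by pulling its boundaries toward the input, which implicitly expands the neighboring intervals.

```python
import numpy as np

def adapt_boundaries(boundaries, x, shrink=0.05):
    """Pull the interior boundaries of the region containing x toward x,
    contracting frequently activated regions (illustrative rule only)."""
    i = np.searchsorted(boundaries, x)        # region index of input x
    lo, hi = i - 1, i
    if lo > 0:                                # keep the outer edges fixed
        boundaries[lo] += shrink * (x - boundaries[lo])
    if 0 < hi < len(boundaries) - 1:
        boundaries[hi] += shrink * (x - boundaries[hi])
    return boundaries

bounds = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
for x in np.random.default_rng(0).uniform(0.4, 0.6, size=200):
    bounds = adapt_boundaries(bounds, x)
print(bounds)   # interior boundaries drift toward the frequently active range
```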


Phase-coupling in Two-Dimensional Networks of Interacting Oscillators

Neural Information Processing Systems

Coherent oscillatory activity in large networks of biological or artificial neural units may be a useful mechanism for coding information pertaining to a single perceptual object or for detailing regularities within a data set. We consider the dynamics of a large array of simple coupled oscillators under a variety of connection schemes. Of particular interest is the rapid and robust phase-locking that results from a "sparse" scheme where each oscillator is strongly coupled to a tiny, randomly selected, subset of its neighbors.
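The flavor of the sparse coupling scheme can be captured with a Kuramoto-style simulation in which each phase oscillator interacts with a small, randomly chosen subset of the others; all parameter values below are illustrative assumptions rather than those used in the paper.

```python
import numpy as np

def simulate(n=100, k=5, steps=200, dt=0.05, coupling=2.0, seed=0):
    """Each oscillator is coupled to k randomly chosen neighbors and its
    phase advances with a sine interaction. Returns the Kuramoto order
    parameter: values near 1 indicate phase locking."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(1.0, 0.05, size=n)            # natural frequencies
    theta = rng.uniform(0, 2 * np.pi, size=n)        # initial phases
    neighbors = np.array([rng.choice(n, size=k, replace=False) for _ in range(n)])
    for _ in range(steps):
        pull = np.sin(theta[neighbors] - theta[:, None]).sum(axis=1)
        theta = theta + dt * (omega + (coupling / k) * pull)
    return abs(np.exp(1j * theta).mean())

print(simulate())
```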



Design and Implementation of a High Speed CMAC Neural Network Using Programmable CMOS Logic Cell Arrays

Neural Information Processing Systems

A high speed implementation of the CMAC neural network was designed using dedicated CMOS logic. This technology was then used to implement two general purpose CMAC associative memory boards for the VME bus. Each board implements up to 8 independent CMAC networks with a total of one million adjustable weights. Each CMAC network can be configured to have from 1 to 512 integer inputs and from 1 to 8 integer outputs. Response times for typical CMAC networks are well below 1 millisecond, making the networks sufficiently fast for most robot control problems, and many pattern recognition and signal processing problems.
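The sketch below illustrates the CMAC computation itself: an integer input vector activates one cell in each of several offset, coarse tilings, and the output is the sum of the weights stored at those cells, with training spreading the error over the active weights. The table size, hashing scheme, and training rule are simplified assumptions and not the hardware design described in the paper.

```python
import numpy as np

class CMACSketch:
    """Illustrative CMAC associative memory with hashed, offset tilings."""

    def __init__(self, n_tilings=8, table_size=4096):
        self.n_tilings = n_tilings
        self.table_size = table_size
        self.w = np.zeros(table_size)

    def _active_cells(self, x):
        cells = []
        for t in range(self.n_tilings):
            coarse = tuple((xi + t) // self.n_tilings for xi in x)   # offset tiling
            cells.append(hash((t, coarse)) % self.table_size)        # hash to table
        return cells

    def predict(self, x):
        return sum(self.w[c] for c in self._active_cells(x))

    def train(self, x, target, lr=0.5):
        err = target - self.predict(x)
        for c in self._active_cells(x):
            self.w[c] += lr * err / self.n_tilings

cmac = CMACSketch()
for _ in range(20):
    cmac.train((10, 20), target=1.0)
print(cmac.predict((10, 20)), cmac.predict((11, 21)))  # nearby inputs generalize
```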


A Comparative Study of the Practical Characteristics of Neural Network and Conventional Pattern Classifiers

Neural Information Processing Systems

Seven different neural network and conventional pattern classifiers were compared using artificial and speech recognition tasks. High order polynomial GMDH classifiers typically provided intermediate error rates and often required long training times and large amounts of memory. In addition, the decision regions formed did not generalize well to regions of the input space with little training data. Radial basis function classifiers generalized well in high dimensional spaces, and provided low error rates with training times that were much less than those of back-propagation classifiers (Lee and Lippmann, 1989). Gaussian mixture classifiers provided good performance when the numbers and types of mixtures were selected carefully to model class densities well. Linear tree classifiers were the most computationally efficient but performed poorly with high dimensionality inputs and when the number of training patterns was small. KD-tree classifiers reduced classification time by a factor of four over conventional KNN classifiers for low 2-input dimension problems. They provided little or no reduction in classification time for high 22-input dimension problems. Improved condensed KNN classifiers reduced memory requirements over conventional KNN classifiers by a factor of two to fifteen for all problems, without increasing the error rate significantly.