

A Lagrangian Formulation For Optical Backpropagation Training In Kerr-Type Optical Networks

Neural Information Processing Systems

A training method based on a form of continuous, spatially distributed optical error back-propagation is presented for an all-optical network composed of nondiscrete neurons and weighted interconnections. The all-optical network is feed-forward and is composed of thin layers of a Kerr-type self-focusing/defocusing nonlinear optical material. The training method is derived from a Lagrangian formulation of the constrained minimization of the network error at the output. This leads to a formulation that describes training as a calculation of the distributed error of the optical signal at the output, which is then reflected back through the device to assign a spatially distributed error to the internal layers. This error is then used to modify the internal weighting values. Results from several computer simulations of the training are presented, and a simple optical table demonstration of the network is discussed.


Pulsestream Synapses with Non-Volatile Analogue Amorphous-Silicon Memories

Neural Information Processing Systems

This paper presents results from the first use of neural networks for the real-time feedback control of high-temperature plasmas in a tokamak fusion experiment. The tokamak is currently the principal experimental device for research into the magnetic confinement approach to controlled fusion. In the tokamak, hydrogen plasmas, at temperatures of up to 100 million K, are confined by strong magnetic fields. Accurate control of the position and shape of the plasma boundary requires real-time feedback control of the magnetic field structure on a timescale of a few tens of microseconds. Software simulations have demonstrated that a neural network approach can give significantly better performance than the linear technique currently used on most tokamak experiments. The practical application of the neural network approach requires high-speed hardware, for which a fully parallel implementation of the multilayer perceptron, using a hybrid of digital and analogue technology, has been developed.



ICEG Morphology Classification using an Analogue VLSI Neural Network

Neural Information Processing Systems

An analogue VLSI neural network has been designed and tested to perform cardiac morphology classification tasks. Analogue techniques were chosen to meet the strict power and area requirements of an Implantable Cardioverter Defibrillator (ICD) system. The robustness of the neural network architecture reduces the impact of noise, drift and offsets inherent in analogue approaches. The network is a 10:6:3 multi-layer perceptron with on-chip digital weight storage, a bucket brigade input to feed the Intracardiac Electrogram (ICEG) to the network, and a winner-take-all circuit at the output. The network was trained in loop and included a commercial ICD in the signal processing path. The system has successfully distinguished arrhythmia for different patients with better than 90% true positive and true negative detections for dangerous rhythms which cannot be detected by present ICDs. The chip was implemented in 1.2 µm CMOS and consumes less than 200 nW maximum average power in an area of 2.2 x 2.2 mm².


Active Learning with Statistical Models

Neural Information Processing Systems

For many types of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992; Cohn, 1994]. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.


An experimental comparison of recurrent neural networks

Neural Information Processing Systems

Many different discrete-time recurrent neural network architectures have been proposed. However, there has been virtually no effort to compare these architectures experimentally. In this paper we review and categorize many of these architectures and compare how they perform on various classes of simple problems including grammatical inference and nonlinear system identification.


Efficient Methods for Dealing with Missing Data in Supervised Learning

Neural Information Processing Systems

In many applications it is important to know how to react if the available information is incomplete, if sensors fail or if sources of information become

At the time of the research for this paper, a visiting researcher at the Center for Biological and Computational Learning, MIT.


Classifying with Gaussian Mixtures and Clusters

Neural Information Processing Systems

In this paper, we derive classifiers which are winner-take-all (WTA) approximations to a Bayes classifier with Gaussian mixtures for class conditional densities. The derived classifiers include clustering-based algorithms like LVQ and k-Means. We propose a constrained rank Gaussian mixtures model and derive a WTA algorithm for it. Our experiments with two speech classification tasks indicate that the constrained rank model and the WTA approximations improve the performance over the unconstrained models.

1 Introduction

A classifier assigns vectors from R^n (n-dimensional feature space) to one of K classes, partitioning the feature space into a set of K disjoint regions. A Bayesian classifier builds the partition based on a model of the class conditional probability densities of the inputs (the partition is optimal for the given model).
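The relationship between the full Bayes rule and a winner-take-all approximation can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the toy mixtures, and the choice to replace the sum over mixture components by its largest term are assumptions made here for illustration only.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian N(mean, cov) at point x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def class_scores(x, mixtures, priors, wta=False):
    """Return one log score per class (hypothetical helper).

    mixtures[k] is a list of (weight, mean, cov) components for class k.
    wta=False sums the likelihood over all components of the class;
    wta=True keeps only the best-scoring component (the assumed WTA step).
    """
    scores = []
    for comps, prior in zip(mixtures, priors):
        terms = np.array([np.log(w) + log_gaussian(x, m, c) for w, m, c in comps])
        lik = terms.max() if wta else np.logaddexp.reduce(terms)
        scores.append(np.log(prior) + lik)
    return np.array(scores)

def classify(x, mixtures, priors, wta=False):
    """Pick the class with the highest log score."""
    return int(np.argmax(class_scores(x, mixtures, priors, wta)))

# Toy data (assumed): two classes in R^2, two unit-covariance components each.
mixtures = [
    [(0.5, np.array([0.0, 0.0]), np.eye(2)), (0.5, np.array([0.0, 3.0]), np.eye(2))],
    [(0.5, np.array([4.0, 0.0]), np.eye(2)), (0.5, np.array([4.0, 3.0]), np.eye(2))],
]
priors = [0.5, 0.5]
x = np.array([0.5, 0.2])

label_full = classify(x, mixtures, priors)           # full mixture likelihood
label_wta = classify(x, mixtures, priors, wta=True)  # winner-take-all variant
```

In this toy setting, with equal priors, equal weights and identical spherical covariances, the WTA rule reduces to assigning x to the class of the nearest component mean, which is one way to see why nearest-prototype methods such as k-Means appear among the derived classifiers.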


Recurrent Networks: Second Order Properties and Pruning

Neural Information Processing Systems

Second order properties of cost functions for recurrent networks are investigated. We analyze a layered fully recurrent architecture; the virtue of this architecture is that it features the conventional feedforward architecture as a special case. A detailed description of recursive computation of the full Hessian of the network cost function is provided. We discuss the possibility of invoking simplifying approximations of the Hessian and show how weight decays iron the cost function and thereby greatly assist training. We present tentative pruning results, using Hassibi et al.'s Optimal Brain Surgeon, demonstrating that recurrent networks can construct an efficient internal memory.

1 LEARNING IN RECURRENT NETWORKS

Time series processing is an important application area for neural networks and numerous architectures have been suggested, see e.g. (Weigend and Gershenfeld, 94). The most general structure is a fully recurrent network and it may be adapted using Real Time Recurrent Learning (RTRL) suggested by (Williams and Zipser, 89). By invoking a recurrent network, the length of the network memory can be adapted to the given time series, while it is fixed for the conventional lag-space net (Weigend et al., 90). In forecasting, however, feedforward architectures remain the most popular structures; only a few applications are reported based on the Williams & Zipser approach.


A Rapid Graph-based Method for Arbitrary Transformation-Invariant Pattern Classification

Neural Information Processing Systems

We present a graph-based method for rapid, accurate search through prototypes for transformation-invariant pattern classification. Our method has in theory the same recognition accuracy as other recent methods based on "tangent distance" [Simard et al., 1994], since it uses the same categorization rule. Nevertheless ours is significantly faster during classification because far fewer tangent distances need be computed. Crucial to the success of our system are 1) a novel graph architecture in which transformation constraints and geometric relationships among prototypes are encoded during learning, and 2) an improved graph search criterion, used during classification. These architectural insights are applicable to a wide range of problem domains. Here we demonstrate that on a handwriting recognition task, a basic implementation of our system requires less than half the computation of the Euclidean sorting method.

1 INTRODUCTION

In recent years, the crucial issue of incorporating invariances into networks for pattern recognition has received increased attention, most especially due to the work of

Alessandro Sperduti, David G. Stork
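For readers unfamiliar with tangent distance, the following is a minimal sketch of a simplified one-sided variant, in which only the prototype carries a tangent plane. It is not the graph search described in the abstract, and the function name, the toy vectors and the least-squares formulation are assumptions made here for illustration.

```python
import numpy as np

def one_sided_tangent_distance(x, prototype, tangents):
    """Distance from x to the plane {prototype + tangents @ a}.

    tangents has one column per transformation direction (assumed to be
    precomputed, e.g. by finite differences of small transformations).
    The coefficients a are found by linear least squares.
    """
    a, *_ = np.linalg.lstsq(tangents, x - prototype, rcond=None)
    residual = x - prototype - tangents @ a
    return float(np.linalg.norm(residual))

# Toy example (assumed data): the pattern differs from the prototype mostly
# along the tangent direction, so the tangent distance is much smaller than
# the plain Euclidean distance.
prototype = np.zeros(3)
tangents = np.array([[1.0], [0.0], [0.0]])  # a single tangent vector
x = np.array([5.0, 1.0, 0.0])

euclidean = float(np.linalg.norm(x - prototype))
tangent = one_sided_tangent_distance(x, prototype, tangents)
```

Each such distance requires solving a small least-squares problem, which is why a method that needs far fewer of these evaluations per classification can be substantially cheaper.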