Goto

Collaborating Authors

 Perceptrons


A Novel Channel Selection System in Cochlear Implants Using Artificial Neural Network

Neural Information Processing Systems

State-of-the-art speech processors in cochlear implants perform channel selection using a spectral maxima strategy. This strategy can lead to confusions when high frequency features are needed to discriminate between sounds. We present in this paper a novel channel selection strategy based upon pattern recognition which al(cid:173) lows "smart" channel selections to be made. The proposed strategy is implemented using multi-layer perceptrons trained on a multi(cid:173) speaker labelled speech database. The input to the network are the energy coefficients of N energy channels.


On the Computational Power of Noisy Spiking Neurons

Neural Information Processing Systems

It has remained unknown whether one can in principle carry out reliable digital computations with networks of biologically realistic models for neurons. This article presents rigorous constructions for simulating in real-time arbitrary given boolean circuits and fi(cid:173) nite automata with arbitrarily high reliability by networks of noisy spiking neurons. In addition we show that with the help of "shunting inhibition" even networks of very unreliable spiking neurons can simulate in real-time any McCulloch-Pitts neuron (or "threshold gate"), and therefore any multilayer perceptron (or "threshold circuit") in a reliable manner. These constructions provide a possible explana(cid:173) tion for the fact that biological neural systems can carry out quite complex computations within 100 msec. It turns out that the assumption that these constructions require about the shape of the EPSP's and the behaviour of the noise are surprisingly weak.


The Gamma MLP for Speech Phoneme Recognition

Neural Information Processing Systems

We define a Gamma multi-layer perceptron (MLP) as an MLP with the usual synaptic weights replaced by gamma filters (as pro(cid:173) posed by de Vries and Principe (de Vries and Principe, 1992)) and associated gain terms throughout all layers. We derive gradient descent update equations and apply the model to the recognition of speech phonemes. We find that both the inclusion of gamma filters in all layers, and the inclusion of synaptic gains, improves the performance of the Gamma MLP. We compare the Gamma MLP with TDNN, Back-Tsoi FIR MLP, and Back-Tsoi I1R MLP architectures, and a local approximation scheme. We find that the Gamma MLP results in an substantial reduction in error rates.


488 Solutions to the XOR Problem

Neural Information Processing Systems

A globally convergent homotopy method is defined that is capable of sequentially producing large numbers of stationary points of the multi-layer perceptron mean-squared error surface. Using this al(cid:173) gorithm large subsets of the stationary points of two test problems are found. It is shown empirically that the MLP neural network appears to have an extreme ratio of saddle points compared to local minima, and that even small neural network problems have extremely large numbers of solutions.


A Mean Field Algorithm for Bayes Learning in Large Feed-forward Neural Networks

Neural Information Processing Systems

We present an algorithm which is expected to realise Bayes optimal predictions in large feed-forward networks. It is based on mean field methods developed within statistical mechanics of disordered sys(cid:173) tems. We give a derivation for the single layer perceptron and show that the algorithm also provides a leave-one-out cross-validation test of the predictions.


The Efficiency and the Robustness of Natural Gradient Descent Learning Rule

Neural Information Processing Systems

The inverse of the Fisher information matrix is used in the natu(cid:173) ral gradient descent algorithm to train single-layer and multi-layer perceptrons. We have discovered a new scheme to represent the Fisher information matrix of a stochastic multi-layer perceptron. Based on this scheme, we have designed an algorithm to compute the natural gradient. When the input dimension n is much larger than the number of hidden neurons, the complexity of this algo(cid:173) rithm is of order O(n). It is confirmed by simulations that the natural gradient descent learning rule is not only efficient but also robust.


Analytical Study of the Interplay between Architecture and Predictability

Neural Information Processing Systems

We study model feed forward networks as time series predictors in the stationary limit. The focus is on complex, yet non-chaotic, behavior. The main question we address is whether the asymptotic behavior is governed by the architecture, regardless the details of the weights . We find hierarchies among classes of architectures with respect to the attract or dimension of the long term sequence they are capable of generating; larger number of hidden units can generate higher dimensional attractors. In the case of a perceptron, we develop the stationary solution for general weights, and show that the flow is typically one dimensional.


Data-Dependent Structural Risk Minimization for Perceptron Decision Trees

Neural Information Processing Systems

A novel neural network model of pre-attention processing in visual(cid:173) search tasks is presented. Using displays of line orientations taken from Wolfe's experiments [1992], we study the hypothesis that the distinction between parallel versus serial processes arises from the availability of global information in the internal representations of the visual scene. The model operates in two phases. First, the visual displays are compressed via principal-component-analysis. Second, the compressed data is processed by a target detector mod(cid:173) ule in order to identify the existence of a target in the display.


Radial Basis Functions: A Bayesian Treatment

Neural Information Processing Systems

Bayesian methods have been successfully applied to regression and classification problems in multi-layer perceptrons. We present a novel application of Bayesian techniques to Radial Basis Function networks by developing a Gaussian approximation to the posterior distribution which, for fixed basis function widths, is analytic in the parameters. The setting of regularization constants by cross(cid:173) validation is wasteful as only a single optimal parameter estimate is retained. We treat this issue by assigning prior distributions to these constants, which are then adapted in light of the data under a simple re-estimation formula.


Use of a Multi-Layer Perceptron to Predict Malignancy in Ovarian Tumors

Neural Information Processing Systems

The objective of RL is to find -thanks to a reinforcement signal- an optimal strategy for solving a dynamical control problem. Here we sudy the continuous time, con(cid:173) tinuous state-space stochastic case, which covers a wide variety of control problems including target, viability, optimization problems (see [FS93], [KP95])}or which a formalism is the following.