 Perceptrons


Principled Architecture Selection for Neural Networks: Application to Corporate Bond Rating Prediction

Neural Information Processing Systems

The notion of generalization ability can be defined precisely as the prediction risk, the expected performance of an estimator in predicting new observations. In this paper, we propose the prediction risk as a measure of the generalization ability of multi-layer perceptron networks and use it to select an optimal network architecture from a set of possible architectures. We also propose a heuristic search strategy to explore the space of possible architectures. The prediction risk is estimated from the available data; here we estimate the prediction risk by v-fold cross-validation and by asymptotic approximations of generalized cross-validation or Akaike's final prediction error. We apply the technique to the problem of predicting corporate bond ratings.
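The two estimators named in this abstract can be illustrated with a minimal sketch. It is not the paper's procedure: a linear least-squares model stands in for the network, squared error is assumed as the loss, and the function names `vfold_prediction_risk` and `final_prediction_error` are hypothetical.

```python
import numpy as np

def vfold_prediction_risk(X, y, v=5, seed=0):
    """Estimate prediction risk (mean squared error on unseen data) by v-fold CV.

    Assumption: a linear least-squares fit stands in for the trained network.
    """
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, v)
    squared_error = 0.0
    for held_out in folds:
        train = np.setdiff1d(idx, held_out)          # all indices not held out
        w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        squared_error += np.sum((X[held_out] @ w - y[held_out]) ** 2)
    return squared_error / n                         # every sample is held out once

def final_prediction_error(train_mse, n_samples, n_params):
    """Akaike's final prediction error: training MSE scaled by (n + p) / (n - p)."""
    return train_mse * (n_samples + n_params) / (n_samples - n_params)
```

Under these assumptions, architecture selection amounts to computing one of the two estimates for each candidate model and keeping the candidate with the smallest value.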


Context-Dependent Multiple Distribution Phonetic Modeling with MLPs

Neural Information Processing Systems

A number of hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) speech recognition systems have been developed in recent years (Morgan and Bourlard). The new training procedure smooths MLPs trained at different degrees of context dependence in order to obtain a robust estimate of the context-dependent probabilities. Tests with the DARPA Resource Management database have shown substantial advantages of the context-dependent MLPs over earlier context-independent MLPs.


Some Estimates of Necessary Number of Connections and Hidden Units for Feed-Forward Networks

Neural Information Processing Systems

The feed-forward networks with fixed hidden units (FHU-networks) are compared against the category of remaining feed-forward networks with variable hidden units (VHU-networks). Two broad classes of tasks on a finite domain X ⊂ R^n are considered: approximation of every function from an open subset of functions on X and representation of every dichotomy of X. For the first task it is found that both network categories require the same minimal number of synaptic weights. For the second task and X in general position it is shown that VHU-networks with threshold logic hidden units can have approximately 1/n times fewer hidden units than any FHU-network must have.


Input Reconstruction Reliability Estimation

Neural Information Processing Systems

This paper describes a technique called Input Reconstruction Reliability Estimation (IRRE) for determining the response reliability of a restricted class of multi-layer perceptrons (MLPs). The technique uses a network's ability to accurately encode the input pattern in its internal representation as a measure of its reliability. The more accurately a network is able to reconstruct the input pattern from its internal representation, the more reliable the network is considered to be. IRRE provides a good estimate of the reliability of MLPs trained for autonomous driving. Results are presented in which the reliability estimates provided by IRRE are used to select between networks trained for different driving situations.


Reinforcement Learning Applied to Linear Quadratic Regulation

Neural Information Processing Systems

Recent research on reinforcement learning has focused on algorithms based on the principles of Dynamic Programming (DP). One of the most promising areas of application for these algorithms is the control of dynamical systems, and some impressive results have been achieved. However, there are significant gaps between practice and theory. In particular, there are no convergence proofs for problems with continuous state and action spaces, or for systems involving non-linear function approximators (such as multilayer perceptrons). This paper presents research applying DP-based reinforcement learning theory to Linear Quadratic Regulation (LQR), an important class of control problems involving continuous state and action spaces and requiring a simple type of non-linear function approximator. We describe an algorithm based on Q-learning that is proven to converge to the optimal controller for a large class of LQR problems.
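The "simple type of non-linear function approximator" can be made concrete with a sketch, under assumptions the abstract does not state: undiscounted linear dynamics x' = A x + B u, stage cost x^T E x + u^T F u, and a Q-function of quadratic form Q(x, u) = [x; u]^T H [x; u]. The code below is a model-based illustration of that structure, not the paper's Q-learning algorithm; all names are hypothetical.

```python
import numpy as np

def q_matrix(A, B, E, F, P):
    """H such that Q(x, u) = [x; u]^T H [x; u] when the cost-to-go is x^T P x."""
    return np.block([
        [E + A.T @ P @ A, A.T @ P @ B],
        [B.T @ P @ A,     F + B.T @ P @ B],
    ])

def greedy_gain(H, n_state):
    """Gain K of the greedy policy u = -K x, which minimises the quadratic Q over u."""
    H_ux = H[n_state:, :n_state]
    H_uu = H[n_state:, n_state:]
    return np.linalg.solve(H_uu, H_ux)

def riccati_value_iteration(A, B, E, F, iters=500):
    """Alternate Q-matrix construction and greedy improvement (model-based stand-in)."""
    n = A.shape[0]
    P = E.copy()
    for _ in range(iters):
        H = q_matrix(A, B, E, F, P)
        K = greedy_gain(H, n)
        P = H[:n, :n] - H[:n, n:] @ K   # cost-to-go under the greedy policy
    return P, K
```

A learning method in this setting would estimate the entries of H from observed transitions instead of computing them from A and B; the greedy-gain step stays the same.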


Synaptic Weight Noise During MLP Learning Enhances Fault-Tolerance, Generalization and Learning Trajectory

Neural Information Processing Systems

We analyse the effects of analog noise on the synaptic arithmetic during MultiLayer Perceptron training, by expanding the cost function to include noise-mediated penalty terms. Predictions are made in the light of these calculations which suggest that fault tolerance, generalisation ability and learning trajectory should be improved by such noise-injection. Extensive simulation experiments on two distinct classification problems substantiate the claims. The results appear to be perfectly general for all training schemes where weights are adjusted incrementally, and have wide-ranging implications for all applications, particularly those involving "inaccurate" analog neural VLSI.
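A minimal sketch of noise injection during incremental training follows. Several choices here are assumptions rather than details from the abstract: the noise is multiplicative Gaussian, a single logistic unit stands in for the MLP, and the function name `train_with_weight_noise` is hypothetical.

```python
import numpy as np

def train_with_weight_noise(X, y, noise_std=0.1, lr=0.5, epochs=200, seed=0):
    """Gradient descent on a logistic unit with noisy weights in each pass.

    Assumption: multiplicative Gaussian weight noise, resampled every epoch.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Perturb the weights for this pass only; the stored weights stay clean.
        w_noisy = w * (1.0 + noise_std * rng.normal(size=w.shape))
        p = 1.0 / (1.0 + np.exp(-(X @ w_noisy)))   # forward pass through noisy weights
        grad = X.T @ (p - y) / len(y)              # cross-entropy gradient at noisy weights
        w -= lr * grad                             # incremental update of clean weights
    return w
```

Setting `noise_std=0.0` recovers ordinary gradient descent, which gives a baseline for comparing fault tolerance and generalisation.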


Classification of Electroencephalogram using Artificial Neural Networks

Neural Information Processing Systems

In this paper, we will consider the problem of classifying electroencephalogram (EEG) signals of normal subjects, and subjects suffering from psychiatric disorder, e.g., obsessive compulsive disorder, schizophrenia, using a class of artificial neural networks, viz., multi-layer perceptron. It is shown that the multilayer perceptron is capable of classifying unseen test EEG signals to a high degree of accuracy.


Learning Temporal Dependencies in Connectionist Speech Recognition

Neural Information Processing Systems

In this paper, we discuss the nature of the time dependence currently employed in our systems using recurrent networks (RNs) and feed-forward multi-layer perceptrons (MLPs). In particular, we introduce local recurrences into an MLP to produce an enhanced input representation. This is in the form of an adaptive gamma filter and incorporates an automatic approach for learning temporal dependencies. We have experimented on a speaker-independent phone recognition task using the TIMIT database. Results using the gamma filtered input representation have shown improvement over the baseline MLP system.
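The gamma-filtered input representation can be sketched as follows. The sketch assumes the standard gamma memory recursion x_k(t) = (1 - mu) x_k(t-1) + mu x_{k-1}(t-1) and a fixed memory parameter mu. In the adaptive filter described above mu would be learned, which this sketch does not do. The function name `gamma_filter` is hypothetical.

```python
import numpy as np

def gamma_filter(signal, order=3, mu=0.5):
    """Return a (T, order + 1) array of gamma memory taps for a 1-D signal.

    Tap 0 is the raw input; tap k is a leaky, delayed copy of tap k - 1.
    With mu = 1 the taps reduce to an ordinary tapped delay line.
    """
    signal = np.asarray(signal, dtype=float)
    T = len(signal)
    taps = np.zeros((T, order + 1))
    taps[:, 0] = signal
    for t in range(1, T):
        for k in range(1, order + 1):
            taps[t, k] = (1 - mu) * taps[t - 1, k] + mu * taps[t - 1, k - 1]
    return taps
```

Each row of the returned array could serve as the input vector to an MLP at that time step; smaller values of mu trade temporal resolution for a longer memory.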


A Comparative Study of a Modified Bumptree Neural Network with Radial Basis Function Networks and the Standard Multi Layer Perceptron

Neural Information Processing Systems

Bumptrees are geometric data structures introduced by Omohundro (1991) to provide efficient access to a collection of functions on a Euclidean space of interest. We describe a modified bumptree structure that has been employed as a neural network classifier, and compare its performance on several classification tasks against that of radial basis function networks and the standard multi-layer perceptron.


Identifying Fault-Prone Software Modules Using Feed-Forward Networks: A Case Study

Neural Information Processing Systems

Functional complexity of a software module can be measured in terms of static complexity metrics of the program text. Classifying software modules, based on their static complexity measures, into different fault-prone categories is a difficult problem in software engineering. This research investigates the applicability of neural network classifiers for identifying fault-prone software modules using a data set from a commercial software system. A preliminary empirical comparison is performed between a minimum distance based Gaussian classifier, a perceptron classifier and a multilayer feed-forward network classifier constructed using a modified Cascade-Correlation algorithm. The modified version of the Cascade-Correlation algorithm constrains the growth of the network size by incorporating a cross-validation check during the output layer training phase.