Node Splitting: A Constructive Algorithm for Feed-Forward Neural Networks
The small network forms an approximate model of a set of training data, and the split creates a larger, more powerful network which is initialised with the approximate solution already found. The insufficiency of the smaller network in modelling the system which generated the data leads to oscillation in those hidden nodes whose weight vectors cover regions in the input space where more detail is required in the model. These nodes are identified and split in two using principal component analysis, allowing the new nodes to cover the two main modes of each oscillating vector. Nodes are selected for splitting either by principal component analysis of the oscillating weight vectors, or by examining the Hessian matrix of second derivatives of the network error with respect to the weights.
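As an illustration of the splitting step, here is a minimal sketch in Python, assuming we have logged a node's recent weight updates; the function split_node, the step size alpha, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def split_node(weight, update_history, alpha=0.5):
    """Split one oscillating hidden node in two along its principal mode."""
    # Covariance of the recent weight updates around their mean.
    centered = update_history - update_history.mean(axis=0)
    cov = centered.T @ centered / len(centered)
    # np.linalg.eigh returns eigenvalues in ascending order, so the last
    # eigenvector is the direction of strongest oscillation.
    eigvals, eigvecs = np.linalg.eigh(cov)
    principal = eigvecs[:, -1] * np.sqrt(max(eigvals[-1], 0.0))
    # The two child nodes straddle the parent along the principal axis,
    # covering the two main modes of the oscillating vector.
    return weight + alpha * principal, weight - alpha * principal

# Toy usage: a node whose updates flip-flop along one dominant direction.
rng = np.random.default_rng(0)
w = rng.normal(size=4)
updates = np.outer(np.sign(np.sin(np.arange(1, 21))), [1.0, -0.5, 0.0, 0.2])
updates += 0.05 * rng.normal(size=updates.shape)
w_plus, w_minus = split_node(w, updates)
```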
Learning How to Teach or Selecting Minimal Surface Data
Geiger, Davi, Pereira, Ricardo A. Marques
Learning a map from an input set to an output set is similar to the problem of reconstructing hypersurfaces from sparse data (Poggio and Girosi, 1990). In this framework, we discuss the problem of automatically selecting "minimal" surface data. The objective is to be able to approximately reconstruct the surface from the selected sparse data. We show that this problem is equivalent to the one of compressing information by data removal and the one of learning how to teach. Our key step is to introduce a process that statistically selects the data according to the model.
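To make the selection idea concrete, here is a minimal sketch assuming a simple Gaussian RBF interpolant as the surface model; the greedy worst-error criterion and all parameter values are illustrative assumptions, not the statistical selection process actually proposed in the paper.

```python
import numpy as np

def rbf_fit_predict(x_keep, y_keep, x_all, width=0.3):
    # Fit an exact Gaussian-RBF interpolant on the kept points,
    # then predict the surface at every sample location.
    K = np.exp(-(x_keep[:, None] - x_keep[None, :])**2 / width**2)
    coeffs = np.linalg.solve(K + 1e-8 * np.eye(len(x_keep)), y_keep)
    K_all = np.exp(-(x_all[:, None] - x_keep[None, :])**2 / width**2)
    return K_all @ coeffs

def select_minimal_data(x, y, tol=0.05):
    keep = [0, len(x) - 1]              # seed with the endpoints
    while True:
        pred = rbf_fit_predict(x[keep], y[keep], x)
        errors = np.abs(pred - y)
        worst = int(np.argmax(errors))
        if errors[worst] < tol:         # surface reconstructed to tolerance
            return sorted(keep)
        keep.append(worst)              # teach the model its worst point

x = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * x) + 0.3 * np.sin(6 * np.pi * x)
kept = select_minimal_data(x, y)
print(f"kept {len(kept)} of {len(x)} samples")
```

The kept subset plays the role of the "minimal" data: discarded points are exactly those the model can already reconstruct, which is the sense in which selection and compression-by-removal coincide.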
Hierarchical Transformation of Space in the Visual System
Pouget, Alexandre, Fisher, Stephen A., Sejnowski, Terrence J.
Neurons encoding simple visual features in area V1 such as orientation, direction of motion and color are organized in retinotopic maps. However, recent physiological experiments have shown that the responses of many neurons in V1 and other cortical areas are modulated by the direction of gaze. We have developed a neural network model of the visual cortex to explore the hypothesis that visual features are encoded in head-centered coordinates at early stages of visual processing. New experiments are suggested for testing this hypothesis using electrical stimulation and psychophysical observations.
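A minimal sketch of the gain-modulation ingredient behind such models: a unit with retinotopic Gaussian tuning whose amplitude is scaled by a roughly linear function of gaze, so that a population of such units can carry head-centered position (retinal position plus gaze). All tuning parameters below are invented for illustration and are not values from the paper.

```python
import numpy as np

def gain_field_response(retinal_pos, gaze, pref_retinal=0.0,
                        sigma=10.0, gaze_slope=0.02, gaze_offset=0.5):
    # Classical retinotopic tuning: Gaussian in retinal position.
    retinal_tuning = np.exp(-(retinal_pos - pref_retinal)**2 / (2 * sigma**2))
    # Multiplicative gain that varies (approximately linearly) with gaze.
    gaze_gain = np.clip(gaze_offset + gaze_slope * gaze, 0.0, None)
    return retinal_tuning * gaze_gain

# The same head-centered target (retinal + gaze = 10 deg) reached two ways
# produces different responses in a single unit; only the population
# pattern disambiguates head-centered position.
print(gain_field_response(retinal_pos=10.0, gaze=0.0))
print(gain_field_response(retinal_pos=0.0, gaze=10.0))
```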
Connectionist Optimisation of Tied Mixture Hidden Markov Models
Renals, Steve, Morgan, Nelson, Bourlard, Hervé, Franco, Horacio, Cohen, Michael
Issues relating to the estimation of hidden Markov model (HMM) local probabilities are discussed. In particular we note the isomorphism of radial basis function (RBF) networks to tied mixture density modelling; additionally we highlight the differences between these methods arising from the different training criteria employed. We present a method in which connectionist training can be modified to resolve these differences and discuss some preliminary experiments. Finally, we discuss some outstanding problems with discriminative training.
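The noted isomorphism can be made concrete in a few lines: a tied mixture emission density p(x | state) = sum_k c[state, k] N(x; mu_k, sigma_k) is computed by an RBF network whose hidden units are the shared Gaussian pool and whose output weights are the state-specific mixture coefficients. The sketch below uses invented one-dimensional parameters purely for illustration.

```python
import numpy as np

def gaussian_pool(x, mus, sigmas):
    # Shared (tied) Gaussian basis functions, evaluated at scalar x.
    norm = 1.0 / (np.sqrt(2 * np.pi) * sigmas)
    return norm * np.exp(-(x - mus)**2 / (2 * sigmas**2))

mus    = np.array([-1.0, 0.0, 1.0])    # tied means, shared by all states
sigmas = np.array([0.5, 0.5, 0.5])     # tied variances
c      = np.array([[0.7, 0.2, 0.1],    # mixture weights, one row per
                   [0.1, 0.3, 0.6]])   # HMM state (rows sum to 1)

x = 0.4
basis = gaussian_pool(x, mus, sigmas)  # RBF hidden-layer activations
state_likelihoods = c @ basis          # RBF linear output layer
print(state_likelihoods)               # p(x | state) for each state
```

The structural identity is exact; what differs, as the abstract notes, is the training criterion: maximum likelihood for the mixture densities versus a discriminative error criterion for connectionist training.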
Improving the Performance of Radial Basis Function Networks by Learning Center Locations
Wettschereck, Dietrich, Dietterich, Thomas
Three methods for improving the performance of (Gaussian) radial basis function (RBF) networks were tested on the NETtalk task. In RBF, a new example is classified by computing its Euclidean distance to a set of centers chosen by unsupervised methods. The application of supervised learning to learn a non-Euclidean distance metric was found to reduce the error rate of RBF networks, while supervised learning of each center's variance resulted in inferior performance. The best improvement in accuracy was achieved by networks called generalized radial basis function (GRBF) networks. In GRBF, the center locations are determined by supervised learning. After training on 1000 words, RBF classifies 56.5% of letters correctly, while GRBF scores 73.4% (on a separate test set). From these and other experiments, we conclude that supervised learning of center locations can be very important for radial basis function learning.
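A minimal sketch of the GRBF training step described above, assuming a squared-error loss and plain gradient descent; the learning rates and toy data are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def grbf_step(X, Y, centers, W, sigma=1.0, lr_w=0.1, lr_c=0.01):
    """One gradient step on both output weights W and center locations."""
    # Hidden activations: Gaussian of squared distance to each center.
    diff = X[:, None, :] - centers[None, :, :]        # (n, k, d)
    d2 = (diff**2).sum(-1)                            # (n, k)
    H = np.exp(-d2 / (2 * sigma**2))                  # (n, k)
    out = H @ W                                       # (n, m)
    err = out - Y                                     # squared-error gradient
    # Gradient w.r.t. output weights (plain RBF training stops here).
    grad_W = H.T @ err / len(X)
    # Gradient w.r.t. centers: chain rule through the Gaussian units,
    # dH/dc = H * (x - c) / sigma^2.
    g = (err @ W.T) * H / sigma**2                    # (n, k)
    grad_C = (g[:, :, None] * diff).sum(0) / len(X)   # (k, d)
    return W - lr_w * grad_W, centers - lr_c * grad_C

# Toy usage on a 2-D binary problem with hypothetical dimensions.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
Y = (X[:, :1] > 0).astype(float)
C = rng.normal(size=(5, 2))
W = 0.1 * rng.normal(size=(5, 1))
for _ in range(200):
    W, C = grbf_step(X, Y, C, W)
```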
Recurrent Networks and NARMA Modeling
Connor, Jerome, Atlas, Les E., Martin, Douglas R.
There exist large classes of time series, such as those with nonlinear moving average components, that are not well modeled by feedforward networks or linear models, but can be modeled by recurrent networks. We show that recurrent neural networks are a type of nonlinear autoregressive-moving average (NARMA) model. Practical applicability is demonstrated by the results of a competition sponsored by the Puget Sound Power and Light Company, where the recurrent networks gave the best performance on electric load forecasting.
1 Introduction
This paper will concentrate on identifying types of time series for which a recurrent network provides a significantly better model, and corresponding prediction, than a feedforward network. Our main interest is in discrete time series that are parsimoniously modeled by a simple recurrent network, but for which a feedforward neural network is highly non-parsimonious, by virtue of requiring an infinite number of past observations as input to achieve the same accuracy in prediction. Our approach is to consider predictive neural networks as stochastic models.
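A minimal sketch of the correspondence, assuming a tiny one-hidden-layer nonlinearity with random weights: feeding the model's own past prediction errors e = y - yhat back as inputs gives yhat_t = f(y_{t-1..t-p}, e_{t-1..t-q}), i.e. a NARMA(p, q) predictor realized as a recurrent network. All weights and model orders here are illustrative.

```python
import numpy as np

def narma_predict(y, p=2, q=2, hidden=8, seed=0):
    """One-step-ahead NARMA(p, q) prediction with error feedback."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(hidden, p + q))
    w2 = rng.normal(scale=0.5, size=hidden)
    yhat = np.zeros_like(y)
    e = np.zeros_like(y)                 # moving-average feedback state
    for t in range(max(p, q), len(y)):
        z = np.concatenate([y[t-p:t], e[t-q:t]])
        yhat[t] = w2 @ np.tanh(W1 @ z)   # f(past outputs, past errors)
        e[t] = y[t] - yhat[t]            # error fed back at later steps
    return yhat, e
```

The recurrence enters through e: a feedforward predictor denied this feedback must approximate the moving-average component from raw past observations alone, which is exactly the non-parsimonious case the introduction describes.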