

The Cascade-Correlation Learning Architecture

Neural Information Processing Systems

Cascade-Correlation is a new architecture and supervised learning algorithm for artificial neural networks. Instead of just adjusting the weights in a network of fixed topology, Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multi-layer structure. Once a new hidden unit has been added to the network, its input-side weights are frozen. This unit then becomes a permanent feature-detector in the network, available for producing outputs or for creating other, more complex feature detectors. The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, and the network determines its own size and topology.
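To make the growth loop concrete, here is a minimal sketch, assuming a tanh candidate unit trained by (approximate) gradient ascent on the covariance between its output and the residual error; the helper names, toy data, step sizes, and the pool of a single candidate are illustrative assumptions, not Fahlman and Lebiere's exact procedure.

```python
import numpy as np

def train_outputs(X, y, n_steps=200, lr=0.1):
    """Fit a linear output layer on the current feature set X (hypothetical helper)."""
    W = np.zeros((X.shape[1], y.shape[1]))
    for _ in range(n_steps):
        err = X @ W - y
        W -= lr * X.T @ err / len(X)
    return W, X @ W - y

def train_candidate(X, residual, n_steps=200, lr=0.1):
    """Train one candidate unit to maximize the covariance between its
    output and the residual network error; its weights are then frozen."""
    v = np.random.randn(X.shape[1]) * 0.1
    for _ in range(n_steps):
        h = np.tanh(X @ v)                       # candidate activation
        hc = h - h.mean()
        rc = residual - residual.mean(axis=0)
        s = np.sign(hc @ rc)                     # sign of covariance per output
        # approximate gradient ascent on S = sum_o |cov(h, e_o)|
        grad = ((rc * s).sum(axis=1) * (1 - h**2)) @ X / len(X)
        v += lr * grad
    return v                                     # input-side weights, now frozen

X = np.random.randn(64, 3); y = np.random.randn(64, 2)    # toy data
F = np.hstack([X, np.ones((len(X), 1))])                  # inputs (+ bias)
for _ in range(3):                                        # grow 3 hidden units
    W, residual = train_outputs(F, y)
    v = train_candidate(F, residual)
    F = np.hstack([F, np.tanh(F @ v)[:, None]])           # unit becomes a new feature
```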



Designing Application-Specific Neural Networks Using the Genetic Algorithm

Neural Information Processing Systems

With the growing interest in the practical use of neural networks, addressing the problem of customizing networks for specific applications is becoming increasingly critical. It has repeatedly been observed that different network structures and learning parameters can substantially affect performance. Such important aspects of neural network applications as generalization, learning speed, connectivity and tolerance to network damage are strongly related to the choice of network architecture. Yet there are few analytic results, and few heuristics, that can help the application developer design an appropriate network. We have been investigating the use of the genetic algorithm (Goldberg, 1989; Holland, 1975) for designing application-specific neural networks (Harp, Samad and Guha, 1989ab). In our approach, the genetic algorithm is used to evolve appropriate network structures and values of learning parameters.
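As a rough illustration of this approach, the sketch below evolves a toy network "blueprint" (hidden-layer size and learning rate) with a generic genetic algorithm; the encoding, the operators, and the stand-in evaluate() fitness are assumptions, since the paper's fitness would come from actually training the decoded network.

```python
import random

def evaluate(genome):
    hidden, lr = genome
    # Stand-in fitness: in practice this would be trained-network performance
    # (accuracy, learning speed, connectivity cost); here a toy surrogate.
    return -abs(hidden - 12) - abs(lr - 0.05) * 100

def mutate(genome):
    hidden, lr = genome
    if random.random() < 0.3:
        hidden = max(1, hidden + random.choice([-2, -1, 1, 2]))
    if random.random() < 0.3:
        lr = min(1.0, max(1e-4, lr * random.choice([0.5, 2.0])))
    return (hidden, lr)

def crossover(a, b):
    return (a[0], b[1])                          # swap fields between parents

pop = [(random.randint(1, 32), 10 ** random.uniform(-3, 0)) for _ in range(20)]
for gen in range(30):
    pop.sort(key=evaluate, reverse=True)
    parents = pop[:10]                           # truncation selection
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(10)]
print("best blueprint:", max(pop, key=evaluate))
```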


Analysis of Linsker's Simulations of Hebbian Rules

Neural Information Processing Systems

Linsker has reported the development of centre-surround receptive fields and oriented receptive fields in simulations of a Hebb-type equation in a linear network. The dynamics of the learning rule are analysed in terms of the eigenvectors of the covariance matrix of cell activities. Analytic and computational results for Linsker's covariance matrices, and some general theorems, lead to an explanation of the emergence of centre-surround and certain oriented structures. Linsker [Linsker, 1986; Linsker, 1988] has studied by simulation the evolution of weight vectors under a Hebb-type teacherless learning rule in a feed-forward linear network. The equation for the evolution of the weight vector w of a single neuron, derived by ensemble averaging the Hebbian rule over the statistics of the input patterns, is dw_i/dt = k_1 + Σ_j (Q_ij + k_2) w_j, where Q is the covariance matrix of the cell activities and k_1, k_2 are constants of the learning rule.
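The analysis hinges on the eigenvectors of that covariance matrix: for linear dynamics of this form, the fastest-growing weight pattern is the leading eigenvector of Q + k_2 J (J the all-ones matrix; k_1 only adds a constant drive). A small numerical sketch, with a Gaussian covariance standing in for Linsker's layer-to-layer covariances:

```python
import numpy as np

n = 21                                            # 1-D array of input cells
x = np.arange(n)
Q = np.exp(-0.5 * (x[:, None] - x[None, :])**2 / 3.0**2)  # assumed covariance
k2 = -0.2
A = Q + k2 * np.ones((n, n))                      # matrix governing the dynamics

vals, vecs = np.linalg.eigh(A)
order = np.argsort(vals)[::-1]
print("leading eigenvalue:", vals[order[0]])
# The fastest-growing weight pattern is the leading eigenvector; for suitable
# k2 it acquires a sign change, i.e. a centre-surround-like profile.
print("leading eigenvector:", np.round(vecs[:, order[0]], 2))
```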



The Perceptron Algorithm Is Fast for Non-Malicious Distributions

Neural Information Processing Systems

Interest in this algorithm waned in the 1970's after it was emphasized [Minsky and Papert, 1969] (1) that the class of problems solvable by a single half space was limited, and (2) that the Perceptron algorithm, although converging in finite time, did not converge in polynomial time. In the 1980's, however, it has become evident that there is no hope of providing a learning algorithm which can learn arbitrary functions in polynomial time, and much research has thus been restricted to algorithms which learn a function drawn from a particular class of functions. Moreover, learning theory has focused on protocols like that of [Valiant, 1984] where we seek to classify, not a fixed set of examples, but examples drawn from a probability distribution. This allows a natural notion of "generalization". There are very few classes which have yet been proven learnable in polynomial time, and one of these is the class of half spaces. Thus there is considerable theoretical interest now in studying the problem of learning a single half space, and so it is natural to reexamine the Perceptron algorithm within the formalism of Valiant. In Valiant's protocol, a class of functions is called learnable if there is a learning algorithm which works in polynomial time independent of the distribution D generating the examples. Under this definition the Perceptron learning algorithm is not a polynomial time learning algorithm. However we will argue in section 2 that this definition is too restrictive.
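For reference, the Perceptron algorithm itself is just a mistake-driven update on examples drawn from the distribution; a minimal sketch, with a toy Gaussian (non-malicious) distribution and hidden target half space as assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = rng.normal(size=5)                      # hidden target half space

def draw_example():
    x = rng.normal(size=5)                       # benign (non-malicious) distribution
    return x, np.sign(w_true @ x)

w = np.zeros(5)
mistakes = 0
for _ in range(10000):                           # stream of sampled examples
    x, y = draw_example()
    if np.sign(w @ x) != y:                      # mistake-driven update
        w += y * x
        mistakes += 1
print("mistakes:", mistakes)
```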


Learning in Higher-Order "Artificial Dendritic Trees"

Neural Information Processing Systems

The computational territory between the linearly summing McCulloch-Pitts neuron and the nonlinear differential equations of Hodgkin & Huxley is relatively sparsely populated. Connectionists use variants of the former and computational neuroscientists struggle with the exploding parameter spaces provided by the latter. However, evidence from biophysical simulations suggests that the voltage transfer properties of synapses, spines and dendritic membranes involve many detailed nonlinear interactions, not just a squashing function at the cell body. Real neurons may indeed be higher-order nets. For the computationally-minded, higher-order interactions mean, first of all, quadratic terms.
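A minimal sketch of such a second-order ("sigma-pi") unit, whose output depends on pairwise products of inputs as well as linear terms; the weights and squashing function below are illustrative assumptions:

```python
import numpy as np

def higher_order_unit(x, w, W):
    """y = squash(w.x + x.W.x): linear plus quadratic interaction terms."""
    return np.tanh(w @ x + x @ W @ x)

x = np.random.randn(4)           # presynaptic activities
w = np.random.randn(4) * 0.5     # first-order weights
W = np.random.randn(4, 4) * 0.1  # second-order (pairwise) weights
print(higher_order_unit(x, w, W))
```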


Connectionist Architectures for Multi-Speaker Phoneme Recognition

Neural Information Processing Systems

We present a number of Time-Delay Neural Network (TDNN) based architectures for multi-speaker phoneme recognition (the /b,d,g/ task). We use speech of two females and four males to compare the performance of the various architectures against a baseline recognition rate of 95.9% for a single TDNN on the six-speaker /b,d,g/ task.
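For orientation, a single time-delay layer applies shared weights to a short sliding window of input frames, so the same feature detectors fire at every time shift; a minimal sketch, where the window size, frame dimension and unit count are assumptions rather than the paper's configuration:

```python
import numpy as np

def tdnn_layer(frames, W, b, delay=3):
    """frames: (T, F) spectral frames; W: (H, delay*F); returns (T-delay+1, H)."""
    T, F = frames.shape
    windows = np.stack([frames[t:t + delay].ravel()      # concatenate delayed frames
                        for t in range(T - delay + 1)])
    return np.tanh(windows @ W.T + b)

frames = np.random.randn(15, 16)                 # 15 frames of 16 coefficients
W = np.random.randn(8, 3 * 16) * 0.1             # 8 hidden units, 3-frame window
h = tdnn_layer(frames, W, np.zeros(8))
print(h.shape)                                   # (13, 8): shift-invariant features
```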


Development and Regeneration of Eye-Brain Maps: A Computational Model

Neural Information Processing Systems

We outline a computational model of the development and regeneration of specific eye-brain circuits. The model comprises a self-organizing map-forming network which uses local Hebb rules.
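A toy sketch of such a map-forming network under a local Hebb rule, in the spirit of Willshaw-von der Malsburg-style models; the sizes, the Gaussian retinal activity patterns, the lateral spread, and the learning rate are all assumptions, not the paper's exact equations:

```python
import numpy as np

n_retina, n_tectum = 20, 20
rng = np.random.default_rng(0)
W = rng.random((n_tectum, n_retina)) * 0.1       # retino-tectal synapse strengths

def spread(v, k=2):
    """Local excitatory interaction among neighbouring tectal cells."""
    return np.convolve(v, np.ones(2 * k + 1), mode="same")

for step in range(2000):
    centre = rng.integers(n_retina)
    r = np.exp(-0.5 * (np.arange(n_retina) - centre) ** 2 / 2.0)  # retinal blob
    t = spread(W @ r)                            # tectal response with lateral spread
    W += 0.01 * np.outer(t, r)                   # local Hebbian strengthening
    W /= W.sum(axis=1, keepdims=True)            # normalization (synaptic competition)

# Neighbouring tectal cells receive correlated reinforcement, so each row of W
# tends to concentrate on a localized retinal region, forming an ordered map.
print(np.argmax(W, axis=1))
```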