Goto

Collaborating Authors

 Perceptrons


Phonetic Classification and Recognition Using the Multi-Layer Perceptron

Neural Information Processing Systems

In this paper, we will describe several extensions to our earlier work, utilizing asegment-based approach. We will formulate our segmental framework and report our study on the use of multi-layer perceptrons for detection and classification of phonemes. We will also examine the outputs of the network, and compare the network performance with other classifiers. Our investigation is performed within a set of experiments that attempts to recognize 38 vowels and consonants in American English independent of speaker.


A Cost Function for Internal Representations

Neural Information Processing Systems

We introduce a cost function for learning in feed-forward neural networks which is an explicit function of the internal representation in addition to the weights. The learning problem can then be formulated as two simple perceptrons and a search for internal representations. Back-propagation is recovered as a limit. The frequency of successful solutions is better for this algorithm than for back-propagation when weights and hidden units are updated on the same timescale i.e. once every learning step. 1 INTRODUCTION In their review of back-propagation in layered networks, Rumelhart et al. (1986) describe the learning process in terms of finding good "internal representations" of the input patterns on the hidden units. However, the search for these representations is an indirect one, since the variables which are adjusted in its course are the connection weights, not the activations of the hidden units themselves when specific input patterns are fed into the input layer. Rather, the internal representations are represented implicitly in the connection weight values. More recently, Grossman et al. (1988 and 1989)1 suggested a way in which the search for internal representations could be made much more explicit.


The Perceptron Algorithm Is Fast for Non-Malicious Distributions

Neural Information Processing Systems

Interest in this algorithm waned in the 1970's after it was emphasized[Minsky and Papert, 1969] (1) that the class of problems solvable by a single half space was limited, and (2) that the Perceptron algorithm, although converging in finite time, did not converge in polynomial time. In the 1980's, however, it has become evident that there is no hope of providing a learning algorithm which can learn arbitrary functions in polynomial time and much research has thus been restricted to algorithms which learn a function drawn from a particular class of functions. Moreover, learning theory has focused on protocols like that of [Valiant, 1984] where we seek to classify, not a fixed set of examples, but examples drawn from a probability distribution. This allows a natural notion of "generalization". There are very few classes which have yet been proven learnable in polynomial time, and one of these is the class of half spaces. Thus there is considerable theoretical interest now in studying the problem of learning a single half space, and so it is natural to reexamine the Percept ron algorithm within the formalism of Valiant.



A Cost Function for Internal Representations

Neural Information Processing Systems

We introduce a cost function for learning in feed-forward neural networks which is an explicit function of the internal representation in addition to the weights. The learning problem can then be formulated as two simple perceptrons and a search for internal representations. Back-propagation is recovered as a limit. The frequency of successful solutions is better for this algorithm than for back-propagation when weights and hidden units are updated on the same timescale i.e. once every learning step. 1 INTRODUCTION In their review of back-propagation in layered networks, Rumelhart et al. (1986) describe the learning process in terms of finding good "internal representations" of the input patterns on the hidden units. However, the search for these representations is an indirect one, since the variables which are adjusted in its course are the connection weights, not the activations of the hidden units themselves when specific input patterns are fed into the input layer. Rather, the internal representations are represented implicitly in the connection weight values. More recently, Grossman et al. (1988 and 1989)1 suggested a way in which the search for internal representations could be made much more explicit.


The Perceptron Algorithm Is Fast for Non-Malicious Distributions

Neural Information Processing Systems

Interest in this algorithm waned in the 1970's after it was emphasized[Minsky and Papert, 1969] (1) that the class of problems solvable by a single half space was limited, and (2) that the Perceptron algorithm, although converging in finite time, did not converge in polynomial time. In the 1980's, however, it has become evident that there is no hope of providing a learning algorithm which can learn arbitrary functions in polynomial time and much research has thus been restricted to algorithms which learn a function drawn from a particular class of functions. Moreover, learning theory has focused on protocols like that of [Valiant, 1984] where we seek to classify, not a fixed set of examples, but examples drawn from a probability distribution. This allows a natural notion of "generalization". There are very few classes which have yet been proven learnable in polynomial time, and one of these is the class of half spaces. Thus there is considerable theoretical interest now in studying the problem of learning a single half space, and so it is natural to reexamine the Percept ron algorithm within the formalism of Valiant.


The CHIR Algorithm for Feed Forward Networks with Binary Weights

Neural Information Processing Systems

A new learning algorithm, Learning by Choice of Internal Represetations (CHIR), was recently introduced. Whereas many algorithms reduce the learning process to minimizing a cost function over the weights, our method treats the internal representations as the fundamental entities to be determined. The algorithm applies a search procedure in the space of internal representations, and a cooperative adaptation of the weights (e.g. by using the perceptron learning rule). Since the introduction of its basic, single output version, the CHIR algorithm was generalized to train any feed forward network of binary neurons. Here we present the generalised version of the CHIR algorithm, and further demonstrate its versatility by describing how it can be modified in order to train networks with binary ( 1) weights. Preliminary tests of this binary version on the random teacher problem are also reported.



A Cost Function for Internal Representations

Neural Information Processing Systems

We introduce a cost function for learning in feed-forward neural networks which is an explicit function of the internal representation inaddition to the weights. The learning problem can then be formulated as two simple perceptrons and a search for internal representations. Back-propagation is recovered as a limit. The frequency of successful solutions is better for this algorithm than for back-propagation when weights and hidden units are updated on the same timescale i.e. once every learning step. 1 INTRODUCTION In their review of back-propagation in layered networks, Rumelhart et al. (1986) describe the learning process in terms of finding good "internal representations" of the input patterns on the hidden units. However, the search for these representations isan indirect one, since the variables which are adjusted in its course are the connection weights, not the activations of the hidden units themselves when specific input patterns are fed into the input layer. Rather, the internal representations are represented implicitly in the connection weight values. More recently, Grossman et al. (1988 and 1989)1 suggested a way in which the search for internal representations could be made much more explicit.


The Perceptron Algorithm Is Fast for Non-Malicious Distributions

Neural Information Processing Systems

Interest in this algorithm waned in the 1970's after it was emphasized[Minsky andPapert, 1969] (1) that the class of problems solvable by a single half space was limited, and (2) that the Perceptron algorithm, although converging infinite time, did not converge in polynomial time. In the 1980's, however, it has become evident that there is no hope of providing a learning algorithm which can learn arbitrary functions in polynomial time and much research has thus been restricted to algorithms which learn a function drawn from a particular class of functions. Moreover, learning theory has focused on protocols like that of [Valiant, 1984] where we seek to classify, not a fixed set of examples, but examples drawn from a probability distribution. This allows a natural notion of "generalization" . There are very few classes which have yet been proven learnable in polynomial time, and one of these is the class of half spaces. Thus there is considerable theoretical interest now in studying the problem of learning a single half space, and so it is natural to reexamine the Percept ron algorithm within the formalism of Valiant. The Perceptron Algorithm Is Fast for Non-Malicious Distributions 677 In Valiant's protocol, a class of functions is called learnable if there is a learning algorithm which works in polynomial time independent of the distribution D generating the examples. Under this definition the Perceptron learning algorithm is not a polynomial time learning algorithm. However we will argue in section 2 that this definition is too restrictive.