AITopics

Country: North America > United States > Massachusetts (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Krogh, Anders, Thorbergsson, C. I., Hertz, John A.

A Cost Function for Internal Representations

We introduce a cost function for learning in feed-forward neural networks which is an explicit function of the internal representation in addition to the weights. The learning problem can then be formulated as two simple perceptrons and a search for internal representations. Back-propagation is recovered as a limit. The frequency of successful solutions is better for this algorithm than for back-propagation when weights and hidden units are updated on the same timescale i.e. once every learning step. 1 INTRODUCTION In their review of back-propagation in layered networks, Rumelhart et al. (1986) describe the learning process in terms of finding good "internal representations" of the input patterns on the hidden units. However, the search for these representations is an indirect one, since the variables which are adjusted in its course are the connection weights, not the activations of the hidden units themselves when specific input patterns are fed into the input layer. Rather, the internal representations are represented implicitly in the connection weight values. More recently, Grossman et al. (1988 and 1989)1 suggested a way in which the search for internal representations could be made much more explicit.

algorithm, cost function, internal representation, (12 more...)

Country:

Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.36)

The Perceptron Algorithm Is Fast for Non-Malicious Distributions

Baum, Eric B.

Interest in this algorithm waned in the 1970's after it was emphasized[Minsky and Papert, 1969] (1) that the class of problems solvable by a single half space was limited, and (2) that the Perceptron algorithm, although converging in finite time, did not converge in polynomial time. In the 1980's, however, it has become evident that there is no hope of providing a learning algorithm which can learn arbitrary functions in polynomial time and much research has thus been restricted to algorithms which learn a function drawn from a particular class of functions. Moreover, learning theory has focused on protocols like that of [Valiant, 1984] where we seek to classify, not a fixed set of examples, but examples drawn from a probability distribution. This allows a natural notion of "generalization". There are very few classes which have yet been proven learnable in polynomial time, and one of these is the class of half spaces. Thus there is considerable theoretical interest now in studying the problem of learning a single half space, and so it is natural to reexamine the Percept ron algorithm within the formalism of Valiant.

algorithm, perceptron algorithm, probability, (14 more...)

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Bourlard, Hervé, Morgan, Nelson

A Continuous Speech Recognition System Embedding MLP into HMM

We are developing a phoneme based.

mlp, probability, recognition, (13 more...)

Country:

North America > United States > New York (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(3 more...)

Krogh, Anders, Thorbergsson, C. I., Hertz, John A.

A Cost Function for Internal Representations

We introduce a cost function for learning in feed-forward neural networks which is an explicit function of the internal representation in addition to the weights. The learning problem can then be formulated as two simple perceptrons and a search for internal representations. Back-propagation is recovered as a limit. The frequency of successful solutions is better for this algorithm than for back-propagation when weights and hidden units are updated on the same timescale i.e. once every learning step. 1 INTRODUCTION In their review of back-propagation in layered networks, Rumelhart et al. (1986) describe the learning process in terms of finding good "internal representations" of the input patterns on the hidden units. However, the search for these representations is an indirect one, since the variables which are adjusted in its course are the connection weights, not the activations of the hidden units themselves when specific input patterns are fed into the input layer. Rather, the internal representations are represented implicitly in the connection weight values. More recently, Grossman et al. (1988 and 1989)1 suggested a way in which the search for internal representations could be made much more explicit.

algorithm, cost function, internal representation, (12 more...)

Country:

Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.36)

The Perceptron Algorithm Is Fast for Non-Malicious Distributions

Baum, Eric B.

Interest in this algorithm waned in the 1970's after it was emphasized[Minsky and Papert, 1969] (1) that the class of problems solvable by a single half space was limited, and (2) that the Perceptron algorithm, although converging in finite time, did not converge in polynomial time. In the 1980's, however, it has become evident that there is no hope of providing a learning algorithm which can learn arbitrary functions in polynomial time and much research has thus been restricted to algorithms which learn a function drawn from a particular class of functions. Moreover, learning theory has focused on protocols like that of [Valiant, 1984] where we seek to classify, not a fixed set of examples, but examples drawn from a probability distribution. This allows a natural notion of "generalization". There are very few classes which have yet been proven learnable in polynomial time, and one of these is the class of half spaces. Thus there is considerable theoretical interest now in studying the problem of learning a single half space, and so it is natural to reexamine the Percept ron algorithm within the formalism of Valiant.

algorithm, perceptron algorithm, probability, (14 more...)

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

The CHIR Algorithm for Feed Forward Networks with Binary Weights

Grossman, Tal

A new learning algorithm, Learning by Choice of Internal Represetations (CHIR), was recently introduced. Whereas many algorithms reduce the learning process to minimizing a cost function over the weights, our method treats the internal representations as the fundamental entities to be determined. The algorithm applies a search procedure in the space of internal representations, and a cooperative adaptation of the weights (e.g. by using the perceptron learning rule). Since the introduction of its basic, single output version, the CHIR algorithm was generalized to train any feed forward network of binary neurons. Here we present the generalised version of the CHIR algorithm, and further demonstrate its versatility by describing how it can be modified in order to train networks with binary ( 1) weights. Preliminary tests of this binary version on the random teacher problem are also reported.

algorithm, feed forward network, internal representation, (10 more...)

Country:

North America > United States > New York (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > France (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Bourlard, Hervé, Morgan, Nelson

A Continuous Speech Recognition System Embedding MLP into HMM

We are developing a phoneme based.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Country:

North America > United States > California (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(3 more...)

Krogh, Anders, Thorbergsson, C. I., Hertz, John A.

A Cost Function for Internal Representations

We introduce a cost function for learning in feed-forward neural networks which is an explicit function of the internal representation inaddition to the weights. The learning problem can then be formulated as two simple perceptrons and a search for internal representations. Back-propagation is recovered as a limit. The frequency of successful solutions is better for this algorithm than for back-propagation when weights and hidden units are updated on the same timescale i.e. once every learning step. 1 INTRODUCTION In their review of back-propagation in layered networks, Rumelhart et al. (1986) describe the learning process in terms of finding good "internal representations" of the input patterns on the hidden units. However, the search for these representations isan indirect one, since the variables which are adjusted in its course are the connection weights, not the activations of the hidden units themselves when specific input patterns are fed into the input layer. Rather, the internal representations are represented implicitly in the connection weight values. More recently, Grossman et al. (1988 and 1989)1 suggested a way in which the search for internal representations could be made much more explicit.

artificial intelligence, internal representation, machine learning, (14 more...)

Country: Europe > Denmark (0.16)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.36)

The Perceptron Algorithm Is Fast for Non-Malicious Distributions

Baum, Eric B.

Interest in this algorithm waned in the 1970's after it was emphasized[Minsky andPapert, 1969] (1) that the class of problems solvable by a single half space was limited, and (2) that the Perceptron algorithm, although converging infinite time, did not converge in polynomial time. In the 1980's, however, it has become evident that there is no hope of providing a learning algorithm which can learn arbitrary functions in polynomial time and much research has thus been restricted to algorithms which learn a function drawn from a particular class of functions. Moreover, learning theory has focused on protocols like that of [Valiant, 1984] where we seek to classify, not a fixed set of examples, but examples drawn from a probability distribution. This allows a natural notion of "generalization" . There are very few classes which have yet been proven learnable in polynomial time, and one of these is the class of half spaces. Thus there is considerable theoretical interest now in studying the problem of learning a single half space, and so it is natural to reexamine the Percept ron algorithm within the formalism of Valiant. The Perceptron Algorithm Is Fast for Non-Malicious Distributions 677 In Valiant's protocol, a class of functions is called learnable if there is a learning algorithm which works in polynomial time independent of the distribution D generating the examples. Under this definition the Perceptron learning algorithm is not a polynomial time learning algorithm. However we will argue in section 2 that this definition is too restrictive.

algorithm, artificial intelligence, machine learning, (17 more...)

Country: North America > United States (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)