Denker, John S.
Learning Curves: Asymptotic Values and Rate of Convergence
Cortes, Corinna, Jackel, L. D., Solla, Sara A., Vapnik, Vladimir, Denker, John S.
Training classifiers on large databases is computationally demanding. It is desirable to develop efficient procedures for a reliable prediction of a classifier's suitability for implementing a given task, so that resources can be assigned to the most promising candidates or freed for exploring new classifier candidates. We propose such a practical and principled predictive method. Practical because it avoids the costly procedure of training poor classifiers on the whole training set, and principled because of its theoretical foundation. The effectiveness of the proposed procedure is demonstrated for both single- and multi-layer networks.
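As an illustration of the kind of extrapolation such a prediction relies on, here is a minimal sketch (not the paper's exact procedure): fit an assumed power-law learning curve to test errors measured on small training subsets and read off the predicted asymptotic error. The parametric form, data values, and initial guesses below are hypothetical.

```python
# Minimal sketch: extrapolate a learning curve from small training subsets.
# Assumed parametric form E(l) = a + b / l**alpha; all numbers are made up.
import numpy as np
from scipy.optimize import curve_fit

def learning_curve(l, a, b, alpha):
    """Asymptotic error a, amplitude b, convergence-rate exponent alpha."""
    return a + b / l**alpha

# Hypothetical measurements: training-set sizes and observed test errors.
sizes = np.array([100, 200, 400, 800, 1600], dtype=float)
errors = np.array([0.21, 0.16, 0.13, 0.11, 0.10])

(a, b, alpha), _ = curve_fit(learning_curve, sizes, errors, p0=(0.05, 5.0, 0.5))
print(f"predicted asymptotic test error: {a:.3f} (rate exponent {alpha:.2f})")
```

Candidate classifiers could then be ranked by the fitted asymptote `a` before any of them is trained on the full database.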
Efficient Pattern Recognition Using a New Transformation Distance
Simard, Patrice, LeCun, Yann, Denker, John S.
Memory-based classification algorithms such as radial basis functions or K-nearest neighbors typically rely on simple distances (Euclidean, dot product ...), which are not particularly meaningful on pattern vectors. More complex, better suited distance measures are often expensive and rather ad hoc (elastic matching, deformable templates). We propose a new distance measure which (a) can be made locally invariant to any set of transformations of the input and (b) can be computed efficiently. We tested the method on large handwritten character databases provided by the Post Office and the NIST. Using invariances with respect to translation, rotation, scaling, shearing and line thickness, the method consistently outperformed all other systems tested on the same databases.
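To make the idea of a locally invariant distance concrete, here is a minimal one-sided sketch: the distance from a pattern to the tangent plane of a prototype's transformation manifold, computed by least squares. It assumes the tangent vectors are already available (the paper derives them from small image transformations; here they are arbitrary placeholders).

```python
# Minimal sketch of a one-sided tangent-style distance; tangent vectors
# and patterns below are random placeholders, not real image data.
import numpy as np

def tangent_distance(x, p, tangents):
    """Distance from x to the plane {p + tangents @ a} spanned at prototype p."""
    # Least-squares projection of (x - p) onto the span of the tangent vectors.
    a, *_ = np.linalg.lstsq(tangents, x - p, rcond=None)
    residual = x - p - tangents @ a
    return np.linalg.norm(residual)

rng = np.random.default_rng(0)
x = rng.normal(size=256)        # test pattern (e.g. a 16x16 image, flattened)
p = rng.normal(size=256)        # stored prototype
T = rng.normal(size=(256, 5))   # 5 tangent vectors (e.g. translation, rotation, ...)
print(tangent_distance(x, p, T))
```

Small transformations of the prototype then cost (approximately) nothing, which is what makes the measure locally invariant.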
Multi-Digit Recognition Using a Space Displacement Neural Network
Matan, Ofer, Burges, Christopher J. C., LeCun, Yann, Denker, John S.
We present a feed-forward network architecture for recognizing an unconstrained handwritten multi-digit string. This is an extension of previous work on recognizing isolated digits. In this architecture a single digit recognizer is replicated over the input. The output layer of the network is coupled to a Viterbi alignment module that chooses the best interpretation of the input. Training errors are propagated through the Viterbi module.
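For readers unfamiliar with the alignment step, here is a minimal generic Viterbi sketch (not the paper's exact module): given log-scores from a recognizer replicated at each horizontal position, it picks the best class sequence under a simple transition-score matrix. The lattice sizes and scores are hypothetical.

```python
# Minimal generic Viterbi decoder over a (positions x classes) score lattice.
import numpy as np

def viterbi(emission_scores, transition_scores):
    """emission_scores: (T, C) log-scores; transition_scores: (C, C)."""
    T, C = emission_scores.shape
    delta = np.empty((T, C))
    backptr = np.zeros((T, C), dtype=int)
    delta[0] = emission_scores[0]
    for t in range(1, T):
        # Best previous class for each current class.
        scores = delta[t - 1][:, None] + transition_scores
        backptr[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + emission_scores[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

# Hypothetical 4-position, 10-class lattice with uniform transitions.
rng = np.random.default_rng(1)
print(viterbi(rng.normal(size=(4, 10)), np.zeros((10, 10))))
```

In the architecture described above, gradients from the chosen alignment are propagated back through this module into the replicated recognizer.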
Transforming Neural-Net Output Levels to Probability Distributions
Denker, John S., LeCun, Yann
The outputs of a typical multi-output classification network do not satisfy the axioms of probability; probabilities should be positive and sum to one. This problem can be solved by treating the trained network as a preprocessor that produces a feature vector that can be further processed, for instance by classical statistical estimation techniques. It is particularly useful to combine these two ideas: we implement the ideas of section 1 using Parzen windows, where the shape and relative size of each window is computed using the ideas of section 2. This allows us to make contact between important theoretical ideas (e.g. the ensemble formalism) and practical techniques. Our results also shed new light on and generalize the well-known "softmax" scheme.

1 Distribution of Categories in Output Space
In many neural-net applications, it is crucial to produce a set of C numbers that serve as estimates of the probability of C mutually exclusive outcomes. For example, in speech recognition, these numbers represent the probability of C different phonemes; the probabilities of successive segments can be combined using a Hidden Markov Model.
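As background, here is a minimal illustration of the baseline "softmax" normalization that the paper generalizes: raw output-unit activations are mapped to positive numbers that sum to one and can be read as class probabilities. The activation values are hypothetical.

```python
# Minimal softmax illustration: raw outputs -> numbers that satisfy the
# axioms of probability (non-negative, summing to one).
import numpy as np

def softmax(outputs):
    z = outputs - outputs.max()   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

raw = np.array([2.0, -1.0, 0.3])  # hypothetical output-unit activations
probs = softmax(raw)
print(probs, probs.sum())          # positive values, total 1.0
```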
Handwritten Digit Recognition with a Back-Propagation Network
LeCun, Yann, Boser, Bernhard E., Denker, John S., Henderson, Donnie, Howard, R. E., Hubbard, Wayne E., Jackel, Lawrence D.
We present an application of back-propagation networks to handwritten digit recognition. Minimal preprocessing of the data was required, but the architecture of the network was highly constrained and specifically designed for the task. The input of the network consists of normalized images of isolated digits. The method has a 1% error rate and about a 9% reject rate on zipcode digits provided by the U.S. Postal Service.

1 INTRODUCTION
The main point of this paper is to show that large back-propagation (BP) networks can be applied to real image-recognition problems without a large, complex preprocessing stage requiring detailed engineering. Unlike most previous work on the subject (Denker et al., 1989), the learning network is directly fed with images, rather than feature vectors, thus demonstrating the ability of BP networks to deal with large amounts of low-level information. Previous work performed on simple digit images (Le Cun, 1989) showed that the architecture of the network strongly influences the network's generalization ability. Good generalization can only be obtained by designing a network architecture that contains a certain amount of a priori knowledge about the problem. The basic design principle is to minimize the number of free parameters that must be determined by the learning algorithm, without overly reducing the computational power of the network.
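To make the "few free parameters via a priori knowledge" principle concrete, here is a minimal sketch of the weight-sharing idea behind such a constrained layer: one small kernel is replicated across the whole image (a convolution), so the layer has only kernel-size-squared free parameters instead of one weight per (input pixel, output unit) pair. The image size, kernel size, and values are hypothetical, not the paper's actual architecture.

```python
# Minimal weight-sharing sketch: a single 5x5 kernel (25 free parameters)
# replicated over a 16x16 input to produce one feature map.
import numpy as np

def shared_weight_layer(image, kernel):
    """Valid 2-D correlation of a single feature map with a squashing unit."""
    H, W = image.shape
    k = kernel.shape[0]
    out = np.empty((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.tanh(np.sum(image[i:i + k, j:j + k] * kernel))
    return out

image = np.random.default_rng(2).normal(size=(16, 16))   # normalized digit image
kernel = np.random.default_rng(3).normal(size=(5, 5))    # 25 shared weights
print(shared_weight_layer(image, kernel).shape)           # (12, 12) feature map
```

Replicating the same weights everywhere encodes translation tolerance while keeping the number of parameters the learning algorithm must determine small.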
Optimal Brain Damage
LeCun, Yann, Denker, John S., Solla, Sara A.
We have used information-theoretic ideas to derive a class of practical and nearly optimal schemes for adapting the size of a neural network. By removing unimportant weights from a network, several improvements can be expected: better generalization, fewer training examples required, and improved speed of learning and/or classification. The basic idea is to use second-derivative information to make a tradeoff between network complexity and training set error. Experiments confirm the usefulness of the methods on a real-world application.

1 INTRODUCTION
Most successful applications of neural network learning to real-world problems have been achieved using highly structured networks of rather large size [for example (Waibel, 1989; Le Cun et al., 1990a)]. As applications become more complex, the networks will presumably become even larger and more structured.
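Here is a minimal sketch of how the second-derivative idea can be turned into a pruning rule: with a diagonal, quadratic approximation of the error, the saliency of a weight is half its diagonal Hessian entry times its squared value, and the lowest-saliency weights are removed. The weight and Hessian values below are random placeholders.

```python
# Minimal saliency-based pruning sketch under a diagonal-Hessian,
# quadratic-error assumption: s_k = 0.5 * h_kk * w_k**2.
import numpy as np

def prune_by_saliency(weights, hessian_diag, fraction=0.1):
    """Return a boolean mask that keeps all but the lowest-saliency weights."""
    saliency = 0.5 * hessian_diag * weights**2
    n_remove = int(fraction * weights.size)
    cutoff = np.sort(saliency)[n_remove]
    return saliency >= cutoff

rng = np.random.default_rng(4)
w = rng.normal(size=1000)           # trained weights (hypothetical)
h = np.abs(rng.normal(size=1000))   # diagonal second derivatives (hypothetical)
mask = prune_by_saliency(w, h, fraction=0.2)
print(f"kept {mask.sum()} of {w.size} weights")
```

In practice the network would be retrained after pruning, and the prune/retrain cycle repeated while the training-set error stays acceptable.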