AITopics | Learning Graphical Models

Collaborating Authors

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

The data association problem when monitoring robot vehicles using dynamic belief networks

Nicholson, A. | Brady, J. M.

ClassicsFeb-1-1992

Get Citation Alerts New Citation Alert added!

artificial intelligence, data association problem, machine learning, (11 more...)

Classics

Country:

North America > United States (0.06)
Europe > Germany > Hamburg (0.06)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

A Bayesian method for the induction of probabilistic networks from data

Cooper, G. | Herskovits, E.

ClassicsFeb-1-1992

This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computer-assisted hypothesis testing, automated scientific discovery, and automated construction of probabilistic expert systems. We extend the basic method to handle missing data and hidden (latent) variables. We show how to perform probabilistic inference by averaging over the inferences of multiple belief networks.

artificial intelligence, machine learning, probabilistic network, (5 more...)

Classics

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Convergence of a Neural Network Classifier

Baras, John S., LaVigna, Anthony

Neural Information Processing SystemsDec-31-1991

In this paper, we prove that the vectors in the LVQ learning algorithm converge. We do this by showing that the learning algorithm performs stochastic approximation. Convergence is then obtained by identifying the appropriate conditions on the learning rate and on the underlying statistics of the classification problem. We also present a modification to the learning algorithm which we argue results in convergence of the LVQ error to the Bayesian optimal error as the appropriate parameters become large.

algorithm, vector, voronoi vector, (10 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland > Prince George's County > College Park (0.15)
North America > United States > Texas (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.42)

Add feedback

Asymptotic slowing down of the nearest-neighbor classifier

Snapp, Robert R., Psaltis, Demetri, Venkatesh, Santosh S.

Neural Information Processing SystemsDec-31-1991

M2/n' for sufficiently large values of M. Here, Poo(error) denotes the probability of error in the infinite sample limit, and is at most twice the error of a Bayes classifier. Although the value of the coefficient a depends upon the underlying probability distributions, the exponent of M is largely distribution free. We thus obtain a concise relation between a classifier's ability to generalize from a finite reference sample and the dimensionality of the feature space, as well as an analytic validation of Bellman's well known "curse of dimensionality." 1 INTRODUCTION One of the primary tasks assigned to neural networks is pattern classification. Common applications include recognition problems dealing with speech, handwritten characters, DNA sequences, military targets, and (in this conference) sexual identity. Two fundamental concepts associated with pattern classification are generalization (how well does a classifier respond to input data it has never encountered before?) and scalability (how are a classifier's processing and training requirements affected by increasing the number of features that describe the input patterns?).

classifier, feature space, probability, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York (0.04)
(3 more...)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.58)

Add feedback

Transforming Neural-Net Output Levels to Probability Distributions

Denker, John S., LeCun, Yann

Neural Information Processing SystemsDec-31-1991

John S. Denker and Yann leCun AT&T Bell Laboratories Holmdel, NJ 07733 Abstract (1) The outputs of a typical multi-output classification network do not satisfy the axioms of probability; probabilities should be positive and sum to one. This problem can be solved by treating the trained network as a preprocessor that produces a feature vector that can be further processed, for instance by classical statistical estimation techniques. It is particularly useful to combine these two ideas: we implement the ideas of section 1 using Parzen windows, where the shape and relative size of each window is computed using the ideas of section 2. This allows us to make contact between important theoretical ideas (e.g. the ensemble formalism) and practical techniques (e.g. Our results also shed new light on and generalize the well-known "soft max" scheme. For example, in speech recognition, these numbers represent the probability of C different phonemes; the probabilities of successive segments can be combined using a Hidden Markov Model.

probability, training data, transforming neural-net output level, (11 more...)

Neural Information Processing Systems

Country: North America > United States > District of Columbia > Washington (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Add feedback

A Method for the Efficient Design of Boltzmann Machines for Classiffication Problems

Gupta, Ajay, Maass, Wolfgang

Neural Information Processing SystemsDec-31-1991

A Boltzmann machine ([AHS], [HS], [AK]) is a neural network model in which the units update their states according to a stochastic decision rule. It consists of a set U of units, a set C of unordered pairs of elements of U, and an assignment of connection strengths S: C -- R. A configuration of a Boltzmann machine is a map k: U -- {O, I}.

boltzmann machine, node, state change, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > New York (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)

Add feedback

On Stochastic Complexity and Admissible Models for Neural Network Classifiers

Smyth, Padhraic

Neural Information Processing SystemsDec-31-1991

For a detailed rationale the reader is referred to the work of Rissanen (1984) or Wallace and Freeman (1987) and the references therein. Note that the Minimum Description Length (MDL) technique (as Rissanen's approach has become known) is implicitly related to Maximum A Posteriori (MAP) Bayesian estimation techniques if cast in the appropriate framework.

admissible model, classification problem, description length, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Extensions of a Theory of Networks for Approximation and Learning: Outliers and Negative Examples

Girosi, Federico, Poggio, Tomaso, Caprile, Bruno

Neural Information Processing SystemsDec-31-1991

Learning an input-output mapping from a set of examples can be regarded as synthesizing an approximation of a multidimensional function.

extension, girosi, negative example, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > District of Columbia > Washington (0.04)
Europe > Italy (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.44)

Add feedback

RecNorm: Simultaneous Normalisation and Classification applied to Speech Recognition

Bridle, John S., Cox, Stephen J.

Neural Information Processing SystemsDec-31-1991

A particular form of neural network is described, which has terminals for acoustic patterns, class labels and speaker parameters. A method of training this network to "tune in" the speaker parameters to a particular speaker is outlined, based on a trick for converting a supervised network to an unsupervised mode. We describe experiments using this approach in isolated word recognition based on whole-word hidden Markov models. The results indicate an improvement over speaker-independent performance and, for unlabelled data, a performance close to that achieved on labelled data. 1 INTRODUCTION We are concerned to emulate some aspects of perception. In particular, the way that a stimulus which is ambiguous, perhaps because of unknown lighting conditions, can become unambiguous in the context of other such stimuli: the fact that they are subject to tbe same unknown conditions gives our perceptual apparatus enough constraints to solve tbe problem. Individual words are often ambiguous even to human listeners. For instance a Cockney might say the word "ace" to sound the same as a Standard English speaker's "ice". Similarly with "room" and "rum", or "work" and "walk" ill other pairs of British English accents. If we heard one of these ambiguous pronunciations, knowing nothing else about the speaker we could not tell which word had been said.

adaptation, transformation, utterance, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.92)

Add feedback

Speech Recognition Using Demi-Syllable Neural Prediction Model

Iso, Ken-ichi, Watanabe, Takao

Neural Information Processing SystemsDec-31-1991

The Neural Prediction Model is the speech recognition model based on pattern prediction by multilayer perceptrons. Its effectiveness was confirmed by the speaker-independent digit recognition experiments. This paper presents an improvement in the model and its application to large vocabulary speech recognition, based on subword units. The improvement involves an introduction of "backward prediction," which further improves the prediction accuracy of the original model with only "forward prediction". In application of the model to speaker-dependent large vocabulary speech recognition, the demi-syllable unit is used as a subword recognition unit.

feature vector, prediction, recognition, (12 more...)

Neural Information Processing Systems

Country: Asia > Japan (0.04)

Genre: Research Report > Promising Solution (0.35)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Add feedback