AITopics | Learning Graphical Models

Collaborating Authors

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Probabilistic Horn abduction and Bayesian networks

Poole, D.

ClassicsFeb-1-1993

This paper presents a simple framework for Horn-clause abduction, with probabilities associated with hypotheses. The framework incorporates assumptions about the rule base and independence assumptions amongst hypotheses. It is shown how any probabilistic knowledge representable in a discrete Bayesian belief network can be represented in this framework. The main contribution is in finding a relationship between logical and probabilistic notions of evidential reasoning. This provides a useful representation language in its own right, providing a compromise between heuristic and epistemic adequacy. It also shows how Bayesian networks can be extended beyond a propositional language.

artificial intelligence, bayesian inference, machine learning, (4 more...)

Classics

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Approximating probabilistic inference in Bayesian belief networks is NP-hard

Dagum, P. | Luby, M.

ClassicsFeb-1-1993

It is known that exact computation of conditional probabilities in belief networks is NP-hard. Many investigators in the AI community have tacitly assumed that algorithms for performing approximate inference with belief networks are of polynomial complexity. Indeed, special cases of approximate inference can be performed in time polynomial in the input size. However, we have discovered that the general problem of approximating conditional probabilities with belief networks, like exact inference, resides in the NP-hard complexity class. We develop a complexity analysis to elucidate the difficulty of approximate probabilistic inference.

artificial intelligence, belief network, machine learning, (8 more...)

Classics

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

The mathematics of statistical machine translation: Parameter estimation

Brown, P. F., Della Pietra, S. A., Della Pietra, V. J., Mercer, R. L.

ClassicsFeb-1-1993

English-to-French I f _[ French-to-English We present an overview of Candide, a system for automatic e Channel " -] Decoder 6 translation of French text to English text.

probability, translation, translation model, (15 more...)

Classics

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Best-First Model Merging for Dynamic Learning and Recognition

Omohundro, Stephen M.

Neural Information Processing SystemsDec-31-1992

"Best-first model merging" is a general technique for dynamically choosing the structure of a neural or related architecture while avoiding overfitting. It is applicable to both leaming and recognition tasks and often generalizes significantly better than fixed structures. We demonstrate the approach applied to the tasks of choosing radial basis functions for function learning, choosing local affine models for curve and constraint surface modelling, and choosing the structure of a balltree or bumptree to maximize efficiency of access.

artificial intelligence, best-first model merging, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
(2 more...)

Add feedback

Time-Warping Network: A Hybrid Framework for Speech Recognition

Levin, Esther, Pieraccini, Roberto, Bocchieri, Enrico

Neural Information Processing SystemsDec-31-1992

Such systems attempt to combine the best features of both models: the temporal structure of HMMs and the discriminative power of neural networks. In this work we define a time-warping (1W) neuron that extends the operation of the fonnal neuron of a back-propagation network by warping the input pattern to match it optimally to its weights. We show that a single-layer network of TW neurons is equivalent to a Gaussian density HMMbased recognition system.

neuron, recognizer, time-warping network, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > Canada > Ontario > Toronto (0.04)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.33)

Add feedback

Multi-State Time Delay Networks for Continuous Speech Recognition

Haffner, Patrick, Waibel, Alex

Neural Information Processing SystemsDec-31-1992

We present the "Multi-State Time Delay Neural Network" (MS-TDNN) as an extension of the TDNN to robust word recognition.

procedure, recognition, training procedure, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.95)

Add feedback

Best-First Model Merging for Dynamic Learning and Recognition

Omohundro, Stephen M.

Neural Information Processing SystemsDec-31-1992

best-first model merging, complex model, dynamic learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
(2 more...)

Add feedback

Unsupervised learning of distributions on binary vectors using two layer networks

Freund, Yoav, Haussler, David

Neural Information Processing SystemsDec-31-1992

We study a particular type of Boltzmann machine with a bipartite graph structure called a harmonium. Our interest is in using such a machine to model a probability distribution on binary input vectors. We analyze the class of probability distributions that can be modeled by such machines.

boltzmann machine, harmonium model, vector, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.37)

Add feedback

Estimating Average-Case Learning Curves Using Bayesian, Statistical Physics and VC Dimension Methods

Haussler, David, Kearns, Michael, Opper, Manfred, Schapire, Robert

Neural Information Processing SystemsDec-31-1992

In this paper we investigate an average-case model of concept learning, and give results that place the popular statistical physics and VC dimension theories of learning curve behavior in a common framework.

algorithm, gibbs algorithm, probability, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
North America > United States > New Jersey (0.05)
North America > United States > New York (0.04)
Europe > Germany (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Bayesian Model Comparison and Backprop Nets

MacKay, David J. C.

Neural Information Processing SystemsDec-31-1992

The Bayesian model comparison framework is reviewed, and the Bayesian Occam's razor is explained. This framework can be applied to feedforward networks, making possible (1) objective comparisons between solutions using alternative network architectures; (2) objective choice of magnitude and type of weight decay terms; (3) quantified estimates of the error bars on network parameters and on network output. The framework also generates a measure of the effective number of parameters determined by the data. The relationship of Bayesian model comparison to recent work on prediction of generalisation ability (Guyon et al., 1992, Moody, 1992) is discussed.

error bar, inference, occam factor, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > United Kingdom (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback