AITopics | Learning Graphical Models

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

High-Dimensional Inference with the generalized Hopfield Model: Principal Component Analysis and Corrections

Cocco, Simona, Monasson, Remi, Sessak, Vitor

arXiv.org Machine Learning Apr-19-2011

Understanding the patterns of correlations between the components of complex systems is a fundamental issue in various scientific fields, ranging from neurobiology to genomics, from finance to sociology,... A recurrent problem is to distinguish between direct correlations, produced by physiological or functional interactions between the components, and network correlations, which are mediated by other, third-party components. Various approaches have been proposed to infer interactions from correlations, exploiting concepts related to statistical dimensional reduction [1], causality [2], the maximum entropy principle [3], Markov random fields [4]... A major practical and theoretical difficulty in doing so is the paucity and the quality of data: reliable analysis should be able to unveil real patterns of interactions, even if measures are affected by under- or noisy sampling. The size of the interaction network can be comparable to or larger than the number of data, a situation referred to as high-dimensional inference. The purpose of the present work is to establish a quantitative correspondence between two of those approaches, namely the inference of Boltzmann Machines (also called Ising models in statistical physics and undirected graphical models for discrete variables in statistical inference [4]) and Principal Component Analysis (PCA) [1]. Inverse Boltzmann Machines (BM) are a mathematically well-founded but computationally challenging approach to infer interactions from correlations.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.83.051123

1104.3665

Country:

Europe (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Understanding Exhaustive Pattern Learning

Shen, Libin

arXiv.org Artificial Intelligence Apr-19-2011

Pattern learning is an important problem in Natural Language Processing (NLP). Some exhaustive pattern learning (EPL) methods (Bod, 1992) were proved to be flawed (Johnson, 2002), while similar algorithms (Och and Ney, 2004) showed great advantages on other tasks, such as machine translation. In this article, we first formalize EPL, and then show that the probability given by an EPL model is a constant-factor approximation of the probability given by an ensemble method that integrates an exponential number of models obtained with various segmentations of the training data. This work for the first time provides theoretical justification for the widely used EPL algorithm in NLP, which was previously viewed as a flawed heuristic method. A better understanding of EPL may lead to improved pattern learning algorithms in the future.

machine learning, natural language, segmentation, (18 more...)

arXiv.org Artificial Intelligence

1104.3929

Country: North America > United States (0.68)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)

Automatic Discovery and Transfer of Task Hierarchies in Reinforcement Learning

Mehta, Neville (Oregon State University) | Ray, Soumya (Case Western Reserve University) | Tadepalli, Prasad (Oregon State University) | Dietterich, Thomas (Oregon State University)

AI Magazine Apr-18-2011

Sequential decision tasks present many opportunities for the study of transfer learning. A principal one among them is the existence of multiple domains that share the same underlying causal structure for actions. We describe an approach that exploits this shared causal structure to discover a hierarchical task structure in a source domain, which in turn speeds up learning of task execution knowledge in a new target domain. Our approach is theoretically justified and compares favorably to manually designed task hierarchies in learning efficiency in the target domain. We demonstrate that causally motivated task hierarchies transfer more robustly than other kinds of detailed knowledge that depend on the idiosyncrasies of the source domain and are hence less transferable.

hierarchy, task hierarchy, trajectory, (17 more...)

AI Magazine

Country:

North America > United States > New York (0.04)
North America > United States > Oregon (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(5 more...)

Industry:

Leisure & Entertainment (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Deep Transfer: A Markov Logic Approach

Davis, Jesse (Katholieke Universiteit Leuven) | Domingos, Pedro (University of Washington)

AI Magazine Apr-18-2011

This article argues that currently the largest gap between human and machine learning is learning algorithms' inability to perform deep transfer, that is, generalize from one domain to another domain containing different objects, classes, properties and relations. We argue that second-order Markov logic is ideally suited for this purpose, and propose an approach based on it. Our algorithm discovers structural regularities in the source domain in the form of Markov logic formulas with predicate variables, and instantiates these formulas with predicates from the target domain. Our approach has successfully transferred learned knowledge among molecular biology, Web and social network domains.

artificial intelligence, formula, machine learning, (14 more...)

AI Magazine

Country: North America > United States > Wisconsin (0.15)

Genre: Personal > Honors (0.48)

Industry:

Information Technology (0.50)
Government (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)

Bayesian inference for queueing networks and modeling of internet services

Sutton, Charles, Jordan, Michael I.

arXiv.org Machine Learning Apr-15-2011

Modern Internet services, such as those at Google, Yahoo!, and Amazon, handle billions of requests per day on clusters of thousands of computers. Because these services operate under strict performance requirements, a statistical understanding of their performance is of great practical interest. Such services are modeled by networks of queues, where each queue models one of the computers in the system. A key challenge is that the data are incomplete, because recording detailed information about every request to a heavily used system can require unacceptable overhead. In this paper we develop a Bayesian perspective on queueing models in which the arrival and departure times that are not observed are treated as latent variables. Underlying this viewpoint is the observation that a queueing model defines a deterministic transformation between the data and a set of independent variables called the service times. With this viewpoint in hand, we sample from the posterior distribution over missing data and model parameters using Markov chain Monte Carlo. We evaluate our framework on data from a benchmark Web application. We also present a simple technique for selection among nested queueing models. We are unaware of any previous work that considers inference in networks of queues in the presence of missing data.
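The "deterministic transformation" between observed times and service times mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible case, a single-server first-in-first-out queue with fully observed arrivals and departures; the function names are hypothetical and this is not the paper's network model or its sampler.

```python
from typing import List


def departures_from(arrivals: List[float], services: List[float]) -> List[float]:
    """Forward map: arrival and service times -> departure times.

    Assumes a single-server FIFO queue: a job starts service when it has
    arrived and the previous job has departed.
    """
    departures = []
    prev_departure = float("-inf")
    for arrival, service in zip(arrivals, services):
        prev_departure = max(arrival, prev_departure) + service
        departures.append(prev_departure)
    return departures


def service_times(arrivals: List[float], departures: List[float]) -> List[float]:
    """Inverse map: arrival and departure times -> service times."""
    services = []
    prev_departure = float("-inf")
    for arrival, departure in zip(arrivals, departures):
        start = max(arrival, prev_departure)  # time service begins
        services.append(departure - start)
        prev_departure = departure
    return services


arrivals = [0.0, 1.0, 5.0]
services = [2.0, 2.0, 1.0]
departures = departures_from(arrivals, services)
recovered = service_times(arrivals, departures)
```

Because the two maps invert each other in this simplified setting, unobserved arrival or departure times can be treated as latent variables whose values are constrained by independent service times, which is the viewpoint the abstract describes.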

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

doi: 10.1214/10-AOAS392

1001.3355

Country: North America > United States > California (0.46)

Genre: Research Report (1.00)

Industry:

Consumer Products & Services > Travel (0.39)
Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Communications > Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Unified Treatment of Hidden Markov Switching Models

Chiappa, Silvia

arXiv.org Machine Learning Apr-11-2011

Several problems encountered in application areas such as finance, biology, speech analysis, control engineering, and robotics require the modeling of time series that switch among different dynamical regimes (see Ephraim (2002) for a review). For example, system fault diagnosis deals with detecting behavioural deviations from normality caused by failures in the system. Such modeling is often achieved by employing probabilistic approaches in which regime switching is described by a set of discrete hidden random variables, related by a first-order Markovian dependence. All such models, which we call hidden Markov switching models (HMSMs), can be viewed as extensions of the popular hidden Markov model (Rabiner, 1989). The wide interdisciplinary attention to this research area has produced many different HMSMs as well as different approaches and implementations of HMSMs of fundamentally similar structure, resulting in a dense literature from which extracting differences and commonalities among models is often challenging. In this paper we provide a simple unified treatment of existing HMSMs, highlighting properties and connections that were not observed in previous review papers (Ephraim, 2002; Gales and Young, 1993; Murphy, 2002; Ostendorf et al., 1996; Rabiner, 1989; Yu, 2010), and introduce novel extensions. Our exposition enables a deep understanding of the fundamental structure and relations of different approaches. This is achieved by using the framework of graphical models, which allows one to easily define complex models by using a graphical representation and to derive efficient inference routines by visual inspection of the graph, avoiding complex algebraic manipulations.

artificial intelligence, machine learning, regime, (18 more...)

arXiv.org Machine Learning

1104.1992

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model

Shimizu, Shohei, Inazumi, Takanori, Sogawa, Yasuhiro, Hyvarinen, Aapo, Kawahara, Yoshinobu, Washio, Takashi, Hoyer, Patrik O., Bollen, Kenneth

arXiv.org Machine Learning Apr-7-2011

Structural equation models and Bayesian networks have been widely used to analyze causal relations between continuous variables. In such frameworks, linear acyclic models are typically used to model the data-generating process of variables. Recently, it was shown that use of non-Gaussianity identifies the full structure of a linear acyclic model, i.e., a causal ordering of variables and their connection strengths, without using any prior knowledge on the network structure, which is not the case with conventional methods. However, existing estimation methods are based on iterative search algorithms and may not converge to a correct solution in a finite number of steps. In this paper, we propose a new direct method to estimate a causal ordering and connection strengths based on non-Gaussianity. In contrast to the previous methods, our algorithm requires no algorithmic parameters and is guaranteed to converge to the right solution within a small fixed number of steps if the data strictly follows the model.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1101.2489

Country: North America > United States > North Carolina (0.28)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments

Sun, Yi, Gomez, Faustino, Schmidhuber, Juergen

arXiv.org Machine Learning Mar-29-2011

To maximize its success, an AGI typically needs to explore its initially unknown world. Is there an optimal way of doing so? Here we derive an affirmative answer for a broad class of environments.

artificial intelligence, information gain, machine learning, (15 more...)

arXiv.org Machine Learning

1103.5708

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Classification of Sets using Restricted Boltzmann Machines

Louradour, Jérôme, Larochelle, Hugo

arXiv.org Machine Learning Mar-24-2011

We consider the problem of classification when inputs correspond to sets of vectors. This setting occurs in many problems such as the classification of pieces of mail containing several pages, of web sites with several sections or of images that have been pre-segmented into smaller regions. We propose generalizations of the restricted Boltzmann machine (RBM) that are appropriate in this context and explore how to incorporate different assumptions about the relationship between the input sets and the target class within the RBM. In experiments on standard multiple-instance learning datasets, we demonstrate the competitiveness of approaches based on RBMs and apply the proposed variants to the problem of incoming mail classification.

artificial intelligence, machine learning, vector, (16 more...)

arXiv.org Machine Learning

1103.4896

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)

Artificial Intelligence and Risk Communication

Green, Nancy L. (University of North Carolina Greensboro)

AAAI Conferences Mar-19-2011

The challenges of effective health risk communication are well known. This paper provides pointers to the health communication literature that discusses these problems. Tailoring printed information, visual displays, and interactive multimedia have been proposed in the health communication literature as promising approaches. On-line risk communication applications are increasing on the internet. However, the potential effectiveness of applications using conventional computer technology is limited. We propose that the use of artificial intelligence, building upon research in Intelligent Tutoring Systems, might be able to overcome these limitations.

communication, information, risk communication, (12 more...)

AAAI Conferences

2011 AAAI Spring Symposium Series

Country:

North America > United States > North Carolina > Guilford County > Greensboro (0.14)
North America > United States > Oregon (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Burlington (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.72)
Education > Educational Technology > Educational Software > Computer Based Training (0.55)
Education > Educational Setting > K-12 Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
