AITopics | Learning Graphical Models

Collaborating Authors

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Universal Algorithmic Intelligence: A mathematical top->down approach

Hutter, Marcus

arXiv.org Artificial IntelligenceDec-1-2009

Sequential decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameter-free theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible. We outline how the AIXI model can formally solve a number of problem classes, including sequence prediction, strategic games, function minimization, reinforcement and supervised learning. The major drawback of the AIXI model is that it is uncomputable. To overcome this problem, we construct a modified algorithm AIXItl that is still effectively more intelligent than any other time t and length l bounded agent. The computation time of AIXItl is of the order t x 2^l. The discussion includes formal definitions of intelligence order relations, the horizon problem and relations of the AIXI theory to other AI approaches.

artificial intelligence, logic & formal reasoning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

cs/0701125

Country:

Europe (0.92)
North America > United States > Massachusetts > Middlesex County (0.28)
North America > United States > California (0.27)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Chess (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
(3 more...)

Add feedback

Penalized Likelihood Methods for Estimation of Sparse High Dimensional Directed Acyclic Graphs

Shojaie, Ali, Michailidis, George

arXiv.org Machine LearningNov-28-2009

Directed acyclic graphs (DAGs) are commonly used to represent causal relationships among random variables in graphical models. Applications of these models arise in the study of physical, as well as biological systems, where directed edges between nodes represent the influence of components of the system on each other. The general problem of estimating DAGs from observed data is computationally NP-hard, Moreover two directed graphs may be observationally equivalent. When the nodes exhibit a natural ordering, the problem of estimating directed graphs reduces to the problem of estimating the structure of the network. In this paper, we propose a penalized likelihood approach that directly estimates the adjacency matrix of DAGs. Both lasso and adaptive lasso penalties are considered and an efficient algorithm is proposed for estimation of high dimensional DAGs. We study variable selection consistency of the two penalties when the number of variables grows to infinity with the sample size. We show that although lasso can only consistently estimate the true network under stringent assumptions, adaptive lasso achieves this task under mild regularity conditions. The performance of the proposed methods is compared to alternative methods in simulated, as well as real, data examples.

artificial intelligence, dag, machine learning, (18 more...)

arXiv.org Machine Learning

0911.5439

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Likelihood-based semi-supervised model selection with applications to speech processing

White, Christopher M., Khudanpur, Sanjeev P., Wolfe, Patrick J.

arXiv.org Machine LearningNov-19-2009

In conventional supervised pattern recognition tasks, model selection is typically accomplished by minimizing the classification error rate on a set of so-called development data, subject to ground-truth labeling by human experts or some other means. In the context of speech processing systems and other large-scale practical applications, however, such labeled development data are typically costly and difficult to obtain. This article proposes an alternative semi-supervised framework for likelihood-based model selection that leverages unlabeled data by using trained classifiers representing each model to automatically generate putative labels. The errors that result from this automatic labeling are shown to be amenable to results from robust statistics, which in turn provide for minimax-optimal censored likelihood ratio tests that recover the nonparametric sign test as a limiting case. This approach is then validated experimentally using a state-of-the-art automatic speech recognition system to select between candidate word pronunciations using unlabeled speech data that only potentially contain instances of the words under test. Results provide supporting evidence for the utility of this approach, and suggest that it may also find use in other applications of machine learning.

artificial intelligence, machine learning, pronunciation, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/JSTSP.2010.2076050

0911.3944

Country: North America > United States (0.67)

Genre: Research Report (0.50)

Industry: Law > Civil Rights & Constitutional Law (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.67)
(2 more...)

Add feedback

Multilingual Part-of-Speech Tagging: Two Unsupervised Approaches

Naseem, T., Snyder, B., Eisenstein, J., Barzilay, R.

Journal of Artificial Intelligence ResearchNov-17-2009

We demonstrate the effectiveness of multilingual learning for unsupervised part-of-speech tagging. The central assumption of our work is that by combining cues from multiple languages, the structure of each becomes more apparent. We consider two ways of applying this intuition to the problem of unsupervised part-of-speech tagging: a model that directly merges tag structures for a pair of languages into a single sequence and a second model which instead incorporates multilingual context using latent variables. Both approaches are formulated as hierarchical Bayesian models, using Markov Chain Monte Carlo sampling techniques for inference. Our results demonstrate that by incorporating multilingual evidence we can achieve impressive performance gains across a range of scenarios. We also found that performance improves steadily as the number of available languages increases.

alignment, latent variable model, node model, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2843

AI Access Foundation

10630

Journal of Artificial Intelligence Research

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Hierarchical Bayesian Model for Frame Representation

Chaâri, L., Pesquet, J. -C., Tourneret, J. -Y., Ciuciu, Ph., Benazza-Benyahia, A.

arXiv.org Machine LearningNov-15-2009

In many signal processing problems, it may be fruitful to represent the signal under study in a frame. If a probabilistic approach is adopted, it becomes then necessary to estimate the hyper-parameters characterizing the probability distribution of the frame coefficients. This problem is difficult since in general the frame synthesis operator is not bijective. Consequently, the frame coefficients are not directly observable. This paper introduces a hierarchical Bayesian model for frame representation. The posterior distribution of the frame coefficients and model hyper-parameters is derived. Hybrid Markov Chain Monte Carlo algorithms are subsequently proposed to sample from this posterior distribution. The generated samples are then exploited to estimate the hyper-parameters and the frame coefficients of the target signal. Validation experiments show that the proposed algorithms provide an accurate estimation of the frame coefficients and hyper-parameters. Application to practical problems of image denoising show the impact of the resulting Bayesian estimation on the recovered signal quality.

artificial intelligence, frame coefficient, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1109/TSP.2010.2055562

0911.2888

Country: Europe > France (0.93)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Managing Conversation Uncertainty in TutorJ

Cannella, Vincenzo (University of Palermo) | Pirrone, Roberto (University of Palermo)

AAAI ConferencesNov-3-2009

Uncertainty in natural language dialogue is often treated through stochastic models. Some of the authors already presented TutorJ that is an Intelligent Tutoring System, whose interaction with the user is very intensive, and makes use of both dialogic and graphical modality. When managing the interaction, the system needs to cope with uncertainty due to the understanding of the user's needs and wishes. In this paper we present the extended version of TutorJ, focusing on the new features added to its chatbot module. These features allow to merge deterministic and probabilistic reasoning in dialogue management, and in writing the rules of the system's procedural memory.

category, chatbot, interaction, (17 more...)

AAAI Conferences

2009 AAAI Fall Symposium Series

Country:

North America > United States > District of Columbia > Washington (0.04)
Europe > Italy > Sicily > Palermo (0.04)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
(2 more...)

Add feedback

Towards Uniform Implementation of Architectural Diversity

Rosenbloom, Paul S. (University of Southern California)

AAAI ConferencesNov-3-2009

Multi-representational architectures exploit diversity to yield the breadth of capabilities required for intelligent behavior in the world, but in so doing can sacrifice too much of the complementary benefits of architectural uniformity. The proposal here is to couple the benefits of diversity and uniformity through establishment of a uniform graph-based implementation level for diverse architectures.

architecture, artificial intelligence, machine learning, (13 more...)

AAAI Conferences

2009 AAAI Fall Symposium Series

Country:

Asia > Middle East > Jordan (0.07)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Oceania > Fiji (0.05)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.30)

Add feedback

Promoting Motivation and Self-Regulated Learning Skills through Social Interactions in Agent-based Learning Environments

Biswas, Gautam (Vanderbilt University) | Jeong, Hogyeong (Vanderbilt University) | Roscoe, Rod (Vanderbilt University) | Sulcer, Brian (Vanderbilt University)

AAAI ConferencesNov-3-2009

We have developed computer environments that support learning by teaching and the use of self regulated learning (SRL) skills through interactions with virtual agents. More specifically, students teach a computer agent, Betty, and can monitor her progress by asking her questions and getting her to take quizzes. The system provides SRL support via dialog-embedded prompts by Betty, the teachable agent, and Mr. Davis, the mentor agent. Our primary goals have been to support learning in complex science domains and facilitate development of metacognitive skills. More recently, we have also employed sequence analysis schemes and hidden Markov model (HMM) methods for assigning context to and deriving aggregated student behavior sequences from activity data. These techniques allow us to go beyond analyses of individual behaviors, instead examining how these behaviors cohere in larger patterns. We discuss the information derived from these models, and draw inferences on students’ use of self-regulated learning strategies.

artificial intelligence, machine learning, student, (17 more...)

AAAI Conferences

2009 AAAI Fall Symposium Series

Country:

North America > United States > Tennessee > Davidson County > Nashville (0.14)
North America > United States > District of Columbia > Washington (0.14)
North America > Canada > Quebec > Montreal (0.04)
(3 more...)

Genre:

Research Report (0.68)
Instructional Material (0.68)

Industry:

Education > Assessment & Standards > Student Performance (0.46)
Education > Focused Education (0.40)
Education > Social Development & Welfare (0.36)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Causal Inference on Discrete Data using Additive Noise Models

Peters, Jonas, Janzing, Dominik, Schölkopf, Bernhard

arXiv.org Machine LearningNov-2-2009

Inferring causal relations by analyzing statistical dependences among observed random variables is a challenging task if no controlled randomized experiments are available. Socalled constraint-based approaches to causal discovery (Pearl, 2000; Spirtes et al., 1993) select among all directed acyclic graphs (DAGs) those that satisfy the Markov condition and the faithfulness assumption, i.e., those for which the observed independences are imposed by the structure rather than being a result of specific choices of parameters of the Bayesian network. These approaches are unable to distinguish among causal DAGs that impose the same independences. In particular, it is impossible to distinguish between X Y and Y X. More recently, several methods have been suggested that use not only conditional independences, but also more sophisticated properties of the joint distribution. For simplicity, we explain the ideas for the two variable setting since this case is particularly challenging. Kano & Shimizu (2003) use models Y f(X) N (1) where f is a linear function and N is additive noise that is independent of the hypothetical cause X. This is an example for an additive noise model from X to Y. Apart from trivial

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1109/TPAMI.2011.71

0911.0280

Country:

North America > United States (0.46)
Europe > Germany (0.28)
North America > Canada (0.28)
Asia > Japan (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.74)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Relaxed Survey Propagation for The Weighted Maximum Satisfiability Problem

Chieu, H.L., Lee, W.S.

Journal of Artificial Intelligence ResearchOct-30-2009

The survey propagation (SP) algorithm has been shown to work well on large instances of the random 3-SAT problem near its phase transition. It was shown that SP estimates marginals over covers that represent clusters of solutions. The SP-y algorithm generalizes SP to work on the maximum satisfiability (Max-SAT) problem, but the cover interpretation of SP does not generalize to SP-y. In this paper, we formulate the relaxed survey propagation (RSP) algorithm, which extends the SP algorithm to apply to the weighted Max-SAT problem. We show that RSP has an interpretation of estimating marginals over covers violating a set of clauses with minimal weight. This naturally generalizes the cover interpretation of SP. Empirically, we show that RSP outperforms SP-y and other state-of-the-art Max-SAT solvers on random Max-SAT instances. RSP also outperforms state-of-the-art weighted Max-SAT solvers on random weighted Max-SAT instances.

Add feedback