AITopics | Learning Graphical Models

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

Reasoning about Deterministic Actions with Probabilistic Prior and Application to Stochastic Filtering

Hajishirzi, Hannaneh (University of Illinois at Urbana-Champaign) | Amir, Eyal (University of Illinois at Urbana-Champaign)

AAAI Conferences, May-9-2010

We present a novel algorithm and a new understanding of reasoning about a sequence of deterministic actions with a probabilistic prior. When the initial state of a dynamic system is unknown, a probability distribution can still be specified over the initial states. Estimating the posterior distribution over states (filtering) after some deterministic actions have occurred is a problem relevant to AI planning, natural language processing (NLP), and robotics, among others. Current approaches to filtering deterministic actions are not tractable even if the distribution over the initial system state is represented compactly. The reason is that state variables become correlated after a few steps. The main innovation in this paper is a method for sidestepping this problem by redefining state variables dynamically at each time step such that the posterior for time t is represented in a factored form. This update is done using a progression algorithm as a subroutine, and our algorithm's tractability follows when that subroutine is tractable. Our results are for general deterministic actions and, in particular, our algorithm is tractable for one-to-one and STRIPS actions. We apply our algorithm for reasoning about deterministic actions to reasoning about sequences of probabilistic actions and improve the efficiency of current probabilistic reasoning approaches. We demonstrate the efficiency of the new algorithm empirically over AI-Planning data sets.

algorithm, probability, sequence, (17 more...)

AAAI Conferences

Twelfth International Conference on the Principles of Knowledge Representation and Reasoning

Country:

North America > United States > Illinois (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre:

Workflow (0.47)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Novel Semantical Approaches to Relational Probabilistic Conditionals

Kern-Isberner, Gabriele (Technische Universität Dortmund) | Thimm, Matthias (Technische Universität Dortmund)

AAAI Conferences, May-9-2010

It seems to be a common view that in order to interpret probabilistic first-order sentences, either a statistical approach that counts (tuples of) individuals has to be used, or the knowledge base has to be grounded to make a possible worlds semantics applicable, for a subjective interpretation of probabilities. In this paper, we propose novel semantical perspectives on first-order (or relational) probabilistic conditionals that are motivated by considering them as subjective, but population-based statements. We propose two different semantics for relational probabilistic conditionals, and a set of postulates for suitable inference operators in this framework. Finally, we present two inference operators by applying the maximum entropy principle to the respective model theories. Both operators are shown to yield reasonable inferences according to the postulates.

knowledge base, probability, prototypical indifference, (11 more...)

AAAI Conferences

Twelfth International Conference on the Principles of Knowledge Representation and Reasoning

Country:

Europe > Germany (0.04)
North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)

Active Learning for Hidden Attributes in Networks

Yan, Xiaoran, Zhu, Yaojia, Rouquier, Jean-Baptiste, Moore, Cristopher

arXiv.org Machine Learning, May-5-2010

In many networks, vertices have hidden attributes, or types, that are correlated with the network's topology. If the topology is known but these attributes are not, and if learning the attributes is costly, we need a method for choosing which vertex to query in order to learn as much as possible about the attributes of the other vertices. We assume the network is generated by a stochastic block model, but we make no assumptions about its assortativity or disassortativity. We choose which vertex to query using two methods: 1) maximizing the mutual information between its attributes and those of the others (a well-known approach in active learning) and 2) maximizing the average agreement between two independent samples of the conditional Gibbs distribution. Experimental results show that both of these methods do much better than simple heuristics. They also consistently identify certain vertices as important by querying them early on.

algorithm, block model, vertex, (13 more...)

arXiv.org Machine Learning

1005.0794

Country:

Southern Ocean > Weddell Sea (0.05)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

The Production of Probabilistic Entropy in Structure/Action Contingency Relations

Leydesdorff, Loet

arXiv.org Artificial Intelligence, May-5-2010

Luhmann (1984) defined society as a communication system which is structurally coupled to, but not an aggregate of, human action systems. The communication system is then considered as self-organizing ("autopoietic"), as are human actors. Communication systems can be studied by using Shannon's (1948) mathematical theory of communication. The update of a network by action at one of the local nodes is then a well-known problem in artificial intelligence (Pearl 1988). By combining these various theories, a general algorithm for probabilistic structure/action contingency can be derived. The consequences of this contingency for each system, its consequences for their further histories, and the stabilization on each side by counterbalancing mechanisms are discussed, in both mathematical and theoretical terms. An empirical example is elaborated.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Artificial Intelligence

1005.0707

Country:

North America > United States (0.68)
Europe (0.68)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data

Shah, Mohak, Marchand, Mario, Corbeil, Jacques

arXiv.org Artificial Intelligence, May-4-2010

One of the objectives of designing feature selection learning algorithms is to obtain classifiers that depend on a small number of attributes and have verifiable future performance guarantees. There are few, if any, approaches that successfully address the two goals simultaneously. Performance guarantees become crucial for tasks such as microarray data analysis, where very small sample sizes limit empirical evaluation. To the best of our knowledge, algorithms that give theoretical bounds on future performance have not been proposed so far in the context of the classification of gene expression data. In this work, we investigate the premise of learning a conjunction (or disjunction) of decision stumps in Occam's Razor, Sample Compression, and PAC-Bayes learning settings for identifying a small subset of attributes that can be used to perform reliable classification tasks. We apply the proposed approaches for gene identification from DNA microarray data and compare our results to those of well-known successful approaches proposed for the task. We show that our algorithm not only finds hypotheses with a much smaller number of genes while giving competitive classification accuracy but also has tight risk guarantees on future performance, unlike other approaches. The proposed approaches are general and extensible in terms of both designing novel algorithms and application to other domains.
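
To make the hypothesis class in this abstract concrete, here is a minimal Python sketch of a conjunction of decision stumps. It is not the authors' learning algorithm and carries none of the risk bounds they describe; the `Stump` representation, the function names, and the toy expression values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Set


@dataclass(frozen=True)
class Stump:
    """A one-feature threshold test (hypothetical representation)."""
    feature: int      # index of the attribute (e.g. a gene) being tested
    threshold: float  # cut point on that attribute
    direction: int    # +1: fires when x > threshold; -1: fires when x <= threshold

    def fires(self, x: Sequence[float]) -> bool:
        above = x[self.feature] > self.threshold
        return above if self.direction == 1 else not above


def predict_conjunction(stumps: Sequence[Stump], x: Sequence[float]) -> int:
    """Return 1 only if every stump fires, otherwise 0."""
    return int(all(s.fires(x) for s in stumps))


def attributes_used(stumps: Sequence[Stump]) -> Set[int]:
    """The classifier depends only on these feature indices."""
    return {s.feature for s in stumps}


# Toy example with made-up expression values for four genes.
stumps = [Stump(feature=0, threshold=2.5, direction=1),
          Stump(feature=3, threshold=1.0, direction=-1)]
sample_a = [3.1, 0.0, 9.9, 0.4]  # gene 0 high, gene 3 low: both stumps fire
sample_b = [3.1, 0.0, 9.9, 1.7]  # gene 3 too high: second stump fails
```

The classifier reads only the features named by its stumps, which is the sense in which a short conjunction selects a small subset of attributes.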

bioinformatics, classifier, machine learning, (20 more...)

arXiv.org Artificial Intelligence

1005.053

Country:

North America > United States (0.46)
North America > Canada > Quebec (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.68)
Health & Medicine > Therapeutic Area > Hematology (0.68)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
(2 more...)

When Policies Can Be Trusted: Analyzing a Criteria to Identify Optimal Policies in MDPs with Unknown Model Parameters

Brunskill, Emma (University of California, Berkeley)

AAAI Conferences, May-1-2010

Computing a good policy in stochastic, uncertain environments with unknown dynamics and reward model parameters is a challenging task. In a number of domains, ranging from space robotics to epilepsy management, it may be possible to have an initial training period when suboptimal performance is permitted. For such problems it is important to be able to identify when this training period is complete and the computed policy can be used with high confidence in its future performance. A simple, principled criterion for identifying when training has completed is that the error bounds on the value estimates of the current policy are sufficiently small that the optimal policy is fixed, with high probability. We present an upper bound on the amount of training data required to identify the optimal policy as a function of the unknown separation gap between the optimal and the next-best policy values. We illustrate with several small problems that, by estimating this gap in an online manner, the number of training samples needed to provably reach optimality can be significantly lower than predicted offline using a Probably Approximately Correct framework that requires an input epsilon parameter.
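
The stopping idea described in this abstract can be sketched in Python as follows. This is a simplified illustration, not the paper's procedure: it assumes a single symmetric error bound that holds for every policy's value estimate with high probability, and the function and parameter names are hypothetical.

```python
from typing import Dict, Optional


def identified_optimal_policy(value_estimates: Dict[str, float],
                              error_bound: float) -> Optional[str]:
    """Return the best policy if its interval clears all others, else None.

    Assumes every estimate is within `error_bound` of the true value
    (with high probability); both names are illustrative.
    """
    best = max(value_estimates, key=value_estimates.get)
    best_lower = value_estimates[best] - error_bound
    for name, value in value_estimates.items():
        # If another policy's upper bound reaches the leader's lower bound,
        # the ordering is not yet settled and training should continue.
        if name != best and value + error_bound >= best_lower:
            return None
    return best
```

Under these assumptions the check succeeds exactly when the estimated gap between the best and next-best policy exceeds twice the error bound, which is why a larger separation gap needs less training data.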

optimal policy, probability, state-action pair, (13 more...)

AAAI Conferences

Twentieth International Conference on Automated Planning and Scheduling

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Industry: Health & Medicine (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Influence-Based Policy Abstraction for Weakly-Coupled Dec-POMDPs

Witwicki, Stefan John (University of Michigan) | Durfee, Edmund Howell (University of Michigan)

AAAI Conferences, May-1-2010

Decentralized POMDPs are powerful theoretical models for coordinating agents’ decisions in uncertain environments, but the generally-intractable complexity of optimal joint policy construction presents a significant obstacle in applying Dec-POMDPs to problems where many agents face many policy choices. Here, we argue that when most agent choices are independent of other agents’ choices, much of this complexity can be avoided: instead of coordinating full policies, agents need only coordinate policy abstractions that explicitly convey the essential interaction influences. To this end, we develop a novel framework for influence-based policy abstraction for weakly-coupled transition-dependent Dec-POMDP problems that subsumes several existing approaches. In addition to formally characterizing the space of transition-dependent influences, we provide a method for computing optimal and approximately-optimal joint policies. We present an initial empirical analysis, over problems with commonly-studied flavors of transition-dependent influences, that demonstrates the potential computational benefits of influence-based abstraction over state-of-the-art optimal policy search methods.

agent, interaction, probability, (17 more...)

AAAI Conferences

Twentieth International Conference on Automated Planning and Scheduling

Country: North America > United States > Michigan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Planning for Concurrent Action Executions Under Action Duration Uncertainty Using Dynamically Generated Bayesian Networks

Beaudry, Eric (Universite de Sherbrooke) | Kabanza, Froduald (Universite de Sherbrooke) | Michaud, Francois (Universite de Sherbrooke)

AAAI Conferences, May-1-2010

An interesting class of planning domains, including planning for daily activities of Mars rovers, involves achievement of goals with time constraints and concurrent actions with probabilistic durations. Current probabilistic approaches, which rely on a discrete time model, introduce a blow-up in the search state space when the two factors of action concurrency and action duration uncertainty are combined. Simulation-based and sampling probabilistic planning approaches would cope with this state explosion by avoiding storing all the explored states in memory, but they remain approximate solution approaches. In this paper, we present an alternative approach relying on a continuous time model which avoids the state explosion caused by time stamping in the presence of action concurrency and action duration uncertainty. Time is represented as a continuous random variable. The dependency between state time variables is conveyed by a Bayesian network, which is dynamically generated by a state-based forward-chaining search based on the action descriptions. A generated plan is characterized by a probability of satisfying a goal. This probability is evaluated by querying the Bayesian network.

bayesian network, random variable, time random variable, (16 more...)

AAAI Conferences

Twentieth International Conference on Automated Planning and Scheduling

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada > Quebec > Estrie Region > Sherbrooke (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.84)

Joint Structured Models for Extraction from Overlapping Sources

Gupta, Rahul, Sarawagi, Sunita

arXiv.org Artificial Intelligence, May-1-2010

We consider the problem of jointly training structured models for extraction from sources whose instances partially overlap. This has important applications such as user-driven ad-hoc information extraction on the web. Such applications present new challenges in terms of the number of sources and their arbitrary pattern of overlap, not seen by earlier collective training schemes applied to two sources. We present an agreement-based learning framework and alternatives within it to trade off tractability, robustness to noise, and extent of agreement. We provide a principled scheme to discover low-noise agreement sets in unlabeled data across the sources. Through extensive experiments over 58 real datasets, we establish that our method of additively rewarding agreement over maximal segments of text provides the best trade-offs, and also scores over alternatives such as collective inference, staged training, and multi-view learning.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/1935826.1935868

1005.0104

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Quantum learning: optimal classification of qubit states

Guta, Madalin, Kotlowski, Wojciech

arXiv.org Machine Learning, Apr-14-2010

Pattern recognition is a central topic in Learning Theory with numerous applications such as voice and text recognition, image analysis, and computer diagnosis. The statistical set-up in classification is the following: we are given an i.i.d. training set $(X_{1},Y_{1}),\dots,(X_{n},Y_{n})$ where $X_{i}$ represents a feature and $Y_{i}\in \{0,1\}$ is a label attached to that feature. The underlying joint distribution of $(X,Y)$ is unknown, but we can learn about it from the training set, and we aim at devising low-error classifiers $f:X\to Y$ used to predict the label of new incoming features. Here we solve a quantum analogue of this problem, namely the classification of two arbitrary unknown qubit states. Given a number of 'training' copies from each of the states, we would like to 'learn' about them by performing a measurement on the training set. The outcome is then used to design measurements for the classification of future systems with unknown labels. We find the asymptotically optimal classification strategy and show that, typically, it performs strictly better than a plug-in strategy based on state estimation. The figure of merit is the excess risk, which is the difference between the probability of error and the probability of error of the optimal measurement when the states are known, that is, the Helstrom measurement. We show that the excess risk has rate $n^{-1}$ and compute the exact constant of the rate.
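
As a reference point for the excess risk mentioned in this abstract, the following Python sketch computes the error probability of the Helstrom measurement when both qubit states are known. It assumes equal prior probabilities for the two states (the abstract does not state the priors) and represents each state by its Bloch vector; the function name is hypothetical. It does not reproduce the paper's learning strategy or its $n^{-1}$ rate.

```python
import math
from typing import Sequence


def helstrom_error(r0: Sequence[float], r1: Sequence[float]) -> float:
    """Minimum error probability for two known qubit states, equal priors.

    States are given as Bloch vectors. For qubits, the trace norm of
    rho0 - rho1 equals the Euclidean distance between the Bloch vectors,
    so P_err = 1/2 * (1 - 1/2 * ||rho0 - rho1||_1).
    """
    distance = math.dist(r0, r1)  # equals ||rho0 - rho1||_1 for qubits
    return 0.5 * (1.0 - 0.5 * distance)
```

Orthogonal pure states (antipodal Bloch vectors) give zero error, and identical states give an error of one half, which is no better than guessing.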

classification, machine learning, pattern recognition, (18 more...)

arXiv.org Machine Learning

doi: 10.1088/1367-2630/12/12/123032

1004.2468

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.54)
