AITopics | Learning Graphical Models

Collaborating Authors

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Optimally Solving Dec-POMDPs as Continuous-State MDPs

Dibangoye, Jilles Steeve (INRIA) | Amato, Christopher (Massachusetts Institute of Technology) | Buffet, Olivier (INRIA) | Charpillet, Francois (INRIA)

AAAI ConferencesAug-3-2013

Optimally solving decentralized partially observable Markov decision processes (Dec-POMDPs) is a hard combinatorial problem. Current algorithms search through the space of full histories for each agent. Because of the doubly exponential growth in the number of policies in this space as the planning horizon increases, these methods quickly become intractable. However, in real world problems, computing policies over the full history space is often unnecessary. True histories experienced by the agents often lie near a structured, low-dimensional manifold embedded into the history space. We show that by transforming a Dec-POMDP into a continuous-state MDP, we are able to find and exploit these low-dimensional representations. Using this novel transformation, we can then apply powerful techniques for solving POMDPs and continuous-state MDPs. By combining a general search algorithm and dimension reduction based on feature selection, we introduce a novel approach to optimally solve problems with significantly longer planning horizons than previous methods.

continuous-state mdp, dec-pomdp

AAAI Conferences

Twenty-Third International Joint Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Conditional Restricted Boltzmann Machines for Negotiations in Highly Competitive and Complex Domains

Chen, Siqi (Maastricht University) | Ammar, Haitham Bou (Maastricht University) | Tuyls, Karl (Maastricht University) | Weiss, Gerhard (Maastricht University)

AAAI ConferencesAug-3-2013

Learning in automated negotiations, while useful, is hard because of the indirect way the target function can be observed and the limited amount of experience available to learn from. This paper proposes two novel opponent modeling techniques based on deep learning methods. Moreover, to improve the learning efficacy of negotiating agents, the second approach is also capable of transferring knowledge efficiently between negotiation tasks. Transfer is conducted by automatically mapping the source knowledge to the target in a rich feature space. Experiments show that using these techniques the proposed strategies outperform existing state-of-the-art agents in highly competitive and complex negotiation domains. Furthermore, the empirical game theoretic analysis reveals the robustness of the proposed strategies.

competitive and complex domain, conditional restricted boltzmann machine, negotiation

AAAI Conferences

Twenty-Third International Joint Conference on Artificial Intelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)

Add feedback

An efficient model-free estimation of multiclass conditional probability

Xu, Tu, Wang, Junhui

arXiv.org Machine LearningAug-1-2013

Conventional multiclass conditional probability estimation methods, such as Fisher's discriminate analysis and logistic regression, often require restrictive distributional model assumption. In this paper, a model-free estimation method is proposed to estimate multiclass conditional probability through a series of conditional quantile regression functions. Specifically, the conditional class probability is formulated as difference of corresponding cumulative distribution functions, where the cumulative distribution functions can be converted from the estimated conditional quantile regression functions. The proposed estimation method is also efficient as its computation cost does not increase exponentially with the number of classes. The theoretical and numerical studies demonstrate that the proposed estimation method is highly competitive against the existing competitors, especially when the number of classes is relatively large.

artificial intelligence, estimation method, machine learning, (17 more...)

arXiv.org Machine Learning

1209.4951

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.75)

Add feedback

Scoring and Searching over Bayesian Networks with Causal and Associative Priors

Borboudakis, Giorgos, Tsamardinos, Ioannis

arXiv.org Artificial IntelligenceJul-31-2013

A significant theoretical advantage of search-and-score methods for learning Bayesian Networks is that they can accept informative prior beliefs for each possible network, thus complementing the data. In this paper, a method is presented for assigning priors based on beliefs on the presence or absence of certain paths in the true network. Such beliefs correspond to knowledge about the possible causal and associative relations between pairs of variables. This type of knowledge naturally arises from prior experimental and observational data, among others. In addition, a novel search-operator is proposed to take advantage of such prior knowledge. Experiments show that, using path beliefs improves the learning of the skeleton, as well as the edge directions in the network.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1209.6561

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Likelihood-ratio calibration using prior-weighted proper scoring rules

Brümmer, Niko, Doddington, George

arXiv.org Machine LearningJul-30-2013

Prior-weighted logistic regression has become a standard tool for calibration in speaker recognition. Logistic regression is the optimization of the expected value of the logarithmic scoring rule. We generalize this via a parametric family of proper scoring rules. Our theoretical analysis shows how different members of this family induce different relative weightings over a spectrum of applications of which the decision thresholds range from low to high. Special attention is given to the interaction between prior weighting and proper scoring rule parameters. Experiments on NIST SRE'12 suggest that for applications with low false-alarm rate requirements, scoring rules tailored to emphasize higher score thresholds may give better accuracy than logistic regression.

application, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1307.7981

Country:

North America > United States (0.28)
North America > Canada (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Add feedback

POMDPs Make Better Hackers: Accounting for Uncertainty in Penetration Testing

Sarraute, Carlos, Buffet, Olivier, Hoffmann, Joerg

arXiv.org Artificial IntelligenceJul-30-2013

Penetration Testing is a methodology for assessing network security, by generating and executing possible hacking attacks. Doing so automatically allows for regular and systematic testing. A key question is how to generate the attacks. This is naturally formulated as planning under uncertainty, i.e., under incomplete knowledge about the network configuration. Previous work uses classical planning, and requires costly pre-processes reducing this uncertainty by extensive application of scanning methods. By contrast, we herein model the attack planning problem in terms of partially observable Markov decision processes (POMDP). This allows to reason about the knowledge available, and to intelligently employ scanning actions as part of the attack. As one would expect, this accurate solution does not scale. We devise a method that relies on POMDPs to find good attacks on individual machines, which are then composed into an attack on the network as a whole. This decomposition exploits network structure to the extent possible, making targeted approximations (only) where needed. Evaluating this method on a suitably adapted industrial test suite, we demonstrate its effectiveness in both runtime and solution quality.

artificial intelligence, firewall, machine learning, (19 more...)

arXiv.org Artificial Intelligence

1307.8182

Country: Europe > Germany (0.28)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Infinite Mixtures of Multivariate Gaussian Processes

Sun, Shiliang

arXiv.org Machine LearningJul-26-2013

This paper presents a new model called infinite mixtures of multivariate Gaussian processes, which can be used to learn vector-valued functions and applied to multitask learning. As an extension of the single multivariate Gaussian process, the mixture model has the advantages of modeling multimodal data and alleviating the computationally cubic complexity of the multivariate Gaussian process. A Dirichlet process prior is adopted to allow the (possibly infinite) number of mixture components to be automatically inferred from training data, and Markov chain Monte Carlo sampling techniques are used for parameter and latent variable inference. Preliminary experimental results on multivariate regression show the feasibility of the proposed model.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

arXiv.org Machine Learning

1307.7028

Country:

North America > Canada > Ontario > Toronto (0.15)
Asia > China (0.14)
North America > United States (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

Scaling the Indian Buffet Process via Submodular Maximization

Reed, Colorado, Ghahramani, Zoubin

arXiv.org Machine LearningJul-24-2013

Inference for latent feature models is inherently difficult as the inference space grows exponentially with the size of the input data and number of latent features. In this work, we use Kurihara & Welling (2008)'s maximization-expectation framework to perform approximate MAP inference for linear-Gaussian latent feature models with an Indian Buffet Process (IBP) prior. This formulation yields a submodular function of the features that corresponds to a lower bound on the model evidence. By adding a constant to this function, we obtain a nonnegative submodular function that can be maximized via a greedy algorithm that obtains at least a one-third approximation to the optimal solution. Our inference method scales linearly with the size of the input data, and we show the efficacy of our method on the largest datasets currently analyzed using an IBP model.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1304.3285

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

Add feedback

Topic Segmentation and Labeling in Asynchronous Conversations

Joty, S., Carenini, G., Ng, R. T.

Journal of Artificial Intelligence ResearchJul-22-2013

Topic segmentation and labeling is often considered a prerequisite for higher-level conversation analysis and has been shown to be useful in many Natural Language Processing (NLP) applications. We present two new corpora of email and blog conversations annotated with topics, and evaluate annotator reliability for the segmentation and labeling tasks in these asynchronous conversations. We propose a complete computational framework for topic segmentation and labeling in asynchronous conversations. Our approach extends state-of-the-art methods by considering a fine-grained structure of an asynchronous conversation, along with other conversational features by applying recent graph-based methods for NLP. For topic segmentation, we propose two novel unsupervised models that exploit the fine-grained conversational structure, and a novel graph-theoretic supervised model that combines lexical, conversational and topic features. For topic labeling, we propose two novel (unsupervised) random walk models that respectively capture conversation specific clues from two different sources: the leading sentences and the fine-grained conversational structure. Empirical evaluation shows that the segmentation and the labeling performed by our best models beat the state-of-the-art, and are highly correlated with human annotations.

annotation, asynchronous conversation, segmentation, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3940

AI Access Foundation

10826

Journal of Artificial Intelligence Research

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
(29 more...)

Genre:

Research Report > New Finding (1.00)
Workflow (0.92)
Research Report > Promising Solution (0.87)
Research Report > Experimental Study (0.68)

Industry:

Leisure & Entertainment > Games (0.46)
Media > News (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.67)
(2 more...)

Add feedback

Bayesian inference for logistic models using Polya-Gamma latent variables

Polson, Nicholas G., Scott, James G., Windle, Jesse

arXiv.org Machine LearningJul-22-2013

We propose a new data-augmentation strategy for fully Bayesian inference in models with binomial likelihoods. The approach appeals to a new class of Polya-Gamma distributions, which are constructed in detail. A variety of examples are presented to show the versatility of the method, including logistic regression, negative binomial regression, nonlinear mixed-effects models, and spatial models for count data. In each case, our data-augmentation strategy leads to simple, effective methods for posterior inference that: (1) circumvent the need for analytic approximations, numerical integration, or Metropolis-Hastings; and (2) outperform other known data-augmentation strategies, both in ease of use and in computational efficiency. All methods, including an efficient sampler for the Polya-Gamma distribution, are implemented in the R package BayesLogit. In the technical supplement appended to the end of the paper, we provide further details regarding the generation of Polya-Gamma random variables; the empirical benchmarks reported in the main manuscript; and the extension of the basic data-augmentation framework to contingency tables and multinomial outcomes.

artificial intelligence, machine learning, sampler, (16 more...)

arXiv.org Machine Learning

1205.031

Country: North America > United States (0.67)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Add feedback