AITopics | Learning Graphical Models

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

Approximate Planning for Factored POMDPs

Feng, Zhengzhu (University of Massachusetts, Amherst) | Hansen, Eric A. (Mississippi State University)

AAAI Conferences, Jun-9-2014

We describe an approximate dynamic programming algorithm for partially observable Markov decision processes represented in factored form. Two complementary forms of approximation are used to simplify a piecewise linear and convex value function, where each linear facet of the function is represented compactly by an algebraic decision diagram. In one form of approximation, the degree of state abstraction is increased by aggregating states with similar values. In the second form of approximation, the value function is simplified by removing linear facets that contribute marginally to value. We derive an error bound that applies to both forms of approximation. Experimental results show that this approach improves the performance of dynamic programming and extends the range of problems it can solve.
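
The second form of approximation can be illustrated with a small sketch. It assumes the value function is stored as a plain list of linear facets (alpha vectors) rather than as algebraic decision diagrams, and it checks the loss only on a finite set of sampled beliefs; the names `value`, `prune_marginal`, and `epsilon` are hypothetical and this is not the authors' procedure.

```python
def value(belief, alpha_vectors):
    """Piecewise linear and convex value: the max over facets of a dot product."""
    return max(sum(b * a for b, a in zip(belief, alpha)) for alpha in alpha_vectors)


def prune_marginal(alpha_vectors, beliefs, epsilon):
    """Greedily drop facets whose removal costs at most epsilon.

    The loss is always measured against the ORIGINAL value function, so the
    total error on the sampled beliefs never exceeds epsilon.
    """
    kept = list(alpha_vectors)
    for alpha in alpha_vectors:
        if len(kept) == 1:
            break  # always keep at least one facet
        candidate = [a for a in kept if a is not alpha]
        loss = max(value(b, alpha_vectors) - value(b, candidate) for b in beliefs)
        if loss <= epsilon:
            kept = candidate
    return kept


# Two-state example: the third facet only helps slightly near the middle belief.
facets = [[1.0, 0.0], [0.0, 1.0], [0.52, 0.52]]
sample_beliefs = [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]
simplified = prune_marginal(facets, sample_beliefs, epsilon=0.05)
```

In this example the flat facet improves the value by only 0.02 at the belief (0.5, 0.5), so it is removed when `epsilon` is 0.05, while the two corner facets are kept.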

approximate planning, factored pomdp

Sixth European Conference on Planning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Learning directed acyclic graphs via bootstrap aggregating

Wang, Ru, Peng, Jie

arXiv.org Machine Learning, Jun-9-2014

Probabilistic graphical models are graphical representations of probability distributions. Graphical models have applications in many fields, including biology, the social sciences, linguistics, and neuroscience. In this paper, we propose learning directed acyclic graphs (DAGs) via bootstrap aggregating. The proposed procedure is named DAGBag. Specifically, an ensemble of DAGs is first learned from bootstrap resamples of the data, and then an aggregated DAG is derived by minimizing the overall distance to the entire ensemble. A family of metrics based on the structural Hamming distance is defined for the space of DAGs (of a given node set) and is used for aggregation. In the high-dimensional, low-sample-size setting, the graph learned on one data set often has an excessive number of false positive edges due to over-fitting of the noise. Aggregation overcomes over-fitting through variance reduction and thus greatly reduces false positives. We also develop an efficient implementation of the hill climbing search algorithm for DAG learning, which makes the proposed method computationally competitive in the high-dimensional regime. The DAGBag procedure is implemented in the R package dagbag.
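
The two ingredients named in the abstract, a structural Hamming distance and an aggregation step, can be sketched as follows. This is an illustration under assumptions, not the dagbag implementation: DAGs are plain sets of `(parent, child)` tuples, `reversal_cost` is a hypothetical parameter standing in for the family of metrics, and the aggregation is a simple greedy stand-in (keep frequent edges that do not close a cycle) rather than the paper's search.

```python
from collections import Counter


def shd(dag_a, dag_b, reversal_cost=1):
    """Structural Hamming distance between two edge sets on the same nodes."""
    only_a = dag_a - dag_b
    only_b = dag_b - dag_a
    # A reversed edge shows up once in each difference set.
    reversed_pairs = {(u, v) for (u, v) in only_a if (v, u) in only_b}
    additions_deletions = len(only_a) + len(only_b) - 2 * len(reversed_pairs)
    return additions_deletions + reversal_cost * len(reversed_pairs)


def creates_cycle(edges, new_edge):
    """True if adding new_edge = (u, v) closes a cycle, i.e. v already reaches u."""
    u, v = new_edge
    stack, seen = [v], set()
    while stack:
        node = stack.pop()
        if node == u:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(child for (parent, child) in edges if parent == node)
    return False


def aggregate(ensemble, threshold=0.5):
    """Greedy stand-in for aggregation: add edges by decreasing bootstrap frequency."""
    counts = Counter(edge for dag in ensemble for edge in dag)
    aggregated = set()
    for edge, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count / len(ensemble) > threshold and not creates_cycle(aggregated, edge):
            aggregated.add(edge)
    return aggregated


# Three hypothetical bootstrap DAGs over nodes A, B, C.
ensemble = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("B", "C"), ("A", "C")},
    {("A", "B"), ("C", "B")},
]
consensus = aggregate(ensemble)
```

Edges that appear in only one resample, such as A to C here, are dropped, which is the variance-reduction effect the abstract describes.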

artificial intelligence, gshd, machine learning, (17 more...)

1406.2098

Country: North America > United States > California (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)
(2 more...)

ExpertBayes: Automatically refining manually built Bayesian networks

Almeida, Ezilda, Ferreira, Pedro, Vinhoza, Tiago, Dutra, Inês, Li, Jingwei, Wu, Yirong, Burnside, Elizabeth

arXiv.org Machine Learning, Jun-9-2014

Bayesian network structures are usually built using only the data, starting from an empty network or from a naive Bayes structure. In some domains, such as medicine, prior knowledge of the structure is often already available. This structure can be automatically or manually refined in search of better-performing models. In this work, we take Bayesian networks built by specialists and show that minor perturbations to the original network can yield better classifiers at a very small computational cost, while maintaining most of the intended meaning of the original model.
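
To make "minor perturbations" concrete, the sketch below enumerates the networks that differ from an expert-built one by a single edge. It assumes the perturbations are edge additions, removals, and reversals that keep the graph acyclic, which the abstract does not spell out; `is_acyclic` and `single_edge_perturbations` are hypothetical names, and scoring each candidate as a classifier is left out.

```python
from itertools import permutations


def is_acyclic(nodes, edges):
    """Kahn's algorithm: a DAG is a graph whose nodes can all be peeled off in topological order."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return removed == len(nodes)


def single_edge_perturbations(nodes, edges):
    """List (operation, edge, new_edge_set) for every acyclic one-edge change."""
    edges = set(edges)
    neighbours = []
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            neighbours.append(("remove", (u, v), edges - {(u, v)}))
            flipped = (edges - {(u, v)}) | {(v, u)}
            if is_acyclic(nodes, flipped):
                neighbours.append(("reverse", (u, v), flipped))
        elif (v, u) not in edges:
            added = edges | {(u, v)}
            if is_acyclic(nodes, added):
                neighbours.append(("add", (u, v), added))
    return neighbours


# Hypothetical expert network: A -> B -> C.
candidates = single_edge_perturbations(["A", "B", "C"], {("A", "B"), ("B", "C")})
```

Each candidate stays within one edge of the expert's model, which is one way to keep most of its intended meaning while searching for a better classifier.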

artificial intelligence, expertbaye, machine learning, (18 more...)

1406.2395

Country: North America > United States > Virginia (0.28)

Genre: Research Report > Experimental Study (0.48)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Compressed Gaussian Process

Guhaniyogi, Rajarshi, Dunson, David B.

arXiv.org Machine Learning, Jun-7-2014

Nonparametric regression for massive numbers of samples (n) and features (p) is an increasingly important problem. In big n settings, a common strategy is to partition the feature space, and then separately apply simple models to each partition set. We propose an alternative approach, which avoids such partitioning and the associated sensitivity to neighborhood choice and distance metrics, by using random compression combined with Gaussian process regression. The proposed approach is particularly motivated by the setting in which the response is conditionally independent of the features given the projection to a low dimensional manifold. Conditionally on the random compression matrix and a smoothness parameter, the posterior distribution for the regression surface and posterior predictive distributions are available analytically. Running the analysis in parallel for many random compression matrices and smoothness parameters, model averaging is used to combine the results. The algorithm can be implemented rapidly even in very big n and p problems, has strong theoretical justification, and is found to yield state of the art predictive performance.
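
The statement that the posterior is available analytically given the compression matrix can be illustrated with the standard Gaussian process regression formulas applied to compressed features. The notation is an assumption made for this sketch: $\Phi$ is an $m \times p$ random compression matrix, $\kappa_\lambda$ is a covariance kernel with smoothness parameter $\lambda$, and the noise variance $\sigma^2$ is treated as known, which may differ from the paper's exact prior specification.

```latex
\begin{align}
y_i &= f(\Phi x_i) + \varepsilon_i, \qquad f \sim \mathrm{GP}(0, \kappa_\lambda), \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \\
K_{ij} &= \kappa_\lambda(\Phi x_i, \Phi x_j), \qquad
k_* = \big(\kappa_\lambda(\Phi x_*, \Phi x_1), \dots, \kappa_\lambda(\Phi x_*, \Phi x_n)\big)^\top, \\
\mathbb{E}\big[f(\Phi x_*) \mid y, \Phi, \lambda\big] &= k_*^\top (K + \sigma^2 I_n)^{-1} y, \\
\mathrm{Var}\big[f(\Phi x_*) \mid y, \Phi, \lambda\big] &= \kappa_\lambda(\Phi x_*, \Phi x_*) - k_*^\top (K + \sigma^2 I_n)^{-1} k_*.
\end{align}
```

Because these expressions condition on one draw of $\Phi$ and one value of $\lambda$, they can be evaluated independently for many draws, which is what makes the parallel model averaging described in the abstract possible.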

data mining, machine learning, predictive interval, (22 more...)

1406.1916

Genre: Research Report (0.82)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(3 more...)

Advances in Learning Bayesian Networks of Bounded Treewidth

Nie, Siqi, Maua, Denis Deratani, de Campos, Cassio Polpo, Ji, Qiang

arXiv.org Machine Learning, Jun-6-2014

This work presents novel algorithms for learning Bayesian network structures with bounded treewidth. Both exact and approximate methods are developed. The exact method combines mixed-integer linear programming formulations for structure learning and treewidth computation. The approximate method consists in uniformly sampling $k$-trees (maximal graphs of treewidth $k$), and subsequently selecting, exactly or approximately, the best structure whose moral graph is a subgraph of that $k$-tree. Some properties of these methods are discussed and proven. The approaches are empirically compared to each other and to a state-of-the-art method for learning bounded treewidth structures on a collection of public data sets with up to 100 variables. The experiments show that our exact algorithm outperforms the state of the art, and that the approximate approach is fairly accurate.
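
The constraint used by the approximate method, that a structure's moral graph must be a subgraph of a $k$-tree, can be checked with a few lines of code. The sketch below is illustrative only: `random_k_tree` grows a $k$-tree by repeated attachment and does NOT sample uniformly as the paper does, and all function names are hypothetical.

```python
import random
from itertools import combinations


def random_k_tree(n, k, rng):
    """Grow a k-tree on nodes 0..n-1 (not a uniform sample over k-trees).

    Start from a (k+1)-clique, then connect each new node to every member
    of some existing k-clique.
    """
    assert n >= k + 1
    edges = {frozenset(pair) for pair in combinations(range(k + 1), 2)}
    k_cliques = [frozenset(c) for c in combinations(range(k + 1), k)]
    for node in range(k + 1, n):
        base = rng.choice(k_cliques)
        edges |= {frozenset((node, other)) for other in base}
        # New k-cliques: the new node plus any k-1 members of the base clique.
        k_cliques += [frozenset(sub) | {node} for sub in combinations(base, k - 1)]
    return edges


def moral_graph(parents):
    """Undirected moral graph of a DAG given as {child: set_of_parents}."""
    edges = set()
    for child, pa in parents.items():
        edges |= {frozenset((child, p)) for p in pa}          # drop edge directions
        edges |= {frozenset(pair) for pair in combinations(pa, 2)}  # marry co-parents
    return edges


def fits_k_tree(parents, k_tree_edges):
    """True if the DAG's moral graph is a subgraph of the given k-tree."""
    return moral_graph(parents) <= k_tree_edges


tree = random_k_tree(6, 2, random.Random(0))
v_structure = {0: set(), 1: set(), 2: {0, 1}}  # 0 -> 2 <- 1
ok = fits_k_tree(v_structure, tree)
```

Any structure that passes this check has treewidth at most $k$, since its moral graph is contained in a graph of treewidth $k$.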

artificial intelligence, machine learning, treewidth, (16 more...)

1406.1411

Country: Europe (0.46)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Neural Variational Inference and Learning in Belief Networks

Mnih, Andriy, Gregor, Karol

arXiv.org Machine Learning, Jun-4-2014

Highly expressive directed latent variable models, such as sigmoid belief networks, are difficult to train on large datasets because exact inference in them is intractable and none of the approximate inference methods that have been applied to them scale well. We propose a fast non-iterative approximate inference method that uses a feedforward network to implement efficient exact sampling from the variational posterior. The model and this inference network are trained jointly by maximizing a variational lower bound on the log-likelihood. Although the naive estimator of the inference network gradient is too high-variance to be useful, we make it practical by applying several straightforward model-independent variance reduction techniques. Applying our approach to training sigmoid belief networks and deep autoregressive networks, we show that it outperforms the wake-sleep algorithm on MNIST and achieves state-of-the-art results on the Reuters RCV1 document dataset.
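
The objective and the idea behind the variance reduction can be written compactly. The notation is assumed for this sketch: $x$ is an observation, $h$ the latent variables, $p_\theta$ the model, $q_\phi$ the inference network, and $c$ a baseline that does not depend on $h$; a baseline is only one generic example of the kind of technique the abstract mentions.

```latex
\begin{align}
\log p_\theta(x) &\ge \mathcal{L}(x; \theta, \phi)
  = \mathbb{E}_{q_\phi(h \mid x)}\big[\log p_\theta(x, h) - \log q_\phi(h \mid x)\big], \\
\nabla_\phi \mathcal{L}(x; \theta, \phi)
  &= \mathbb{E}_{q_\phi(h \mid x)}\Big[\big(\log p_\theta(x, h) - \log q_\phi(h \mid x) - c\big)\,
     \nabla_\phi \log q_\phi(h \mid x)\Big].
\end{align}
```

The second identity holds for any such $c$ because $\mathbb{E}_{q_\phi(h \mid x)}[\nabla_\phi \log q_\phi(h \mid x)] = 0$, so subtracting a baseline leaves the gradient unbiased while it can reduce the variance of a sampled estimate.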

artificial intelligence, deep learning, machine learning, (13 more...)

1402.003

Country: Asia (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Learning Latent Block Structure in Weighted Networks

Aicher, Christopher, Jacobs, Abigail Z., Clauset, Aaron

arXiv.org Machine Learning, Jun-3-2014

Networks are an increasingly important form of structured data consisting of interactions between pairs of individuals in large social and biological data sets. Unlike attribute data where each observation is associated with an individual, network data is represented by graphs, where individuals are vertices and interactions are edges. Because vertices are pairwise related, network data violates traditional assumptions of attribute data, such as independence. This intrinsic difference in structure prompts the development of new tools for handling network data. In social and biological networks, vertices often play distinct structural roles in generating the network's large-scale structure. To identify such latent structural roles, we aim to identify a network partition that groups together vertices with similar group-level connectivity patterns. We call these groups "communities," and their inference produces a compact description of the large-scale [...]

[Figure 1: Examples of structure that can be learned using the SBM: (a) assortative, (b) disassortative, (c) core-periphery, (d) ordered. The first row shows the abstract connections between four groups (blue, red, green, and purple). The second row shows the 'block' structure found in the adjacency matrix after sorting by group membership; black corresponds to edges and white corresponds to non-edges.]
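
The 'block' structure described above can be made visible by summarizing an adjacency matrix per pair of groups. The sketch below is a minimal illustration for an unweighted network with a known partition; it does not infer the partition and does not handle edge weights, and `block_densities` is a hypothetical name.

```python
def block_densities(adjacency, groups):
    """Edge density for every ordered pair of groups.

    adjacency[i][j] is truthy when there is an edge from i to j;
    groups[i] is the group label of vertex i. Self-pairs (i == j) are ignored.
    """
    labels = sorted(set(groups))
    edges = {(r, s): 0 for r in labels for s in labels}
    pairs = {(r, s): 0 for r in labels for s in labels}
    n = len(groups)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            key = (groups[i], groups[j])
            pairs[key] += 1
            if adjacency[i][j]:
                edges[key] += 1
    return {key: (edges[key] / pairs[key] if pairs[key] else 0.0) for key in edges}


# Assortative toy network: edges only inside each of two groups.
assortative = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
densities = block_densities(assortative, [0, 0, 1, 1])
```

For this toy network the within-group densities are 1.0 and the between-group densities are 0.0; a disassortative network would show the opposite pattern.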

artificial intelligence, information, machine learning, (18 more...)

doi: 10.1093/comnet/cnu026

1404.0431

Country: North America > United States > Colorado > Boulder County > Boulder (0.14)

Genre: Research Report (0.81)

Industry:

Information Technology (0.74)
Government > Regional Government > North America Government > United States Government (0.67)
Leisure & Entertainment > Sports > Football (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Data Science (0.93)
(2 more...)

Topological and Statistical Behavior Classifiers for Tracking Applications

Bendich, Paul, Chin, Sang, Clarke, Jesse, deSena, Jonathan, Harer, John, Munch, Elizabeth, Newman, Andrew, Porter, David, Rouse, David, Strawn, Nate, Watkins, Adam

arXiv.org Machine Learning, Jun-1-2014

We introduce the first unified theory for target tracking using Multiple Hypothesis Tracking, Topological Data Analysis, and machine learning. Our innovations are that 1) robust topological features are used to encode behavioral information, 2) statistical models are fitted to distributions over these topological features, and 3) the target type classification methods of Wigren and Bar-Shalom et al. are employed to exploit the resulting likelihoods for topological features inside the tracking procedure. To demonstrate the efficacy of our approach, we test our procedure on synthetic vehicular data generated by the Simulation of Urban Mobility package.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1406.0214

Country: North America > United States (0.67)

Genre: Research Report (0.51)

Industry: Government > Military (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Inference of Sparse Networks with Unobserved Variables. Application to Gene Regulatory Networks

Slavov, Nikolai

arXiv.org Machine Learning, Jun-1-2014

Networks are a unifying framework for modeling complex systems, and network inference problems are frequently encountered in many fields. Here, I develop and apply a generative approach to network inference (RCweb) for the case when the network is sparse and the latent (not observed) variables affect the observed ones. From all possible factor analysis (FA) decompositions explaining the variance in the data, RCweb selects the FA decomposition that is consistent with a sparse underlying network. The sparsity constraint is imposed by a novel method that significantly outperforms (in terms of accuracy, robustness to noise, complexity scaling, and computational efficiency) Bayesian methods and MLE methods using l1 norm relaxation such as K-SVD and l1-based sparse principal component analysis (PCA). Results from simulated models demonstrate that RCweb recovers the model structures exactly for sparsity as low (i.e., as non-sparse) as 50% and with a ratio of unobserved to observed variables as high as 2. RCweb is robust to noise, with a gradual decrease in the parameter ranges as the noise level increases.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1406.0193

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.70)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

An Efficient Algorithm for Estimating State Sequences in Imprecise Hidden Markov Models

De Bock, J., de Cooman, G.

Journal of Artificial Intelligence Research, May-31-2014

We present an efficient exact algorithm for estimating state sequences from outputs or observations in imprecise hidden Markov models (iHMMs). The uncertainty linking one state to the next, and that linking a state to its output, is represented by a set of probability mass functions instead of a single such mass function. We consider as best estimates for state sequences the maximal sequences for the posterior joint state model conditioned on the observed output sequence, associated with a gain function that is the indicator of the state sequence. This corresponds to and generalises finding the state sequence with the highest posterior probability in (precise-probabilistic) HMMs, thereby making our algorithm a generalisation of the one by Viterbi. We argue that the computational complexity of our algorithm is at worst quadratic in the length of the iHMM, cubic in the number of states, and essentially linear in the number of maximal state sequences. An important feature of our imprecise approach is that there may be more than one maximal sequence, typically in those instances where its precise-probabilistic counterpart is sensitive to the choice of prior. For binary iHMMs, we investigate experimentally how the number of maximal state sequences depends on the model parameters. We also present an application in optical character recognition, demonstrating that our algorithm can be usefully applied to robustify the inferences made by its precise-probabilistic counterpart.
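
For reference, the precise-probabilistic special case that the paper generalises is the classical Viterbi algorithm. The sketch below implements only that standard algorithm for an ordinary HMM with single probability mass functions; it is not the iHMM algorithm, and the toy model parameters are illustrative values chosen for this example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable state sequence of a precise HMM, with its joint probability."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, prob = max(
                ((r, best[t - 1][r] * trans_p[r][s]) for r in states),
                key=lambda item: item[1],
            )
            best[t][s] = prob * emit_p[s][observations[t]]
            back[t][s] = prev
    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path)), best[-1][last]


states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}
path, probability = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
```

A precise HMM yields a single best sequence (up to ties), whereas the imprecise setting described in the abstract can return several maximal sequences.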

algorithm, equation, sequence, (14 more...)

doi: 10.1613/jair.4385

AI Access Foundation

10883

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
Europe > Czechia > Prague (0.04)
Europe > Belgium > Flanders (0.04)

Genre: Workflow (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
