AITopics | Directed Networks

Collaborating Authors

Directed Networks

News Overviews Instructional Materials AI-Alerts Classics

Variational MCMC

de Freitas, Nando, Hojen-Sorensen, Pedro, Jordan, Michael I., Russell, Stuart

arXiv.org Machine LearningJan-10-2013

We propose a new class of learning algorithms that combines variational approximation and Markov chain Monte Carlo (MCMC) simulation. Naive algorithms that use the variational approximation as proposal distribution can perform poorly because this approximation tends to underestimate the true variance and other features of the data. We solve this problem by introducing more sophisticated MCMC algorithms. One of these algorithms is a mixture of two MCMC kernels: a random walk Metropolis kernel and a blockMetropolis-Hastings (MH) kernel with a variational approximation as proposaldistribution. The MH kernel allows one to locate regions of high probability efficiently. The Metropolis kernel allows us to explore the vicinity of these regions. This algorithm outperforms variationalapproximations because it yields slightly better estimates of the mean and considerably better estimates of higher moments, such as covariances. It also outperforms standard MCMC algorithms because it locates theregions of high probability quickly, thus speeding up convergence. We demonstrate this algorithm on the problem of Bayesian parameter estimation for logistic (sigmoid) belief networks.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1301.2266

Country:

Europe > Denmark (0.28)
North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Add feedback

Determinantal point processes for machine learning

Kulesza, Alex, Taskar, Ben

arXiv.org Machine LearningJan-10-2013

Determinantal point processes (DPPs) are elegant probabilistic models of repulsion that arise in quantum physics and random matrix theory. In contrast to traditional structured models like Markov random fields, which become intractable and hard to approximate in the presence of negative correlations, DPPs offer efficient and exact algorithms for sampling, marginalization, conditioning, and other inference tasks. We provide a gentle introduction to DPPs, focusing on the intuitions, algorithms, and extensions that are most relevant to the machine learning community, and show how DPPs can be applied to real-world applications like finding diverse sets of high-quality search results, building informative summaries by selecting diverse sentences from documents, modeling non-overlapping human poses in images or video, and automatically building timelines of important news stories.

information retrieval, machine learning, natural language, (24 more...)

arXiv.org Machine Learning

doi: 10.1561/2200000044

1207.6083

Country:

Europe (0.92)
Asia > Middle East (0.67)
North America > United States > Texas (0.28)
North America > United States > California (0.27)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)
Research Report > Experimental Study (0.45)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
(6 more...)

Add feedback

Using Temporal Data for Making Recommendations

Zimdars, Andrew, Chickering, David Maxwell, Meek, Christopher

arXiv.org Artificial IntelligenceJan-10-2013

We treat collaborative filtering as a univariate time series problem: given a user's previous votes, predict the next vote. We describe two families of methods for transforming data to encode time order in ways amenable to off-the-shelf classification and density estimation tools. Using a decision-tree learning tool and two real-world data sets, we compare the results of these approaches to the results of collaborative filtering without ordering information. The improvements in both predictive accuracy and in recommendation quality that we realize advocate the use of predictive algorithms exploiting the temporal order of data.

artificial intelligence, decision tree learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1301.232

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment (0.48)
Media > Film (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Analysing Sensitivity Data from Probabilistic Networks

van der Gaag, Linda C., Renooij, Silja

arXiv.org Artificial IntelligenceJan-10-2013

With the advance of efficient analytical methods for sensitivity analysis ofprobabilistic networks, the interest in the sensitivities revealed by real-life networks is rekindled. As the amount of data resulting from a sensitivity analysis of even a moderately-sized network is alreadyoverwhelming, methods for extracting relevant information are called for. One such methodis to study the derivative of the sensitivity functions yielded for a network's parameters. We further propose to build upon the concept of admissible deviation, that is, the extent to which a parameter can deviate from the true value without inducing a change in the most likely outcome. We illustrate these concepts by means of a sensitivity analysis of a real-life probabilistic network in oncology.

artificial intelligence, machine learning, sensitivity value, (17 more...)

arXiv.org Artificial Intelligence

1301.2314

Country: Europe (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Bayesian Error-Bars for Belief Net Inference

Van Allen, Tim, Greiner, Russell, Hooper, Peter

arXiv.org Artificial IntelligenceJan-10-2013

A Bayesian Belief Network (BN) is a model of a joint distribution over a setof n variables, with a DAG structure to represent the immediate dependenciesbetween the variables, and a set of parameters (aka CPTables) to represent thelocal conditional probabilities of a node, given each assignment to itsparents. In many situations, these parameters are themselves random variables - this may reflect the uncertainty of the domain expert, or may come from atraining sample used to estimate the parameter values. The distribution overthese "CPtable variables" induces a distribution over the response the BNwill return to any "What is Pr(H | E)?" query. This paper investigates thevariance of this response, showing first that it is asymptotically normal,then providing its mean and asymptotical variance. We then present aneffective general algorithm for computing this variance, which has the samecomplexity as simply computing the (mean value of) the response itself - ie,O(n 2^w), where n is the number of variables and w is the effective treewidth. Finally, we provide empirical evidence that this algorithm, whichincorporates assumptions and approximations, works effectively in practice,given only small samples.

artificial intelligence, machine learning, posterior distribution, (19 more...)

arXiv.org Artificial Intelligence

1301.2313

Country: North America > Canada > Alberta (0.29)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Causal Discovery from Changes

Tian, Jin, Pearl, Judea

arXiv.org Artificial IntelligenceJan-10-2013

We propose a new method of discovering causal structures, based on the detection of local, spontaneous changes in the underlying data-generating model. We analyze the classes of structures that are equivalent relative to a stream of distributions produced by local changes, and devise algorithms that output graphical representations of these equivalence classes. We present experimental results, using simulated data, and examine the errors associated with detection of changes and recovery of structures.

artificial intelligence, causal diagram, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1301.2312

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

Add feedback

Maximum Likelihood Bounded Tree-Width Markov Networks

Srebro, Nathan

arXiv.org Artificial IntelligenceJan-10-2013

Chow and Liu (1968) studied the problem of learning a maximumlikelihood Markov tree. We generalize their work to more complexMarkov networks by considering the problem of learning a maximumlikelihood Markov network of bounded complexity. We discuss howtree-width is in many ways the appropriate measure of complexity andthus analyze the problem of learning a maximum likelihood Markovnetwork of bounded tree-width.Similar to the work of Chow and Liu, we are able to formalize thelearning problem as a combinatorial optimization problem on graphs. Weshow that learning a maximum likelihood Markov network of boundedtree-width is equivalent to finding a maximum weight hypertree. Thisequivalence gives rise to global, integer-programming based,approximation algorithms with provable performance guarantees, for thelearning problem. This contrasts with heuristic local-searchalgorithms which were previously suggested (e.g. by Malvestuto 1991).The equivalence also allows us to study the computational hardness ofthe learning problem. We show that learning a maximum likelihoodMarkov network of bounded tree-width is NP-hard, and discuss thehardness of approximation.

artificial intelligence, machine learning, markov network, (17 more...)

arXiv.org Artificial Intelligence

1301.2311

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Toward General Analysis of Recursive Probability Models

Pless, Daniel, Luger, George

arXiv.org Artificial IntelligenceJan-10-2013

There is increasing interest within the research community in the design and use of recursive probability models. Although there still remains concern about computational complexity costs and the fact that computing exact solutions can be intractable for many nonrecursive models and impossible in the general case for recursive problems, several research groups are actively developing computational techniques for recursive stochastic languages. We have developed an extension to the traditional lambda-calculus as a framework for families of Turing complete stochastic languages. We have also developed a class of exact inference algorithms based on the traditional reductions of the lambda-calculus. We further propose that using the deBruijn notation (a lambda-calculus notation with nameless dummies) supports effective caching in such systems (caching being an essential component of efficient computation). Finally, our extension to the lambda-calculus offers a foundation and general theory for the construction of recursive stochastic modeling languages as well as promise for effective caching and efficient approximation algorithms for inference.

artificial intelligence, logic & formal reasoning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

1301.2302

Country:

North America > United States > California (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Add feedback

Sufficiency, Separability and Temporal Probabilistic Models

Pfeffer, Avi

arXiv.org Artificial IntelligenceJan-10-2013

Suppose we are given the conditional probability of one variable given some other variables.Normally the full joint distribution over the conditioning variablesis required to determine the probability of the conditioned variable.Under what circumstances are the marginal distributions over the conditioning variables sufficient to determine the probability ofthe conditioned variable?Sufficiency in this sense is equivalent to additive separability ofthe conditional probability distribution.Such separability structure is natural and can be exploited forefficient inference.Separability has a natural generalization to conditional separability.Separability provides a precise notion of weaklyinteracting subsystems in temporal probabilistic models.Given a system that is decomposed into separable subsystems, exactmarginal probabilities over subsystems at future points in time can becomputed by propagating marginal subsystem probabilities, rather thancomplete system joint probabilities.Thus, separability can make exact prediction tractable.However, observations can break separability,so exact monitoring of dynamic systems remains hard.

artificial intelligence, machine learning, subsystem, (17 more...)

arXiv.org Artificial Intelligence

1301.2301

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)

Add feedback

Approximating MAP using Local Search

Park, James D., Darwiche, Adnan

arXiv.org Artificial IntelligenceJan-10-2013

MAP is the problem of finding a most probable instantiation of a set of variables in a Bayesian network, given evidence. Unlike computing marginals, posteriors, and MPE (a special case of MAP), the time and space complexity of MAP is not only exponential in the network treewidth, but also in a larger parameter known as the "constrained" treewidth. In practice, this means that computing MAP can be orders of magnitude more expensive than computingposteriors or MPE. Thus, practitioners generally avoid MAP computations, resorting instead to approximating them by the most likely value for each MAP variableseparately, or by MPE.We present a method for approximating MAP using local search. This method has space complexity which is exponential onlyin the treewidth, as is the complexity of each search step. We investigate the effectiveness of different local searchmethods and several initialization strategies and compare them to otherapproximation schemes.Experimental results show that local search provides a much more accurate approximation of MAP, while requiring few search steps.Practically, this means that the complexity of local search is often exponential only in treewidth as opposed to the constrained treewidth, making approximating MAP as efficient as other computations.

artificial intelligence, elimination order, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1301.2299

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)

Add feedback