AITopics | Directed Networks

Directed Networks

Bayesian Hierarchical Reinforcement Learning

Neural Information Processing Systems, Dec-31-2012

We describe an approach to incorporating Bayesian priors in the MAXQ framework for hierarchical reinforcement learning (HRL). We define priors on the primitive environment model and on task pseudo-rewards. Since models for composite tasks can be complex, we use a mixed model-based/model-free learning approach to find an optimal hierarchical policy. We show empirically that (i) our approach results in improved convergence over non-Bayesian baselines, given sensible priors, (ii) task hierarchies and Bayesian priors can be complementary sources of information, and using both sources is better than either alone, (iii) taking advantage of the structural decomposition induced by the task hierarchy significantly reduces the computational cost of Bayesian reinforcement learning, and (iv) in this framework, task pseudo-rewards can be learned instead of being manually specified, leading to automatic learning of hierarchically optimal rather than recursively optimal policies.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.97)

Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

Xu, Minjie, Zhu, Jun, Zhang, Bo

Neural Information Processing Systems, Dec-31-2012

We present a probabilistic formulation of max-margin matrix factorization and build accordingly a nonparametric Bayesian model which automatically resolves the unknown number of latent factors. Our work demonstrates a successful example that integrates Bayesian nonparametrics and max-margin learning, which are conventionally two separate paradigms and enjoy complementary advantages. We develop an efficient variational algorithm for posterior inference, and our extensive empirical studies on large-scale MovieLens and EachMovie data sets appear to justify the aforementioned dual advantages.

artificial intelligence, machine learning, matrix factorization, (14 more...)

Country: Asia > China (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Bethe Bounds and Approximating the Global Optimum

Weller, Adrian, Jebara, Tony

arXiv.org Machine Learning, Dec-31-2012

Inference in general Markov random fields (MRFs) is NP-hard, though identifying the maximum a posteriori (MAP) configuration of pairwise MRFs with submodular cost functions is efficiently solvable using graph cuts. Marginal inference, however, even for this restricted class, is in #P. We prove new formulations of derivatives of the Bethe free energy, provide bounds on the derivatives and bracket the locations of stationary points, introducing a new technique called Bethe bound propagation. Several results apply to pairwise models whether associative or not. Applying these to discretized pseudo-marginals in the associative case we present a polynomial time approximation scheme for global optimization provided the maximum degree is $O(\log n)$, and discuss several extensions.

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv: 1301.0015

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Learning to Predict from Textual Data

Radinsky, K., Davidovich, S., Markovitch, S.

Journal of Artificial Intelligence Research, Dec-26-2012

Given a current news event, we tackle the problem of generating plausible predictions of future events it might cause. We present a new methodology for modeling and predicting such future news events using machine learning and data mining techniques. Our Pundit algorithm generalizes examples of causality pairs to infer a causality predictor. To obtain precisely labeled causality examples, we mine 150 years of news articles and apply semantic natural language modeling techniques to headlines containing certain predefined causality patterns. For generalization, the model uses a vast number of world knowledge ontologies. Empirical evaluation on real news articles shows that our Pundit algorithm performs as well as non-expert humans.

algorithm, prediction, proceedings, (15 more...)

doi: 10.1613/jair.3865

AI Access Foundation

10792

Country:

Europe > Germany (0.14)
Asia > Middle East > Iraq > Baghdad Governorate > Baghdad (0.04)
North America > United States > Texas (0.04)
(31 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Military (0.68)
Government > Regional Government > North America Government > United States Government (0.68)
(3 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
(8 more...)

Mixtures of Shifted Asymmetric Laplace Distributions

Franczak, Brian C., Browne, Ryan P., McNicholas, Paul D.

arXiv.org Machine Learning, Dec-21-2012

A mixture of shifted asymmetric Laplace distributions is introduced and used for clustering and classification. A variant of the EM algorithm is developed for parameter estimation by exploiting the relationship with the generalized inverse Gaussian distribution. This approach is mathematically elegant and relatively computationally straightforward. Our novel mixture modelling approach is demonstrated on both simulated and real data to illustrate clustering and classification applications. In these analyses, our mixture of shifted asymmetric Laplace distributions performs favourably when compared to the popular Gaussian approach. This work, which marks an important step in the non-Gaussian model-based clustering and classification direction, concludes with discussion as well as suggestions for future work.

artificial intelligence, machine learning, mixture model, (17 more...)

doi: 10.1109/TPAMI.2013.216

arXiv: 1207.1727

Country:

North America > United States (0.68)
Europe (0.68)
North America > Canada > Ontario (0.46)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

An Experiment with Hierarchical Bayesian Record Linkage

Larsen, Michael D.

arXiv.org Machine Learning, Dec-20-2012

In record linkage (RL), or exact file matching, the goal is to identify the links between entities with information on two or more files. RL is an important activity in areas including counting the population, enhancing survey frames and data, and conducting epidemiological and follow-up studies. RL is challenging when files are very large, no accurate personal identification (ID) number is present on all files for all units, and some information is recorded with error. Without a unique ID number, one must rely on comparisons of names, addresses, dates, and other information to find the links. Latent class models can be used to automatically score the value of information for determining match status. Data for fitting models come from comparisons made within groups of units that pass initial file blocking requirements. Data distributions can vary across blocks. This article examines the use of prior information and hierarchical latent class models in the context of RL.
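
As a rough illustration of the comparison data described in this abstract, the sketch below blocks two files on one field and records a binary agreement vector for each candidate pair within a block. The record layout, the field names (`zip`, `surname`, `birth_year`), and the function `comparison_vectors` are hypothetical and are not taken from the article; the latent class model that would score these vectors is not shown.

```python
from collections import defaultdict

# Hypothetical example records; field names and values are assumptions.
FILE_A = [
    {"id": "a1", "zip": "50011", "surname": "smith", "birth_year": 1970},
    {"id": "a2", "zip": "20052", "surname": "jones", "birth_year": 1985},
]
FILE_B = [
    {"id": "b1", "zip": "50011", "surname": "smith", "birth_year": 1971},
    {"id": "b2", "zip": "50011", "surname": "smyth", "birth_year": 1970},
    {"id": "b3", "zip": "94305", "surname": "jones", "birth_year": 1985},
]


def comparison_vectors(file_a, file_b, block_key, fields):
    """Compare only record pairs that share the same blocking value.

    Returns a list of (id_a, id_b, agreement) tuples, where agreement is a
    tuple of 0/1 flags, one per compared field (1 = exact agreement).
    """
    # Index the second file by blocking value so each block is visited once.
    blocks = defaultdict(list)
    for record in file_b:
        blocks[record[block_key]].append(record)

    pairs = []
    for rec_a in file_a:
        for rec_b in blocks.get(rec_a[block_key], []):
            agreement = tuple(int(rec_a[f] == rec_b[f]) for f in fields)
            pairs.append((rec_a["id"], rec_b["id"], agreement))
    return pairs


vectors = comparison_vectors(FILE_A, FILE_B, "zip", ("surname", "birth_year"))
for id_a, id_b, agreement in vectors:
    print(id_a, id_b, agreement)
```

Record `a2` produces no pairs here because no record in the second file shares its blocking value, which shows how blocking limits the comparisons but can also hide true links.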

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv: 1212.5203

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (0.46)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Practical Algorithm for Topic Modeling with Provable Guarantees

Arora, Sanjeev, Ge, Rong, Halpern, Yoni, Mimno, David, Moitra, Ankur, Sontag, David, Wu, Yichen, Zhu, Michael

arXiv.org Machine Learning, Dec-19-2012

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model inference have been based on a maximum likelihood objective. Efficient algorithms exist that approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for topic model inference that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv: 1212.4777

Country:

North America > United States (1.00)
Asia (1.00)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Asia Government (1.00)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.76)

Probability Bracket Notation: Markov State Chain Projector, Hidden Markov Models and Dynamic Bayesian Networks

Wang, Xing M.

arXiv.org Artificial Intelligence, Dec-16-2012

Abstract: After a brief discussion of the Markov Evolution Formula (MEF) expressed in Probability Bracket Notation (PBN), its close relation with the joint probability distribution (JPD) of Visible Markov Models (VMM) is demonstrated by introducing the Markov State Chain Projector (MSCP). The state basis and the observed basis are defined in the Sequential Event Space (SES) of Hidden Markov Models (HMM). The JPD of HMM is derived by using basis transformation in SES. The Viterbi algorithm is revisited and applied to the famous Weather HMM example, whose node graph and inference results are displayed by using the software package Elvira. In the end, the formulas of VMM, HMM and some factorial HMM (FHMM) are expressed in PBN as instances of dynamic Bayesian Networks (DBN).

1. Introduction: PBN and Discrete Markov Chain. Inspired by the great success of Dirac notation, we have proposed Probability Bracket Notation (PBN) [1], where we have used PBN to discuss Markov chains (see [2] Chap.11). Based on our main topic of this article, we will concentrate on homogeneous, time-discrete first-order Markov chains with finite discrete states.
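
For readers unfamiliar with the Viterbi algorithm mentioned in this abstract, the following is a minimal sketch of the standard dynamic-programming recursion for a discrete HMM. The two-state weather model and all of its probabilities are hypothetical illustration values and are not the parameters used in the article; the function name `viterbi` and the dictionary layout are likewise assumptions.

```python
# Hypothetical two-state weather HMM; every number here is an assumption.
STATES = ("Rainy", "Sunny")
START_P = {"Rainy": 0.6, "Sunny": 0.4}
TRANS_P = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
EMIT_P = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path and its joint probability."""
    # layers[t][s] = (probability of the best path ending in s at time t,
    #                 predecessor state on that path)
    layers = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for symbol in obs[1:]:
        prev = layers[-1]
        layer = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            best_p, best_prev = max(
                (prev[q][0] * trans_p[q][s], q) for q in states
            )
            layer[s] = (best_p * emit_p[s][symbol], best_prev)
        layers.append(layer)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: layers[-1][s][0])
    prob = layers[-1][last][0]
    path = [last]
    for layer in reversed(layers[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path, prob


path, prob = viterbi(("walk", "shop", "clean"), STATES, START_P, TRANS_P, EMIT_P)
print(path, prob)
```

For long observation sequences the products underflow, so a practical implementation would typically work with log-probabilities instead.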

artificial intelligence, machine learning, markov time evolution, (15 more...)

arXiv: 1212.3817

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

MAP Complexity Results and Approximation Methods

Park, James D.

arXiv.org Artificial Intelligence, Dec-12-2012

MAP is the problem of finding a most probable instantiation of a set of n variables in a Bayesian network, given some evidence. MAP appears to be a significantly harder problem than the related problems of computing the probability of evidence, Pr, or MPE, a special case of MAP. Because of the complexity of MAP, and the lack of viable algorithms to approximate it, MAP computations are generally avoided by practitioners. This paper investigates the complexity of MAP. We show that MAP is complete for $NP^{PP}$. We also provide negative complexity results for elimination-based algorithms. It turns out that MAP remains hard even when MPE and Pr are easy. We show that MAP is NP-complete when the networks are restricted to polytrees, and even then cannot be effectively approximated. Because there is no approximation algorithm with guaranteed results, we investigate best-effort approximations. We introduce a generic MAP approximation framework. As one instantiation of it, we implement local search coupled with belief propagation (BP) to approximate MAP. We show how to extract approximate evidence retraction information from belief propagation, which allows us to perform efficient local search. This allows MAP approximation even on networks that are too complex to exactly solve the easier problems of computing Pr or MPE. Experimental results indicate that using BP and local search provides accurate MAP estimates in many cases.
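
To make the distinction between MAP and MPE in this abstract concrete, here is a brute-force sketch on a hypothetical two-node network A -> B. MPE maximizes over all non-evidence variables, while MAP maximizes over a chosen subset and sums out the rest. The network, its probabilities, and the function `map_query` are illustrative assumptions; this exhaustive enumeration is not the local search method the paper proposes and only works on tiny networks.

```python
from itertools import product

# Hypothetical binary network A -> B; all probabilities are assumptions.
VARIABLES = ("A", "B")
P_A = {0: 0.45, 1: 0.55}
P_B_GIVEN_A = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.5, 1: 0.5}}


def joint(assignment):
    """Joint probability P(A, B) from the chain rule of the network."""
    a, b = assignment["A"], assignment["B"]
    return P_A[a] * P_B_GIVEN_A[a][b]


def map_query(map_vars, evidence):
    """Most probable instantiation of map_vars given evidence.

    Variables that are neither in map_vars nor in evidence are summed out.
    The returned score is the unnormalized value P(map_vars, evidence).
    """
    hidden = [v for v in VARIABLES if v not in map_vars and v not in evidence]
    best, best_score = None, -1.0
    for values in product((0, 1), repeat=len(map_vars)):
        candidate = dict(zip(map_vars, values))
        score = 0.0
        for hidden_values in product((0, 1), repeat=len(hidden)):
            full = {**evidence, **candidate, **dict(zip(hidden, hidden_values))}
            score += joint(full)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


mpe, mpe_score = map_query(("A", "B"), {})   # MPE: maximize over everything
map_a, map_score = map_query(("A",), {})     # MAP over A only: B is summed out
print("MPE:", mpe, mpe_score)
print("MAP over A:", map_a, map_score)
```

In this example the MPE assigns A = 0, while the MAP query over A alone returns A = 1, so projecting the MPE onto the MAP variables does not in general give the MAP answer.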

bayesian inference, instantiation, optimization problem, (20 more...)

arXiv: 1301.0592

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report (0.90)

Industry: Energy > Oil & Gas (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Finding Optimal Bayesian Networks

Chickering, David Maxwell, Meek, Christopher

arXiv.org Artificial Intelligence, Dec-12-2012

In this paper, we derive optimality results for greedy Bayesian-network search algorithms that perform single-edge modifications at each step and use asymptotically consistent scoring criteria. Our results extend those of Meek (1997) and Chickering (2002), who demonstrate that in the limit of large datasets, if the generative distribution is perfect with respect to a DAG defined over the observable variables, such search algorithms will identify this optimal (i.e. We relax their assumption about the generative distribution, and assume only that this distribution satisfies the composition property over the observable variables, which is a more realistic assumption for real domains. Under this assumption, we guarantee that the search algorithms identify an inclusion-optimal model; that is, a model that (1) contains the generative distribution and (2) has no sub-model that contains this distribution. In addition, we show that the composition property is guaranteed to hold whenever the dependence relationships in the generative distribution can be characterized by paths between singleton elements in some generative graphical model (e.g. a DAG, a chain graph, or a Markov network) even when the generative model includes unobserved variables, and even when the observed data is subject to selection bias.

Introduction: The problem of learning Bayesian networks (a.k.a. directed graphical models) from data has received much attention in the UAI community. A simple approach taken by many researchers, particularly those contributing experimental papers, is to apply--in conjunction with a scoring criterion--a greedy single-edge search algorithm to the space of Bayesian-network structures or to the space of equivalence classes of those structures. There are a number of important reasons for the popularity of this approach.

algorithm, criterion, generative distribution, (15 more...)

arXiv: 1301.0561

Country:

North America > United States > Washington > King County > Redmond (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
