Undirected Networks
Online Max-Margin Weight Learning with Markov Logic Networks
Huynh, Tuyen N. (The University of Texas at Austin) | Mooney, Raymond J. (The University of Texas at Austin)
Most of the existing weight-learning algorithms for Markov Logic Networks (MLNs) use batch training which becomes computationally expensive and even infeasible for very large datasets since the training examples may not fit in main memory. To overcome this problem, previous work has used online learning algorithms to learn weights for MLNs. However, this prior work has only applied existing online algorithms, and there is no comprehensive study of online weight learning for MLNs. In this paper, we derive new online algorithms for structured prediction using the primal-dual framework, apply them to learn weights for MLNs, and compare against existing online algorithms on two large, real-world datasets. The experimental results show that the new algorithms achieve better accuracy than existing methods.
Relational Learning for Collective Classification of Entities in Images
Chechetka, Anton (Carnegie Mellon University) | Dash, Denver (Intel Labs Pittsburgh) | Philipose, Matthai (Intel Labs Seattle)
We consider the problem of discrete multi-label entity classification in images. We argue that the framework of Markov Logic can provide a unified, well-grounded mechanism to incorporate arbitrary logical relationships between entities to improve classification in images, and thus generalizes much of the recent work on exploiting local and global context in object recognition and scene understanding. Furthermore, we show that Markov Logic can provide a powerful new set of contexts that can relate entities across images in a database for joint classification of all entities in a test set simultaneously. We relate this collective classification of images to graph-based semi-supervised learning approaches, and show that Markov Logic can effectively provide a method to unify context-related work with semi-supervised approaches in a way that neither techniques could easily do on their own. Finally, we show the efficacy of these techniques on a face recognition task on three datasets showing that adding contextual relations dramatically improves accuracy over semi-supervised learning approaches alone.
Handling Looping and Optional Actions in YAPPR
Geib, Christopher (University of Edinburgh) | Goldman, Robert (SIFT LLC)
Previous work on the YAPPR plan recognition system provided algorithms for translating conventional HTN plan libraries into lexicalized grammars and treated the problem of plan recognition as one of parsing. To produce these grammars required a fixed bound for any loops within the grammar and a presented a problem for optional actions within HTN plans. In this work we show that well known transformations from formal language theory can be used to rewrite the plan grammars to remove these limitations on the plan libraries.
Teamwork and Coordination under Model Uncertainty in DEC-POMDPs
Kwak, Jun-young (University of Southern California) | Yang, Rong (University of Southern California) | Yin, Zhengyu (University of Southern California) | Taylor, Matthew E. (University of Southern California) | Tambe, Milind (University of Southern California)
Distributed Partially Observable Markov Decision Processes (DEC-POMDPs) are a popular planning framework for multiagent teamwork to compute (near-)optimal plans. However, these methods assume a complete and correct world model, which is often violated in real-world domains. We provide a new algorithm for DEC-POMDPs that is more robust to model uncertainty, with a focus on domains with sparse agent interactions. Our STC algorithm relies on the following key ideas: (1) reduce planning-time computation by shifting some of the burden to execution-time reasoning, (2) exploit sparse interactions between agents, and (3) maintain an approximate model of agentsโ beliefs. We empirically show that STC is often substantially faster to existing DEC-POMDP methods without sacrificing reward performance.
A Computational Decision Theory for Interactive Assistants
Fern, Alan (Oregon State University) | Tadepalli, Prasad (Oregon State University)
We study several classes of interactive assistants from the points of view of decision theory and computational complexity. We first introduce a special class of POMDPs called hidden-goal MDPs (HGMDPs), which formalize the problem of interactively assisting an agent whose goal is hidden and whose actions are observable. In spite of its restricted nature, we show that optimal action selection in finite horizon HGMDPs is PSPACE-complete even in domains with deterministic dynamics. We then introduce a more restricted model called helper action MDPs (HAMDPs), where the assistantโs action is accepted by the agent when it is helpful, and can be easily ignored by the agent otherwise. We show classes of HAMDPs that are complete for PSPACE and NP along with a polynomial time class. Furthermore, we show that for general HAMDPs a simple myopic policy achieves a regret, compared to an omniscient assistant, that is bounded by the entropy of the initial goal distribution. A variation of this policy is also shown to achieve worst-case regret that is logarithmic in the number of goals for any goal distribution.
Learning to Predict Combinatorial Structures
The major challenge in designing a discriminative learning algorithm for predicting structured data is to address the computational issues arising from the exponential size of the output space. Existing algorithms make different assumptions to ensure efficient, polynomial time estimation of model parameters. For several combinatorial structures, including cycles, partially ordered sets, permutations and other graph classes, these assumptions do not hold. In this thesis, we address the problem of designing learning algorithms for predicting combinatorial structures by introducing two new assumptions: (i) The first assumption is that a particular counting problem can be solved efficiently. The consequence is a generalisation of the classical ridge regression for structured prediction. (ii) The second assumption is that a particular sampling problem can be solved efficiently. The consequence is a new technique for designing and analysing probabilistic structured prediction models. These results can be applied to solve several complex learning problems including but not limited to multi-label classification, multi-category hierarchical classification, and label ranking.
Products of Weighted Logic Programs
Cohen, Shay B., Simmons, Robert J., Smith, Noah A.
Weighted logic programming, a generalization of bottom-up logic programming, is a well-suited framework for specifying dynamic programming algorithms. In this setting, proofs correspond to the algorithm's output space, such as a path through a graph or a grammatical derivation, and are given a real-valued score (often interpreted as a probability) that depends on the real weights of the base axioms used in the proof. The desired output is a function over all possible proofs, such as a sum of scores or an optimal score. We describe the PRODUCT transformation, which can merge two weighted logic programs into a new one. The resulting program optimizes a product of proof scores from the original programs, constituting a scoring function known in machine learning as a ``product of experts.'' Through the addition of intuitive constraining side conditions, we show that several important dynamic programming algorithms can be derived by applying PRODUCT to weighted logic programs corresponding to simpler weighted logic programs. In addition, we show how the computation of Kullback-Leibler divergence, an information-theoretic measure, can be interpreted using PRODUCT.
MDPs with Unawareness
Halpern, Joseph Y., Rong, Nan, Saxena, Ashutosh
Markov decision processes (MDPs) are widely used for modeling decision-making problems in robotics, automated control, and economics. Traditional MDPs assume that the decision maker (DM) knows all states and actions. However, this may not be true in many situations of interest. We define a new framework, MDPs with unawareness (MDPUs) to deal with the possibilities that a DM may not be aware of all possible actions. We provide a complete characterization of when a DM can learn to play near-optimally in an MDPU, and give an algorithm that learns to play near-optimally when it is possible to do so, as efficiently as possible. In particular, we characterize when a near-optimal solution can be found in polynomial time.
Distantly Labeling Data for Large Scale Cross-Document Coreference
Singh, Sameer, Wick, Michael, McCallum, Andrew
Cross-document coreference, the problem of resolving entity mentions across multi-document collections, is crucial to automated knowledge base construction and data mining tasks. However, the scarcity of large labeled data sets has hindered supervised machine learning research for this task. In this paper we develop and demonstrate an approach based on ``distantly-labeling'' a data set from which we can train a discriminative cross-document coreference model. In particular we build a dataset of more than a million people mentions extracted from 3.5 years of New York Times articles, leverage Wikipedia for distant labeling with a generative model (and measure the reliability of such labeling); then we train and evaluate a conditional random field coreference model that has factors on cross-document entities as well as mention-pairs. This coreference model obtains high accuracy in resolving mentions and entities that are not present in the training data, indicating applicability to non-Wikipedia data. Given the large amount of data, our work is also an exercise demonstrating the scalability of our approach.
Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
Petrik, Marek, Taylor, Gavin, Parr, Ron, Zilberstein, Shlomo
Approximate dynamic programming has been used successfully in a large variety of domains, but it relies on a small set of provided approximation features to calculate solutions reliably. Large and rich sets of features can cause existing algorithms to overfit because of a limited number of samples. We address this shortcoming using $L_1$ regularization in approximate linear programming. Because the proposed method can automatically select the appropriate richness of features, its performance does not degrade with an increasing number of features. These results rely on new and stronger sampling bounds for regularized approximate linear programs. We also propose a computationally efficient homotopy method. The empirical evaluation of the approach shows that the proposed method performs well on simple MDPs and standard benchmark problems.