Undirected Networks
kLog: A Language for Logical and Relational Learning with Kernels
Frasconi, Paolo, Costa, Fabrizio, De Raedt, Luc, De Grave, Kurt
We introduce kLog, a novel approach to statistical relational learning. Unlike standard approaches, kLog does not represent a probability distribution directly. It is rather a language to perform kernel-based learning on expressive logical and relational representations. kLog allows users to specify learning problems declaratively. It builds on simple but powerful concepts: learning from interpretations, entity/relationship data modeling, logic programming, and deductive databases. Access by the kernel to the rich representation is mediated by a technique we call graphicalization: the relational representation is first transformed into a graph --- in particular, a grounded entity/relationship diagram. Subsequently, a choice of graph kernel defines the feature space. kLog supports mixed numerical and symbolic data, as well as background knowledge in the form of Prolog or Datalog programs as in inductive logic programming systems. The kLog framework can be applied to tackle the same range of tasks that has made statistical relational learning so popular, including classification, regression, multitask learning, and collective classification. We also report about empirical comparisons, showing that kLog can be either more accurate, or much faster at the same level of accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at http://klog.dinfo.unifi.it along with tutorials.
Learning Structured Outputs from Partial Labels using Forest Ensemble
Tran, Truyen, Phung, Dinh, Venkatesh, Svetha
Learning Structured Outputs from Partial Labels using Forest Ensemble Truyen Tran, Dinh Phung, Svetha V enkatesh Centre for Pattern Recognition and Data Analytics Deakin University, Australia Abstract Learning structured outputs with general structures is computationally challenging, except for tree-structured models. Thus we propose an efficient boosting-based algorithm AdaBoost.MRF for this task. The idea is based on the realization that a graph is a superimposition of trees. Different from most existing work, our algorithm can handle partial labelling, and thus is particularly attractive in practice where reliable labels are often sparsely observed. In addition, our method works exclusively on trees and thus is guaranteed to converge. We apply the AdaBoost.MRF algorithm to an indoor video surveillance scenario, where activities are modelled at multiple levels. 1 Introduction There has been a growing research interest in developing probabilistic temporal graphical models for recognising human activities from sensory data. In this paper we address an important aspect of the problem in that there are multiple levels of abstraction, that is, an activity is often composed of several sub-activities. A popular approach to deal with such a hierarchical nature is to build a cascaded model: each level is modelled separately, and the output of the lower levels is subsequently used as the input for the upper levels [20]. This approach is sub-optimal because the information at the higher level is often very discriminative to infer about the lower levels, but it is not modelled. Moreover, the layered approach often suffers from the so-called cascading error problem, as the error introduced from the lower level will propagate to higher tasks. A better and more holistic approach is to build a joint representation at all layers. Emerging methods include generative/directed models such as abstract hidden Markov models (AH-MMs) [4], hierarchical HMMs [19], dynamic Bayesian networks [10], and their discriminative/undirected counterparts such as hierarchical conditional random field (HCRF) [17], and dynamic CRF (DCRF) [28].
Scalable Learning for Structure in Markov Logic Networks
Sun, Zhengya (Chinese Academy of Sciences) | Wei, Zhuoyu (Chinese Academy of Sciences) | Wang, Jue (Chinese Academy of Sciences) | Hao, Hongwei (Chinese Academy of Sciences)
Markov Logic Networks (MLNs) provide a unifying framework that incorporates first-order logic and probability. However, learning the structure of MLNs is a computationally hard task due to the large search space and the intractable clause evaluation. In this paper, we propose a random walk-based approach to learn MLN structure in a scalable manner. It uses the interactions existing among the objects to constrain the search space of candidate clauses. Specifically, we obtain representative subset of simple paths by sampling from all sequences of distinct objects. We then transform each sampled path into possible ground atoms, and use them to form clauses. Based on the resulting ground network, we finally attach a set of weights to the clauses by optimizing l 1 -constrained conditional likelihood. The experimental results demonstrate that our approach performs favorably compared to previous approaches.
Efficient Markov Logic Inference for Natural Language Semantics
Beltagy, Islam (The University of Texas at Austin) | Mooney, Raymond J. (The University of Texas at Austin)
Using Markov logic to integrate logical and distributional information in natural-language semantics results in complex inference problems involving long, complicated formulae. Current inference methods for Markov logic are ineffective on such problems. To address this problem, we propose a new inference algorithm based on SampleSearch that computes probabilities of complete formulae rather than ground atoms. We also introduce a modified closed-world assumption that significantly reduces the size of the ground network, thereby making inference feasible. Our approach is evaluated on the recognizing textual entailment task, and experiments demonstrate its dramatic impact on the efficiency of inference.
Learning Tractable Statistical Relational Models
Nath, Aniruddh (University of Washington) | Domingos, Pedro M. (University of Washington)
Intractable inference has been a major barrier to the wide adoption of statistical relational models. Existing exact methods suffer from a lack of scalability, and approximate methods tend to be unreliable. Sum-product networks (SPNs; Poon and Domingos 2011) are a recently-proposed probabilistic architecture that guarantees tractable exact inference, even on many high-treewidth models. SPNs are a propositional architecture, treating the instances as independent and identically distributed. In this paper, we extend SPNs to the relational setting, resulting in Relational Sum-Product Networks (RSPNs). Previous tractable statistical relational models (Domingos and Webb 2012; Webb and Domingos 2013) defined their models over a pre-determined set of objects, and therefore could not be generalized to new mega-examples. In contrast, RSPNs can be learned and applied to previous unseen examples. We present a learning algorithm for RSPNs; in preliminary experiments, RSPNs outperform Markov Logic Networks (Richardson and Domingos 2006) in both running time and predictive accuracy.
Lifting Relational MAP-LPs using Cluster Signatures
Apsel, Udi (Ben-Gurion University of The Negev) | Kersting, Kristian (TU Dortmund University) | Mladenov, Martin (TU Dortmund University)
Inference in large scale graphical models is an important task in many domains, and in particular probabilistic relational models (e.g. Markov logic networks). Such models often exhibit considerable symmetry, and it is a challenge to devise algorithms that exploit this symmetry to speed up inference. Recently, the automorphism group has been proposed to formalize mathematically what "exploiting symmetry" means. However, obtaining symmetry derived from automorphism is GI-hard, and consequently only a small fraction of the symmetry is easily available for effective employment. In this paper, we improve upon efficiency in two ways. First, we introduce the Cluster Signature Graph (CSG), a platform on which greater portions of the symmetries can be revealed and exploited. CSGs classify clusters of variables by projecting relations between cluster members onto a graph, allowing for the efficient pruning of symmetrical clusters even before their generation. Second, we introduce a novel framework based on CSGs for the Sherali-Adams hierarchy of linear program (LP) relaxations, dedicated to exploiting this symmetry for the benefit of tight Maximum A Posteriori (MAP) approximations. Combined with the pruning power of CSG, the framework quickly generates compact formulations for otherwise intractable LPs, as demonstrated by several empirical results.
An Automated Measure of MDP Similarity for Transfer in Reinforcement Learning
Ammar, Haitham Bou (University of Pennsylvania) | Eaton, Eric (University of Pennsylvania) | Taylor, Matthew E. (Washington State University) | Mocanu, Decebal Constantin (Eindhoven University of Technology) | Driessens, Kurt (Maastricht University) | Weiss, Gerhard (Maastricht University) | Tuyls, Karl (University of Liverpool)
Transfer learning can improve the reinforcement learning of a new task by allowing the agent to reuse knowledge acquired from other source tasks. Despite their success, transfer learning methods rely on having relevant source tasks; transfer from inappropriate tasks can inhibit performance on the new task. For fully autonomous transfer, it is critical to have a method for automatically choosing relevant source tasks, which requires a similarity measure between Markov Decision Processes (MDPs). This issue has received little attention, and is therefore still a largely open problem. This paper presents a data-driven automated similarity measure for MDPs. This novel measure is a significant step toward autonomous reinforcement learning transfer, allowing agents to: (1) characterize when transfer will be useful and, (2) automatically select tasks to use for transfer. The proposed measure is based on the reconstruction error of a restricted Boltzmann machine that attempts to model the behavioral dynamics of the two MDPs being compared. Empirical results illustrate that this measure is correlated with the performance of transfer and therefore can be used to identify similar source tasks for transfer learning.
A Virtual Director Inspired by Real Directors
Merabti, Billal (Polytechnic Military School, Algiers and IRISA/University of Rennes 1) | Christie, Marc (IRISA/University of Rennes 1) | Bouatouch, Kadi (IRISA/University of Rennes 1)
Automatically computing a cinematographically consistent sequence of shots over a set of actions occurring in a 3D world is a complex task that requires not only the computation of appropriate shots (viewpoints) and appropriate transitions between shots (cuts), but more importantly the ability to encode and reproduce elements of cinematographic style that exist in real movies. In this paper, we propose an expressive automated cinematography model that learns some elements of style from real movies and reproduces them in synthetic movies. The model relies on a Hidden Markov Model representation of the editing process. The proposed model is more general than existing representations that encode cinematographic idioms and proves to be more expressive in the possible variations of style it offers.
Evidence-Based Clustering for Scalable Inference in Markov Logic
Venugopal, Deepak (The University of Texas at Dallas) | Gogate, Vibhav (The University of Texas at Dallas)
Lifted inference algorithms take advantage of symmetries in first-order probabilistic logic representations such as Markov logic networks (MLNs), and are naturally more scalable than propositional inference algorithms which ground the MLN. However, lifted inference algorithms have an "evidence problem" -- evidence breaks symmetries, and the performance of lifted inference algorithms is the same as propositional inference algorithms (or sometimes worse, due to overhead). In this paper, we propose a general method for addressing this problem. The main idea in our method is to approximate the given MLN having, say, n objects by an MLN having k objects such that k is much lesser than n and the results obtained by running potentially much faster inference on the smaller MLN are as close as possible to the ones obtained by running inference on the larger MLN. We achieve this by finding clusters of "similar" groundings using standard clustering algorithms (e.g., K-means), and replacing all groundings in the cluster by their cluster center. To this end, we develop a novel distance (or similarity) function for measuring the similarity between two groundings, based on the evidence presented to the MLN. We evaluated our approach on different benchmarks utilizing various clustering and inference algorithms. Our experiments clearly show the generality and scalability of our approach.
Relational Logistic Regression: The Directed Analog of Markov Logic Networks
Kazemi, Seyed Mehran (University of British Columbia) | Buchman, David (University of British Columbia) | Kersting, Kristian ( Technical University of Dortmund ) | Natarajan, Sriraam ( Indiana University ) | Poole, David (University of British Columbia)
Relational logistic regression (RLR) was presented at the 14th International Conference on Principles of Knowledge Representation and Reasoning (KR-2014). RLR is the directed analogue of Markov logic networks. Whereas Markov logic networks define distributions in terms of weighted formulae, RLR defines conditional probabilities in terms of weighted formulae. They agree for the supervised learning case when all variables except a query leaf variable are observed. However, they are quite different in representing distributions. The KR-2014 paper defined the RLR formalism, defined canonical forms for RLR in terms of positive conjunctive formulae, indicated the class of conditional probability distributions that can and cannot be represented by RLR, and defined many other aggregators in terms of RLR. In this paper, we summarize these results and compare RLR to Markov logic networks.