Learning Graphical Models
Learning Tractable Statistical Relational Models
Nath, Aniruddh (University of Washington) | Domingos, Pedro M. (University of Washington)
Intractable inference has been a major barrier to the wide adoption of statistical relational models. Existing exact methods suffer from a lack of scalability, and approximate methods tend to be unreliable. Sum-product networks (SPNs; Poon and Domingos 2011) are a recently proposed probabilistic architecture that guarantees tractable exact inference, even on many high-treewidth models. SPNs are a propositional architecture, treating the instances as independent and identically distributed. In this paper, we extend SPNs to the relational setting, resulting in Relational Sum-Product Networks (RSPNs). Previous tractable statistical relational models (Domingos and Webb 2012; Webb and Domingos 2013) defined their models over a pre-determined set of objects, and therefore could not be generalized to new mega-examples. In contrast, RSPNs can be learned and applied to previously unseen examples. We present a learning algorithm for RSPNs; in preliminary experiments, RSPNs outperform Markov Logic Networks (Richardson and Domingos 2006) in both running time and predictive accuracy.
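To give a rough sense of why exact inference in an SPN is tractable, the sketch below evaluates a tiny hand-built propositional SPN in a single bottom-up pass. The structure, the weights, and all function names are assumptions made for illustration; none of them come from the paper, and the relational extension (RSPNs) is not shown.

```python
# Toy sum-product network over two binary variables, X1 and X2.
# Structure and weights are hypothetical, chosen only for illustration.

def leaf(var, value):
    # Indicator leaf: evaluates to 1.0 when var is unobserved (marginalized out)
    # or observed with the matching value, and to 0.0 otherwise.
    def evaluate(evidence):
        observed = evidence.get(var)
        return 1.0 if observed is None or observed == value else 0.0
    return evaluate

def product(*children):
    # Product node: children must cover disjoint sets of variables.
    def evaluate(evidence):
        result = 1.0
        for child in children:
            result *= child(evidence)
        return result
    return evaluate

def weighted_sum(weighted_children):
    # Sum node: children cover the same variables; weights sum to 1.
    def evaluate(evidence):
        return sum(w * child(evidence) for w, child in weighted_children)
    return evaluate

def bernoulli(var, p_true):
    # A univariate distribution written as a sum over two indicator leaves.
    return weighted_sum([(p_true, leaf(var, 1)), (1.0 - p_true, leaf(var, 0))])

# Root: a mixture of two fully factorized distributions.
spn = weighted_sum([
    (0.6, product(bernoulli("X1", 0.9), bernoulli("X2", 0.2))),
    (0.4, product(bernoulli("X1", 0.1), bernoulli("X2", 0.7))),
])

p_joint = spn({"X1": 1, "X2": 0})   # P(X1=1, X2=0)
p_marginal = spn({"X1": 1})         # P(X1=1), with X2 summed out
p_total = spn({})                   # normalization check
```

Each query, whether a joint probability or a marginal, costs one pass over the network, so the cost is linear in the number of edges.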
Lifting Relational MAP-LPs using Cluster Signatures
Apsel, Udi (Ben-Gurion University of The Negev) | Kersting, Kristian (TU Dortmund University) | Mladenov, Martin (TU Dortmund University)
Inference in large scale graphical models is an important task in many domains, and in particular probabilistic relational models (e.g. Markov logic networks). Such models often exhibit considerable symmetry, and it is a challenge to devise algorithms that exploit this symmetry to speed up inference. Recently, the automorphism group has been proposed to formalize mathematically what "exploiting symmetry" means. However, obtaining symmetry derived from automorphism is GI-hard, and consequently only a small fraction of the symmetry is easily available for effective employment. In this paper, we improve upon efficiency in two ways. First, we introduce the Cluster Signature Graph (CSG), a platform on which greater portions of the symmetries can be revealed and exploited. CSGs classify clusters of variables by projecting relations between cluster members onto a graph, allowing for the efficient pruning of symmetrical clusters even before their generation. Second, we introduce a novel framework based on CSGs for the Sherali-Adams hierarchy of linear program (LP) relaxations, dedicated to exploiting this symmetry for the benefit of tight Maximum A Posteriori (MAP) approximations. Combined with the pruning power of CSG, the framework quickly generates compact formulations for otherwise intractable LPs, as demonstrated by several empirical results.
An Automated Measure of MDP Similarity for Transfer in Reinforcement Learning
Ammar, Haitham Bou (University of Pennsylvania) | Eaton, Eric (University of Pennsylvania) | Taylor, Matthew E. (Washington State University) | Mocanu, Decebal Constantin (Eindhoven University of Technology) | Driessens, Kurt (Maastricht University) | Weiss, Gerhard (Maastricht University) | Tuyls, Karl (University of Liverpool)
Transfer learning can improve the reinforcement learning of a new task by allowing the agent to reuse knowledge acquired from other source tasks. Despite their success, transfer learning methods rely on having relevant source tasks; transfer from inappropriate tasks can inhibit performance on the new task. For fully autonomous transfer, it is critical to have a method for automatically choosing relevant source tasks, which requires a similarity measure between Markov Decision Processes (MDPs). This issue has received little attention, and is therefore still a largely open problem. This paper presents a data-driven automated similarity measure for MDPs. This novel measure is a significant step toward autonomous reinforcement learning transfer, allowing agents to (1) characterize when transfer will be useful, and (2) automatically select tasks to use for transfer. The proposed measure is based on the reconstruction error of a restricted Boltzmann machine that attempts to model the behavioral dynamics of the two MDPs being compared. Empirical results illustrate that this measure is correlated with the performance of transfer and therefore can be used to identify similar source tasks for transfer learning.
A Virtual Director Inspired by Real Directors
Merabti, Billal (Polytechnic Military School, Algiers and IRISA/University of Rennes 1) | Christie, Marc (IRISA/University of Rennes 1) | Bouatouch, Kadi (IRISA/University of Rennes 1)
Automatically computing a cinematographically consistent sequence of shots over a set of actions occurring in a 3D world is a complex task that requires not only the computation of appropriate shots (viewpoints) and appropriate transitions between shots (cuts), but more importantly the ability to encode and reproduce elements of cinematographic style that exist in real movies. In this paper, we propose an expressive automated cinematography model that learns some elements of style from real movies and reproduces them in synthetic movies. The model relies on a Hidden Markov Model representation of the editing process. The proposed model is more general than existing representations that encode cinematographic idioms and proves to be more expressive in the possible variations of style it offers.
A Bayesian Approach to Determine Focus of Attention in Spatial and Time-Sensitive Decision Making Scenarios
Li, Yu-Ting (Purdue University) | Wachs, Juan Pablo (Purdue University)
Complex decision making scenarios require maintaining a high level of concentration and acquiring knowledge about the context of the task at hand. Focus of attention is not only affected by contextual factors but also by the way operators interact with the information. Conversely, determining optimal ways to interact with this information can augment operators’ cognition. However, challenges exist for determining efficient mathematical frameworks and sound metrics to infer, reason about, and assess the level of attention during spatio-temporal complex problem solving in hybrid human-machine systems. This paper proposes a computational framework based on a Bayesian approach (BAN) to infer users’ focus of attention based on physical expression generated from embodied interaction and further support decision-making in an unobtrusive manner. Experiments involving five interaction modalities (vision-based gesture interaction, glove-based gesture interaction, speech, feet, and body balance) were conducted to assess the proposed framework’s feasibility, including the likelihood of assessed attention from enhanced BAN and task performance. Results confirm that physical expressions have a determining effect on the quality of the solutions in spatio-navigational problems.
Using Dynamic Bayesian Networks for Incorporating Non-Traditional Data Sources in Public Health Surveillance
Izadi, Masoumeh (McGill University) | Charland, Katia (McGill University) | Buckeridge, David (McGill University)
It is generally challenging to obtain the exact disease prevalence, as the true cases of a disease at the population level are not easy to identify. Available and relevant data sources such as administrative or clinical health data are used in public health surveillance as a proxy to estimate the disease prevalence. Traditionally, these data sources span healthcare utilization information such as emergency department visits, pharmacy drug sales, or laboratory test orders. In addition to being incomplete, these data sources are usually not available in a timely manner. Timeliness is an important factor for prevalence estimation for some conditions such as infectious diseases, especially at the time of an epidemic. For instance, in an influenza pandemic such estimates must be obtained within a day or two. In recent years several non-clinical and non-traditional data sources have been introduced to public health with the potential to provide signals on a disease rate or to provide feedback on the trends of a disease. Ideally, combining these new sources with the ones routinely used should help to identify disease cases more efficiently. However, building a construct capable of incorporating data from these various sources in a coherent manner is not trivial. In this research, we consider the case of the H1N1 pandemic as the infectious disease of interest and we use media reports of deaths from H1N1 on the web as a non-traditional data source. We propose to use dynamic Bayesian networks from the class of probabilistic graphical models in order to combine this new data source with traditional ones through exploration of the possible probabilistic relationships between these data streams. This is an initial step towards building a framework that can potentially support aggregation of heterogeneous data for real-time estimation of disease prevalence. Our preliminary results show that the proposed model generalizes well.
Evidence-Based Clustering for Scalable Inference in Markov Logic
Venugopal, Deepak (The University of Texas at Dallas) | Gogate, Vibhav (The University of Texas at Dallas)
Lifted inference algorithms take advantage of symmetries in first-order probabilistic logic representations such as Markov logic networks (MLNs), and are naturally more scalable than propositional inference algorithms, which ground the MLN. However, lifted inference algorithms have an "evidence problem": evidence breaks symmetries, and the performance of lifted inference algorithms is the same as propositional inference algorithms (or sometimes worse, due to overhead). In this paper, we propose a general method for addressing this problem. The main idea in our method is to approximate the given MLN having, say, n objects by an MLN having k objects such that k is much smaller than n and the results obtained by running potentially much faster inference on the smaller MLN are as close as possible to the ones obtained by running inference on the larger MLN. We achieve this by finding clusters of "similar" groundings using standard clustering algorithms (e.g., K-means), and replacing all groundings in the cluster by their cluster center. To this end, we develop a novel distance (or similarity) function for measuring the similarity between two groundings, based on the evidence presented to the MLN. We evaluated our approach on different benchmarks utilizing various clustering and inference algorithms. Our experiments clearly show the generality and scalability of our approach.
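The clustering step described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions: each object is summarized by a hypothetical feature vector counting its true evidence atoms per predicate, and plain Euclidean K-means stands in for the paper's evidence-based distance function, which the abstract does not specify. The object names and counts are made up.

```python
def kmeans(points, k, iterations=20):
    # Deterministic initialization: the first k points serve as initial centers.
    centers = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assignment[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers

# Hypothetical evidence summaries: (count of true atoms for predicate A,
# count of true atoms for predicate B) for each of n = 6 objects.
objects = ["Ana", "Bob", "Cy", "Dee", "Eve", "Fay"]
features = [(5, 0), (4, 1), (5, 1), (0, 6), (1, 5), (0, 5)]

assignment, centers = kmeans(features, k=2)

# Each original object is replaced by a representative of its cluster,
# so inference would run over k = 2 "meta-objects" instead of 6.
reduced = {name: f"cluster_{c}" for name, c in zip(objects, assignment)}
```

In this toy data the first three objects share similar evidence and collapse into one cluster, and the last three into another, which restores some of the symmetry that the evidence had broken.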
Relational Logistic Regression: The Directed Analog of Markov Logic Networks
Kazemi, Seyed Mehran (University of British Columbia) | Buchman, David (University of British Columbia) | Kersting, Kristian (Technical University of Dortmund) | Natarajan, Sriraam (Indiana University) | Poole, David (University of British Columbia)
Relational logistic regression (RLR) was presented at the 14th International Conference on Principles of Knowledge Representation and Reasoning (KR-2014). RLR is the directed analogue of Markov logic networks. Whereas Markov logic networks define distributions in terms of weighted formulae, RLR defines conditional probabilities in terms of weighted formulae. They agree for the supervised learning case when all variables except a query leaf variable are observed. However, they are quite different in representing distributions. The KR-2014 paper defined the RLR formalism, defined canonical forms for RLR in terms of positive conjunctive formulae, indicated the class of conditional probability distributions that can and cannot be represented by RLR, and defined many other aggregators in terms of RLR. In this paper, we summarize these results and compare RLR to Markov logic networks.
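As a rough illustration of what "conditional probabilities in terms of weighted formulae" can look like, the sketch below computes the probability of a child variable as the logistic sigmoid of a weighted sum, where each formula contributes its weight times the number of its true instances among the parents. This reading of the formalism, the formula labels, and the weights are assumptions made here for illustration; the abstract itself does not give the equation.

```python
import math

def rlr_probability(weighted_counts):
    # weighted_counts: list of (weight, number_of_true_instances) pairs,
    # one per weighted formula. The child's probability is the sigmoid
    # of the weighted sum of counts.
    z = sum(weight * count for weight, count in weighted_counts)
    return 1.0 / (1.0 + math.exp(-z))

def p_child(n_true_parents):
    # Hypothetical model with two weighted formulae:
    #   a bias formula (always exactly one true instance), weight -2.0
    #   a formula over the parents, weight 0.5 per true instance
    return rlr_probability([(-2.0, 1), (0.5, n_true_parents)])

p_none = p_child(0)    # no true parent instances
p_four = p_child(4)    # weighted sum is exactly zero
p_ten = p_child(10)    # many true parent instances
```

Because the count of true instances enters the sum directly, the conditional probability depends on the population size, which is one way the directed model behaves differently from a fixed-arity logistic regression.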
Representation, Reasoning, and Learning for a Relational Influence Diagram Applied to a Real-Time Geological Domain
Dirks, Matthew C. (University of British Columbia) | Csinger, Andrew (MineSense Technologies Ltd.) | Bamber, Andrew (MineSense Technologies Ltd.) | Poole, David (University of British Columbia)
Mining companies typically process all the material extracted from a mine site using processes which are extremely consumptive of energy and chemicals. Sorting the good material from the bad would effectively reduce required resources by leaving behind the bad material and only transporting and processing the good material. We use a relational influence diagram with an explicit utility model applied to the scenario in which an unknown number of rocks in unknown positions with unknown mineral compositions pass over 7 sensors toward 7 diverters on a high-throughput rock-sorting machine developed by MineSense Technologies Ltd. After receiving noisy sensor data, the system has 400 ms to decide whether to activate diverters which will divert the rocks into either a keep or discard bin. We learn the model offline and do online inference. Our result improves over the current state-of-the-art.
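The decision made for each rock can be pictured as a maximum-expected-utility choice. The sketch below is only an illustration of that general principle for a single rock with a known posterior; the utility values, the two-state "ore"/"waste" simplification, and all names are hypothetical, and the paper's relational model over unknown numbers and positions of rocks is not reproduced.

```python
def best_action(p_ore, utilities):
    # utilities[action][state] is the payoff of taking `action` when the
    # rock is actually in `state`. Returns the action with the highest
    # expected utility, plus the expected utility of every action.
    expected = {
        action: p_ore * payoff["ore"] + (1.0 - p_ore) * payoff["waste"]
        for action, payoff in utilities.items()
    }
    return max(expected, key=expected.get), expected

# Hypothetical utility model: processing waste costs energy and chemicals,
# and discarding ore forfeits its value.
utilities = {
    "keep": {"ore": 10.0, "waste": -4.0},
    "discard": {"ore": -10.0, "waste": 0.0},
}

# p_ore stands for the posterior probability of ore after the noisy sensor readings.
action_likely_ore, eu_likely = best_action(0.5, utilities)
action_likely_waste, eu_unlikely = best_action(0.1, utilities)
```

With these made-up numbers the system keeps a rock that is equally likely to be ore or waste, but discards one that is probably waste.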
Reasoning in the Description Logic BEL Using Bayesian Networks
Ceylan, Ismail Ilkan (Technische Universitaet Dresden) | Penaloza, Rafael (Technische Universitaet Dresden)
We study the problem of reasoning in the probabilistic Description Logic BEL. Using a novel structure, we show that probabilistic reasoning in this logic can be reduced in polynomial time to standard inferences over a Bayesian network. This reduction provides tight complexity bounds for probabilistic reasoning in BEL.