Country
Sketching Techniques for Collaborative Filtering
Bachrach, Yoram (Microsoft Research Cambridge) | Porat, Ely (Bar Ilan University) | Rosenschein, Jeffrey S. (Hebrew University)
Recommender systems attempt to highlight items that a target user is likely to find interesting. A common technique is to use collaborative filtering (CF), where multiple users share information so as to provide each with effective recommendations. A key aspect of CF systems is finding users whose tastes accurately reflect the tastes of some target user. Typically, the system looks for other agents who have had experience with many of the items the target user has examined, and whose classification of these items has a strong correlation with the classifications of the target user. Since the universe of items may be enormous and huge data sets are involved, sophisticated methods must be used to quickly locate appropriate other agents. We present a method for quickly determining the proportional intersection between the items that each of two users has examined, by sending and maintaining extremely concise “sketches” of the list of items. These sketches enable the approximation of the proportional intersection within a distance of \epsilon, with a high probability of 1-\delta. Our sketching techniques are based on random minwise independent hash functions, and use very little space and time, so they are well-suited for use in large-scale collaborative filtering systems.
DL-liteR in the Light of Propositional Logic for Decentralized Data Management
Abdallah, Nada (LRI: Univ. Paris-Sud, CNRS, and INRIA) | Goasdoue, Francois (LRI: Univ. Paris-Sud, CNRS, and INRIA) | Rousset, Marie-Christine (LIG: Univ. Grenoble, CNRS, and INRIA)
This paper provides a decentralized data model and associated algorithms for peer data management systems (PDMS) based on the DL-liteR description logic. Our approach relies on reducing query reformulation and consistency checking for DL-liteR into reasoning in propositional logic. This enables a straightforward deployment of DL-liteR PDMSs on top of SomeWhere, a scalable propositional peer-to-peer inference system. We also show how to use the state-of-the-art Minicon algorithm for rewriting queries using views in DL-liteR in the centralized and decentralized cases.
A General Approach to Environment Design with One Agent
Zhang, Haoqi (Harvard University) | Chen, Yiling (Harvard University) | Parkes, David C. (Harvard University)
The problem of environment design considers a setting in which an interested party aims to influence an agent's decisions by making limited changes to the agent's environment. Zhang and Parkes [2008] first introduced the environment design concept for a specific problem in the Markov Decision Process setting. In this paper, we present a general framework for the formulation and solution of environment design problems. We consider both the case in which the agent's local decision model is known and partially unknown to the interested party, and illustrate the framework and results on a linear programming setting. For the latter problem, we formulate an active, indirect elicitation method and provide conditions for convergence and logarithmic convergence. We relate to the problem of inverse optimization and also offer a game-theoretic interpretation of our methods.
Speeding Up Inference in Markov Logic Networks by Preprocessing to Reduce the Size of the Resulting Grounded Network
Shavlik, Jude (University of Wisconsin Madison) | Natarajan, Sriraam (University of Wisconsin Madison)
Statistical-relational reasoning has received much attention due to its ability to robustly model complex relationships. A key challenge is tractable inference, especially in domains involving many objects, due to the combinatorics involved. One can accelerate inference by using approximation techniques, lazy algorithms, etc. We consider Markov Logic Networks (MLNs), which involve counting how often logical formulae are satisfied. We propose a preprocessing algorithm that can substantially reduce the effective size of MLNs by rapidly counting how often the evidence satisfies each formula, regardless of the truth values of the query literals. This is a general preprocessing method that loses no information and can be used for any MLN inference algorithm. We evaluate our algorithm empirically in three real-world domains, greatly reducing the work needed during subsequent inference. Such reduction might even allow exact inference to be performed when sampling methods would be otherwise necessary.
A Syntax-based Framework for Merging Imprecise Probabilistic Logic Programs
Yue, Anbu (Queen's University Belfast) | Liu, Weiru (Queen's University Belfast)
In this paper, we address the problem of merging multiple imprecise probabilistic beliefs represented as Probabilistic Logic Programs (PLPs) obtained from multiple sources. Beliefs in each PLP are modeled as conditional events attached with probability bounds. The major task of syntax-based merging is to obtain the most rational probability bound for each conditional event from the original PLPs to form a new PLP. We require the minimal change principle to be followed so that each source gives up its beliefs as little as possible. Some instantiated merging operators are derived from our merging framework. Furthermore, we propose a set of postulates for merging PLPs, some of which extend the postulates for merging classical knowledge bases, whilst others are specific to the merging of probabilistic beliefs.
Efficient Computation of Jointree Bounds for Systematic MAP Search
Yuan, Changhe (Mississippi State University) | Hansen, Eric A. (Mississippi State University)
The MAP (maximum a posteriori assignment) problem in Bayesian networks is the problem of finding the most probable instantiation of a set of variables given partial evidence for the remaining variables. The state-of-the-art exact solution method is depth-first branch-and-bound search using dynamic variable ordering and a jointree upper bound proposed by Park and Darwiche [2003]. Since almost all search time is spent computing the jointree bounds, we introduce an efficient method for computing these bounds incrementally. We point out that, using a static variable ordering, it is only necessary to compute relevant upper bounds at each search step, and it is also possible to cache potentials of the jointree for efficient backtracking. Since the jointree computation typically produces bounds for joint configurations of groups of variables, our method also instantiates multiple variables at each search step, instead of a single variable, in order to reduce the number of times that upper bounds need to be computed. Experiments show that this approach leads to orders of magnitude reduction in search time.
Parameter Identification in a Class of Linear Structural Equation Models
Tian, Jin (Iowa State University)
Linear causal models known as structural equation models (SEMs) are widely used for data analysis in the social sciences, economics, and artificial intelligence, in which random variables are assumed to be continuous and normally distributed. This paper deals with one fundamental problem in the applications of SEMs -- parameter identification. The paper uses the graphical models approach and provides a procedure for solving the identification problem in a special class of SEMs.
Variable and Value Ordering for MPE Search
Siddiqi, Sajjad Ahmed (Australian National University and National ICT Australia) | Huang, Jinbo (Australian National University and National ICT Australia)
In Bayesian networks, a most probable explanation (MPE) is a most likely instantiation of all network variables given a piece of evidence. Solving (the decision version of) an MPE query is NP-hard. Recent work proposed a branch-and-bound search algorithm that finds exact solutions to MPE queries, where bounds are computed on a relaxed network obtained by a technique known as node splitting. In this work we study the impact of variable and value ordering on such a search algorithm. We study several heuristics based on the entropies of variables and on the notion of nogoods, and propose a new meta-heuristic that combines their strengths. Experiments indicate that search efficiency is significantly improved, allowing many hard problems to be solved for the first time.
Testing Edges by Truncations
Shpitser, Ilya (Harvard University) | Richardson, Thomas S. (University of Washington) | Robins, James M. (Harvard University)
We consider the problem of testing whether two variables should be adjacent (either due to a direct effect between them, or due to a hidden common cause) given an observational distribution, and a set of causal assumptions encoded as a causal diagram. In other words, given a set of edges in the diagram known to be true, we are interested in testing whether another edge ought to be in the diagram. In fully observable faithful models this problem can be easily solved with conditional independence tests. Latent variables make the problem significantly harder since they can imply certain non-adjacent variable pairs, namely those connected by so called inducing paths, are not independent conditioned on any set of variables. We characterizewhich variable pairs can be determined to be non-adjacent by a class of constraints due to dormant independence, that is conditional independence in identifiable interventional distributions. Furthermore, we show that particular operations on joint distributions, which we call truncations are sufficient for exhibiting these non-adjacencies.This suggests a causal discovery procedure taking advantage of these constraints in the latent variable case can restrict itself to truncations.
Speeding Up Exact Solutions of Interactive Dynamic Influence Diagrams Using Action Equivalence
Zeng, Yifeng (Aalborg University) | Doshi, Prashant
Interactive dynamic influence diagrams (I-DIDs) are graphical models for sequential decision making in partially observable settings shared by other agents. Algorithms for solving I-DIDs face the challenge of an exponentially growing space of candidate models ascribed to other agents, over time. Previous approach for exactly solving I-DIDs groups together models having similar solutions into behaviorally equivalent classes and updates these classes. We present a new method that, in addition to aggregating behaviorally equivalent models, further groups models that prescribe identical actions at a single time step. We show how to update these augmented classes and prove that our method is exact. The new approach enables us to bound the aggregated model space by the cardinality of other agents' actions. We evaluate its performance and provide empirical results in support.