Goto

Collaborating Authors

 Learning Graphical Models


Tractability through Exchangeability: A New Perspective on Efficient Probabilistic Inference

AAAI Conferences

Exchangeability is a central notion in statistics and probability theory. The assumption that an infinite sequence of data points is exchangeable is at the core of Bayesian statistics. However, finite exchangeability as a statistical property that renders probabilistic inference tractable is less well-understood. We develop a theory of finite exchangeability and its relation to tractable probabilistic inference. The theory is complementary to that of independence and conditional independence. We show that tractable inference in probabilistic models with high treewidth and millions of variables can be explained with the notion of finite (partial) exchangeability. We also show that existing lifted inference algorithms implicitly utilize a combination of conditional independence and partial exchangeability.


Predicting the Hardness of Learning Bayesian Networks

AAAI Conferences

There are various algorithms for finding a Bayesian networkstructure (BNS) that is optimal with respect to a given scoring function. No single algorithm dominates the others in speed, and, given a problem instance, it is a priori unclear which algorithm will perform best and how fast it will solve the problem. Estimating the runtimes directly is extremely difficult as they are complicated functions of the instance. The main contribution of this paper is characterization of the empirical hardness of an instance for a given algorithm based on a novel collection of non-trivial, yet efficiently computable features. Our empirical results, based on the largest evaluation of state-of-the-art BNS learning algorithms to date, demonstrate that we can predict the runtimes to a reasonable degree of accuracy, and effectively select algorithms that perform well on a particular instance. Moreover, we also show how the results can be utilized in building a portfolio algorithm that combines several individual algorithms in an almost optimal manner.


State Aggregation in Monte Carlo Tree Search

AAAI Conferences

Monte Carlo tree search (MCTS) algorithms are a popular approach to online decision-making in Markov decision processes (MDPs). These algorithms can, however, perform poorly in MDPs with high stochastic branching factors. In this paper, we study state aggregation as a way of reducing stochastic branching in tree search. Prior work has studied formal properties of MDP state aggregation in the context of dynamic programming and reinforcement learning, but little attention has been paid to state aggregation in MCTS. Our main result is a performance loss bound for a class of value function-based state aggregation criteria in expectimax search trees. We also consider how to construct MCTS algorithms that operate in the abstract state space but require a simulator of the ground dynamics only. We find that trajectory sampling algorithms like UCT can be adapted easily, but that sparse sampling algorithms present difficulties. As a proof of concept, we experimentally confirm that state aggregation can improve the finite-sample performance of UCT.


Tightening Bounds for Bayesian Network Structure Learning

AAAI Conferences

A recent breadth-first branch and bound algorithm (BFBnB)for learning Bayesian network structures (Maloneet al. 2011) uses two bounds to prune the searchspace for better efficiency; one is a lower bound calculatedfrom pattern database heuristics, and the otheris an upper bound obtained by a hill climbing search.Whenever the lower bound of a search path exceeds theupper bound, the path is guaranteed to lead to suboptimalsolutions and is discarded immediately. This paperintroduces methods for tightening the bounds. Thelower bound is tightened by using more informed variablegroupings when creating the pattern databases, andthe upper bound is tightened using an anytime learningalgorithm. Empirical results show that these boundsimprove the efficiency of Bayesian network learning bytwo to three orders of magnitude.


Finding the k-best Equivalence Classes of Bayesian Network Structures for Model Averaging

AAAI Conferences

In this paper we develop an algorithm to find the k-best equivalence classes of Bayesian networks. Our algorithm is capable of finding much more best DAGs than the previous algorithm that directly finds the k-best DAGs (Tian, He and Ram 2010). We demonstrate our algorithm in the task of Bayesian model averaging. Empirical results show that our algorithm significantly outperforms the k-best DAG algorithm in both time and space to achieve the same quality of approximation. Our algorithm goes beyond the maximum-a-posteriori (MAP) model by listing the most likely network structures and their relative likelihood and therefore has important applications in causal structure discovery.


Recovering from Selection Bias in Causal and Statistical Inference

AAAI Conferences

Selection bias is caused by preferential exclusion of units from the samples and represents a major obstacle to valid causal and statistical inferences; it cannot be removed by randomized experiments and can rarely be detected in either experimental or observational studies. In this paper, we provide complete graphical and algorithmic conditions for recovering conditional probabilities from selection biased data. We also provide graphical conditions for recoverability when unbiased data is available over a subset of the variables. Finally, we provide a graphical condition that generalizes the backdoor criterion and serves to recover causal effects when the data is collected under preferential selection.


Lifting Relational MAP-LPs Using Cluster Signatures

AAAI Conferences

Inference in large scale graphical models is an important task in many domains, and in particular probabilistic relational models (e.g. Markov logic networks). Such models often exhibit considerable symmetry, and it is a challenge to devise algorithms that exploit this symmetry to speed up inference. Recently, the automorphism group has been proposed to formalize mathematically what "exploiting symmetry" means. However, obtaining symmetry derived from automorphism is GI-hard, and consequently only a small fraction of the symmetry is easily available for effective employment. In this paper, we improve upon efficiency in two ways. First, we introduce the Cluster Signature Graph (CSG), a platform on which greater portions of the symmetries can be revealed and exploited. CSGs classify clusters of variables by projecting relations between cluster members onto a graph, allowing for the efficient pruning of symmetrical clusters even before their generation. Second, we introduce a novel framework based on CSGs for the Sherali-Adams hierarchy of linear program (LP) relaxations, dedicated to exploiting this symmetry for the benefit of tight Maximum A Posteriori (MAP) approximations. Combined with the pruning power of CSG, the framework quickly generates compact formulations for otherwise intractable LPs, as demonstrated by several empirical results.


A Relevance-Based Compilation Method for Conformant Probabilistic Planning

AAAI Conferences

Conformant probabilistic planning (CPP) differs from conformant planning (CP) by two key elements: the initial belief state is probabilistic,and the conformant plan must achieve the goal with probability $\geq\theta$, for some $0<\theta\leq 1$. In earlier work we observed that one can reduce CPP to CP by finding a set of initial states whose probability $\geq\theta$, for whicha conformant plan exists. In previous solvers we used the underlying planner to select this set of states and to plan for them simultaneously. Here we suggest an alternative approach: start with relevance analysis to determine a promising set of initial states on which to focus. Then, call an off-the-shelf conformant planner to solve the resulting problem. This approach has a number of advantages. First, instead of depending on the heuristic function to select the set of initial states,we can introduce specific, efficient relevance reasoning techniques. Second, we can benefit from optimizations used by conformant planners that are unsound when applied to the original CPP. Finally, we are free to use any existing (or new) CP solver. Consequently, the new planner dominates previous solvers on almost all domains and scales to instances that were not solved before.


Saturated Path-Constrained MDP: Planning under Uncertainty and Deterministic Model-Checking Constraints

AAAI Conferences

In many probabilistic planning scenarios, a system’s behavior needs to not only maximize the expected utility but also obey certain restrictions. This paper presents Saturated Path-Constrained Markov Decision Processes (SPC MDPs), a new MDP type for planning under uncertainty with deterministic model-checking constraints, e.g., "state s must be visited befores s'", "the system must end up in s", or "the system must never enter s". We present a mathematical analysis of SPCMDPs, showing that although SPC MDPs generally have no optimal policies, every instance of this class has an epsilon-optimal randomized policy for any > 0. We propose a dynamic programming-based algorithm for finding such policies, and empirically demonstrate this algorithm to be orders of magnitude faster than its next-best alternative.


Structured Possibilistic Planning Using Decision Diagrams

AAAI Conferences

Qualitative Possibilistic Mixed-Observable MDPs (pi-MOMDPs), generalizing pi-MDPs and pi-POMDPs, are well-suited models to planning under uncertainty with mixed-observability when transition, observation and reward functions are not precisely known and can be qualitatively described. Functions defining the model as well as intermediate calculations are valued in a finite possibilistic scale L, which induces a finite belief state space under partial observability contrary to its probabilistic counterpart. In this paper, we propose the first study of factored pi-MOMDP models in order to solve large structured planning problems under qualitative uncertainty, or considered as qualitative approximations of probabilistic problems. Building upon the SPUDD algorithm for solving factored (probabilistic) MDPs, we conceived a symbolic algorithm named PPUDD for solving factored pi-MOMDPs. Whereas SPUDD's decision diagrams' leaves may be as large as the state space since their values are real numbers aggregated through additions and multiplications, PPUDD's ones always remain in the finite scale L via min and max operations only. Our experiments show that PPUDD's computation time is much lower than SPUDD, Symbolic-HSVI and APPL for possibilistic and probabilistic versions of the same benchmarks under either total or mixed observability, while still providing high-quality policies.