 Learning Graphical Models


Distance Minimization for Reward Learning from Scored Trajectories

AAAI Conferences

Many planning methods rely on the use of an immediate reward function as a portable and succinct representation of desired behavior. Rewards are often inferred from demonstrated behavior that is assumed to be near-optimal. We examine a framework, Distance Minimization IRL (DM-IRL), for learning reward functions from scores an expert assigns to possibly suboptimal demonstrations. By changing the expert’s role from a demonstrator to a judge, DM-IRL relaxes some of the assumptions present in IRL, enabling learning from the scoring of arbitrary demonstration trajectories with unknown transition functions. DM-IRL complements existing IRL approaches by addressing different assumptions about the expert. We show that DM-IRL is robust to expert scoring error and prove that finding a policy that produces maximally informative trajectories for an expert to score is strongly NP-hard. Experimentally, we demonstrate that the reward function DM-IRL learns from an MDP with an unknown transition model can transfer to an agent with known characteristics in a novel environment, and we achieve successful learning with limited available training data.


Separators and Adjustment Sets in Markov Equivalent DAGs

AAAI Conferences

In practice the vast majority of causal effect estimations from observational data are computed using adjustment sets which avoid confounding by adjusting for appropriate covariates. Recently several graphical criteria for selecting adjustment sets have been proposed. They handle causal directed acyclic graphs (DAGs) as well as more general types of graphs that represent Markov equivalence classes of DAGs, including completed partially directed acyclic graphs (CPDAGs). Though expressed in graphical language, it is not obvious how the criteria can be used to obtain effective algorithms for finding adjustment sets. In this paper we provide a new criterion which leads to an efficient algorithmic framework to find, test and enumerate covariate adjustments for chain graphs - mixed graphs representing in a compact way a broad range of Markov equivalence classes of DAGs.
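For orientation, the standard covariate adjustment formula that such criteria are meant to license can be written as follows. This is the textbook form, not a formula taken from the paper; the symbols X (treatment), Y (outcome) and Z (a valid adjustment set, assumed discrete here) are introduced only for illustration.

```latex
P(y \mid \mathrm{do}(x)) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```

A graphical criterion determines which sets Z make this identity hold. The algorithmic question raised in the abstract is how to find, test and enumerate such sets efficiently.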


RAO*: An Algorithm for Chance-Constrained POMDP's

AAAI Conferences

Autonomous agents operating in partially observable stochastic environments often face the problem of optimizing expected performance while bounding the risk of violating safety constraints. Such problems can be modeled as chance-constrained POMDP's (CC-POMDP's). Our first contribution is a systematic derivation of execution risk in POMDP domains, which improves upon how chance constraints are handled in the constrained POMDP literature. Second, we present RAO*, a heuristic forward search algorithm producing optimal, deterministic, finite-horizon policies for CC-POMDP's. In addition to the utility heuristic, RAO* leverages an admissible execution risk heuristic to quickly detect and prune overly-risky policy branches. Third, we demonstrate the usefulness of RAO* in two challenging domains of practical interest: power supply restoration and autonomous science agents.


Learning Ensembles of Cutset Networks

AAAI Conferences

Cutset networks — OR (decision) trees that have Bayesian networks whose treewidth is bounded by one at each leaf — are a new class of tractable probabilistic models that admit fast, polynomial-time inference and learning algorithms. This is unlike other state-of-the-art tractable models such as thin junction trees, arithmetic circuits and sum-product networks in which inference is fast and efficient but learning can be notoriously slow. In this paper, we take advantage of this unique property to develop fast algorithms for learning ensembles of cutset networks. Specifically, we consider generalized additive mixtures of cutset networks and develop sequential boosting-based and parallel bagging-based approaches for learning them from data. We demonstrate, via a thorough experimental evaluation, that our new algorithms are superior to competing approaches in terms of test-set log-likelihood score and learning time.


Learning Bayesian Networks with Bounded Tree-width via Guided Search

AAAI Conferences

Bounding the tree-width of a Bayesian network can reduce the chance of overfitting, and allows exact inference to be performed efficiently. Several existing algorithms tackle the problem of learning bounded tree-width Bayesian networks by learning from k-trees as super-structures, but they do not scale to large domains and/or large tree-width. We propose a guided search algorithm to find k-trees with maximum Informative scores, which is a measure of quality for the k-tree in yielding good Bayesian networks. The algorithm achieves close to optimal performance compared to exact solutions in small domains, and can discover better networks than existing approximate methods can in large domains. It also provides an optimal elimination order of variables that guarantees small complexity for later runs of exact inference. Comparisons with well-known approaches in terms of learning and inference accuracy illustrate its capabilities.


Closed-Form Gibbs Sampling for Graphical Models with Algebraic Constraints

AAAI Conferences

Probabilistic inference in many real-world problems requires graphical models with deterministic algebraic constraints between random variables (e.g., Newtonian mechanics, Pascal’s law, Ohm’s law) that are known to be problematic for many inference methods such as Monte Carlo sampling. Fortunately, when such constraints are invertible, the model can be collapsed and the constraints eliminated through the well-known Jacobian-based change of variables. As our first contribution in this work, we show that a much broader class of algebraic constraints can be collapsed by leveraging the properties of a Dirac delta model of deterministic constraints. Unfortunately, the collapsing process can lead to highly piecewise densities that pose challenges for existing probabilistic inference tools. Thus, our second contribution to address these challenges is to present a variation of Gibbs sampling that efficiently samples from these piecewise densities. The key insight to achieve this is to introduce a class of functions that (1) is sufficiently rich to approximate arbitrary models up to arbitrary precision, (2) is closed under dimension reduction (collapsing) for models with (non)linear algebraic constraints and (3) always permits one analytical integral sufficient to automatically derive closed-form conditionals for Gibbs sampling. Experiments demonstrate the proposed sampler converges at least an order of magnitude faster than existing Monte Carlo samplers.
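The two devices named in the abstract can be stated in their standard textbook forms. The notation below (an invertible, differentiable map f with y = f(x), a general constraint function g, densities p_X, p_Y, p_Z) is assumed for illustration, and the abstract does not give the paper's exact formulation.

```latex
% Jacobian-based change of variables for an invertible constraint y = f(x)
p_Y(y) \;=\; p_X\!\bigl(f^{-1}(y)\bigr)\,
  \left|\det \frac{\partial f^{-1}(y)}{\partial y}\right|

% Dirac delta model of a deterministic constraint z = g(x), g not necessarily invertible
p_Z(z) \;=\; \int p_X(x)\,\delta\bigl(z - g(x)\bigr)\,\mathrm{d}x
```

The first identity needs f to be invertible. The second does not, which is consistent with the abstract's claim that the delta formulation covers a broader class of constraints.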


On Learning Causal Models from Relational Data

AAAI Conferences

Many applications call for learning causal models from relational data. We investigate Relational Causal Models (RCM) under relational counterparts of adjacency-faithfulness and orientation-faithfulness, yielding a simple approach to identifying a subset of relational d-separation queries needed for determining the structure of an RCM using d-separation against an unrolled DAG representation of the RCM. We provide original theoretical analysis that offers the basis of a sound and efficient algorithm for learning the structure of an RCM from relational data. We describe RCD-Light, a sound and efficient constraint-based algorithm that is guaranteed to yield a correct partially-directed RCM structure with at least as many edges oriented as in that produced by RCD, the only other existing algorithm for learning RCM. We show that unlike RCD, which requires exponential time and space, RCD-Light requires only polynomial time and space to orient the dependencies of a sparse RCM.


On Parameter Tying by Quantization

AAAI Conferences

The maximum likelihood estimator (MLE) is generally asymptotically consistent but is susceptible to over-fitting. To combat this problem, regularization methods which reduce the variance at the cost of (slightly) increasing the bias are often employed in practice. In this paper, we present an alternative variance reduction (regularization) technique that quantizes the MLE estimates as a post processing step, yielding a smoother model having several tied parameters. We provide and prove error bounds for our new technique and demonstrate experimentally that it often yields models having higher test-set log-likelihood than the ones learned using the MLE. We also propose a new importance sampling algorithm for fast approximate inference in models having several tied parameters. Our experiments show that our new inference algorithm is superior to existing approaches such as Gibbs sampling and MC-SAT on models having tied parameters, learned using our quantization-based approach.
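As a rough illustration of quantizing estimates in a post-processing step, the sketch below ties parameters by replacing each one with the mean of its cluster. It assumes a one-dimensional k-means quantizer, which the abstract does not specify; the function name `quantize_parameters` and its arguments are hypothetical. It also treats parameters as unconstrained weights and does not renormalize probability tables.

```python
def quantize_parameters(params, k, iters=50):
    """Tie parameters with 1-D k-means (an assumed quantizer, not the paper's).

    Returns (tied, assign): tied[i] is the shared value now used for params[i],
    and assign[i] is the index of the cluster it belongs to.
    """
    lo, hi = min(params), max(params)
    # Initialise centroids evenly across the observed range.
    if k == 1:
        centroids = [(lo + hi) / 2.0]
    else:
        centroids = [lo + (hi - lo) * j / (k - 1) for j in range(k)]

    assign = [0] * len(params)
    for _ in range(iters):
        # Assignment step: each parameter goes to its nearest centroid.
        assign = [min(range(k), key=lambda j: abs(p - centroids[j]))
                  for p in params]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, a in zip(params, assign) if a == j]
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated

    # Parameters in the same cluster now share one value (they are "tied").
    tied = [centroids[a] for a in assign]
    return tied, assign


# Hypothetical MLE estimates: six parameters collapse to three distinct values.
mle_estimates = [0.10, 0.12, 0.11, 0.80, 0.79, 0.45]
tied, assign = quantize_parameters(mle_estimates, k=3)
```

A smaller k ties more parameters together. That means more smoothing, which is the variance-for-bias trade-off the abstract describes.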


Structured Features in Naive Bayes Classification

AAAI Conferences

We propose the structured naive Bayes (SNB) classifier, which augments the ubiquitous naive Bayes classifier with structured features. SNB classifiers facilitate the use of complex features, such as combinatorial objects (e.g., graphs, paths and orders) in a general but systematic way. Underlying the SNB classifier is the recently proposed Probabilistic Sentential Decision Diagram (PSDD), which is a tractable representation of probability distributions over structured spaces. We illustrate the utility and generality of the SNB classifier via case studies. First, we show how we can distinguish players of simple games in terms of play style and skill level based purely on observing the games they play. Second, we show how we can detect anomalous paths taken on graphs based purely on observing the paths themselves.


A Symbolic SAT-Based Algorithm for Almost-Sure Reachability with Small Strategies in POMDPs

AAAI Conferences

The de facto model for dynamic systems with probabilistic and nondeterministic behavior are Markov decision processes (MDPs) (Howard 1960). MDPs provide the appropriate model to solve control and probabilistic planning problems (Filar and Vrieze 1997; Puterman 1994), where the nondeterminism represents the choice of the control actions for the controller (or planner), while the stochastic response of the system to control actions is represented by the probabilistic ...

The qualitative problem is of great importance as in several applications it is required that the correct behavior happens with probability 1, e.g., in the analysis of randomized embedded schedulers, the important question is whether every thread progresses with probability 1. Also in applications where it might be sufficient that the correct behavior happens with probability at least λ < 1, the correct choice of the threshold λ can be still challenging, due to simplifications and imprecisions introduced during modeling.