Boutilier, Craig
Structured Arc Reversal and Simulation of Dynamic Probabilistic Networks
Cheuk, Adrian Y. W., Boutilier, Craig
We present an algorithm for arc reversal in Bayesian networks with tree-structured conditional probability tables (CPTs), and consider some of its advantages, especially for the simulation of dynamic probabilistic networks (DPNs). In particular, the method allows one to produce CPTs for nodes involved in the reversal that exploit regularities in the conditional distributions. We argue that this approach alleviates some of the overhead associated with arc reversal, plays an important role in evidence integration, and can be used to restrict sampling of variables in DPNs. We also provide an algorithm that detects the dynamic irrelevance of state variables in forward simulation. This algorithm exploits the structured CPTs in a reversed network to determine, in a time-independent fashion, the conditions under which a variable does or does not need to be sampled.
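To make the idea concrete, here is a minimal sketch (my own illustration, not the paper's algorithm) of a tree-structured CPT in which one context renders a parent irrelevant, so a forward simulator can skip sampling it; the variable names, tree encoding, and probabilities are all invented:

    import random

    # Hypothetical tree for P(X | A, B): when A is true, X's distribution is
    # fixed regardless of B, so B is contextually irrelevant.
    cpt_x = ("A", {
        True:  {"T": 0.9, "F": 0.1},
        False: ("B", {True:  {"T": 0.5, "F": 0.5},
                      False: {"T": 0.1, "F": 0.9}}),
    })

    def sample_from_tree(tree, assignment):
        """Walk the CPT tree, consulting only the parents actually tested."""
        while isinstance(tree, tuple):
            var, branches = tree
            tree = branches[assignment[var]]
        values, probs = zip(*tree.items())
        return random.choices(values, weights=probs)[0]

    def relevant_parents(tree, context):
        """Parents whose values can still matter given a partial context."""
        if not isinstance(tree, tuple):
            return set()
        var, branches = tree
        if var in context:
            return {var} | relevant_parents(branches[context[var]], context)
        return {var} | set().union(*(relevant_parents(b, context)
                                     for b in branches.values()))

    print(sample_from_tree(cpt_x, {"A": True}))   # B is never consulted
    print(relevant_parents(cpt_x, {"A": True}))   # {'A'}: B need not be sampled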
Correlated Action Effects in Decision Theoretic Regression
Boutilier, Craig
Much recent research in decision theoretic planning has adopted Markov decision processes (MDPs) as the model of choice, and has attempted to make their solution more tractable by exploiting problem structure. One particular algorithm, structured policy construction, achieves this by means of a decision theoretic analog of goal regression, using action descriptions based on Bayesian networks with tree-structured conditional probability tables. The algorithm as presented cannot deal with actions that have correlated effects. We describe a new decision theoretic regression operator that corrects this weakness. While conceptually straightforward, this extension requires a somewhat more complicated technical approach.
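As a toy illustration of why correlated effects matter (a simplification far coarser than the paper's operator, with invented dynamics and values), compare regression through a joint effect distribution against what independent marginals would give:

    GAMMA = 0.9

    def regress(value, effect, reward, state):
        """Q(s, a) = R(s) + gamma * sum over s' of P(s' | s, a) * V(s')."""
        return reward(state) + GAMMA * sum(
            prob * value(nxt) for nxt, prob in effect(state).items())

    # Hypothetical action: flips both state variables together 80% of the
    # time, so its effects on X and Y are perfectly correlated.
    def correlated_effect(state):
        x, y = state
        return {(not x, not y): 0.8, (x, y): 0.2}

    value  = lambda s: 10.0 if s == (True, True) else 0.0
    reward = lambda s: 1.0

    print(regress(value, correlated_effect, reward, (False, False)))
    # 1 + 0.9 * 0.8 * 10 = 8.2; a model with independent marginals
    # P(X' != x) = P(Y' != y) = 0.8 would give (True, True) probability
    # only 0.64, and hence 1 + 0.9 * 0.64 * 10 = 6.76.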
Hierarchical Solution of Markov Decision Processes using Macro-actions
Hauskrecht, Milos, Meuleau, Nicolas, Kaelbling, Leslie Pack, Dean, Thomas L., Boutilier, Craig
We investigate the use of temporally abstract actions, or macro-actions, in the solution of Markov decision processes. Unlike current models that combine both primitive actions and macro-actions and leave the state space unchanged, we propose a hierarchical model (using an abstract MDP) that works with macro-actions only, and that significantly reduces the size of the state space. This is achieved by treating macro-actions as local policies that act in certain regions of state space, and by restricting states in the abstract MDP to those at the boundaries of regions. The abstract MDP approximates the original and can be solved more efficiently. We discuss several ways in which macro-actions can be generated to ensure good solution quality. Finally, we consider ways in which macro-actions can be reused to solve multiple, related MDPs, and we show that this can justify the computational overhead of macro-action generation.
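The following sketch, under my own simplifying assumptions, shows value iteration on such an abstract MDP: its states are region boundary states, each macro-action is summarized by an expected reward and a distribution over the boundary states where it exits its region (discounting folded into the probabilities), and all numbers are invented:

    # macros[s]: list of (expected reward, {exit state: discounted prob}) at s.
    macros = {
        "b0": [(2.0, {"b1": 0.9, "b0": 0.1}), (0.5, {"b2": 1.0})],
        "b1": [(1.0, {"b2": 0.8, "b1": 0.2})],
        "b2": [(0.0, {"b2": 1.0})],   # absorbing goal region
    }

    def abstract_value_iteration(macros, sweeps=100):
        """Standard value iteration, but over boundary states and macros."""
        v = {s: 0.0 for s in macros}
        for _ in range(sweeps):
            v = {s: max(r + sum(p * v[t] for t, p in dist.items())
                        for r, dist in actions)
                 for s, actions in macros.items()}
        return v

    print(abstract_value_iteration(macros))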
Continuous Value Function Approximation for Sequential Bidding Policies
Boutilier, Craig, Goldszmidt, Moises, Sabata, Bikash
Market-based mechanisms such as auctions are being studied as an appropriate means for resource allocation in distributed and multiagent decision problems. When agents value resources in combination rather than in isolation, they must often deliberate about appropriate bidding strategies for a sequence of auctions offering resources of interest. We briefly describe a discrete dynamic programming model for constructing appropriate bidding policies for resources exhibiting both complementarities and substitutability. We then introduce a continuous approximation of this model, assuming that money (or the numeraire good) is infinitely divisible. Though this has the potential to reduce the computational cost of computing policies, value functions in the transformed problem do not have a convenient closed-form representation. We develop a grid-based approximation for such value functions, representing them using piecewise linear functions. We show that these methods can offer significant computational savings with relatively small cost in solution quality.
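A bare-bones sketch of the grid-based scheme (with an invented win-probability curve, good value, and next-stage value function; not the paper's model) might perform a single backup over a piecewise linear value function of remaining money like this:

    import numpy as np

    grid   = np.linspace(0.0, 10.0, 101)   # money levels
    v_next = 0.5 * grid                     # hypothetical next-stage value

    def p_win(bid):                         # hypothetical win-probability curve
        return np.minimum(1.0, bid / 5.0)

    def backup(v_next, good_value=4.0):
        """V(m) = max over bids b <= m of
           p(b) * (good_value + V'(m - b)) + (1 - p(b)) * V'(m)."""
        v = np.zeros_like(grid)
        for i, m in enumerate(grid):
            bids = grid[grid <= m]
            win  = good_value + np.interp(m - bids, grid, v_next)
            lose = np.interp(m, grid, v_next)
            v[i] = np.max(p_win(bids) * win + (1.0 - p_win(bids)) * lose)
        return v

    print(backup(v_next)[::25])             # value at m = 0, 2.5, 5, 7.5, 10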
SPUDD: Stochastic Planning using Decision Diagrams
Hoey, Jesse, St-Aubin, Robert, Hu, Alan, Boutilier, Craig
Markov decision processes (MDPs) are becoming increasingly popular as models of decision theoretic planning. While traditional dynamic programming methods perform well for problems with small state spaces, structured methods are needed for large problems. We propose and examine a value iteration algorithm for MDPs that uses algebraic decision diagrams (ADDs) to represent value functions and policies. An MDP is represented using Bayesian networks and ADDs, and dynamic programming is applied directly to these ADDs. We demonstrate our method on large MDPs (up to 63 million states) and show that significant gains can be had when compared to tree-structured representations (with up to a thirty-fold reduction in the number of nodes required to represent optimal value functions).
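The compression at the heart of this approach comes from hash-consing decision-diagram nodes so that identical subfunctions are stored once and redundant tests collapse. A minimal sketch of that idea (far short of a real ADD package, and not the SPUDD implementation) is:

    _unique = {}

    def node(var, lo, hi):
        """Hash-consed ADD node; collapses redundant tests."""
        if lo == hi:                            # both branches identical
            return lo
        key = (var, lo, hi)
        if key not in _unique:
            _unique[key] = (var, lo, hi)
        return _unique[key]

    def apply_op(f, g, op, order):
        """Pointwise combination of two ADDs over a fixed variable order."""
        if not isinstance(f, tuple) and not isinstance(g, tuple):
            return op(f, g)                     # both are terminals
        fi = order.index(f[0]) if isinstance(f, tuple) else len(order)
        gi = order.index(g[0]) if isinstance(g, tuple) else len(order)
        var = order[min(fi, gi)]
        flo, fhi = (f[1], f[2]) if fi == min(fi, gi) else (f, f)
        glo, ghi = (g[1], g[2]) if gi == min(fi, gi) else (g, g)
        return node(var, apply_op(flo, glo, op, order),
                         apply_op(fhi, ghi, op, order))

    order = ["x", "y"]
    v1 = node("x", 0.0, node("y", 1.0, 2.0))
    v2 = node("x", 1.0, 1.0)                    # collapses to the terminal 1.0
    print(apply_op(v1, v2, lambda a, b: a + b, order))
    # ('x', 1.0, ('y', 2.0, 3.0)) -- shared subgraphs are stored once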
Reasoning With Conditional Ceteris Paribus Preference Statements
Boutilier, Craig, Brafman, Ronen I., Hoos, Holger H., Poole, David L.
In many domains it is desirable to assess the preferences of users in a qualitative rather than quantitative way. Such representations of qualitative preference orderings form an important component of automated decision tools. We propose a graphical representation of preferences that reflects conditional dependence and independence of preference statements under a ceteris paribus (all else being equal) interpretation. Such a representation is often compact and arguably natural. We describe several search algorithms for dominance testing based on this representation; these algorithms are quite effective, especially in specific network topologies, such as chain- and tree-structured networks, as well as polytrees.
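For intuition, dominance testing in such a network can be viewed as search for a chain of improving flips, each changing one variable to a value its conditional preference table prefers given the current parent values. A small sketch with an invented two-variable network follows (the paper's algorithms are considerably more refined):

    from collections import deque

    # cpts[X] maps a parent assignment to X's preferred value (booleans here).
    parents = {"A": (), "B": ("A",)}
    cpts = {"A": {(): True},                        # always prefer A = true
            "B": {(True,): True, (False,): False}}  # prefer B to match A

    def improving_flips(outcome):
        """Outcomes one sanctioned improving flip away from this one."""
        for var in parents:
            ctx = tuple(outcome[p] for p in parents[var])
            preferred = cpts[var][ctx]
            if outcome[var] != preferred:
                yield {**outcome, var: preferred}

    def dominates(o2, o1):
        """Breadth-first search from o1 toward o2 through improving flips."""
        frontier, seen = deque([o1]), set()
        while frontier:
            o = frontier.popleft()
            if o == o2:
                return True
            key = tuple(sorted(o.items()))
            if key in seen:
                continue
            seen.add(key)
            frontier.extend(improving_flips(o))
        return False

    print(dominates({"A": True, "B": True}, {"A": False, "B": False}))  # True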
Vector-space Analysis of Belief-state Approximation for POMDPs
Poupart, Pascal, Boutilier, Craig
We propose a new approach to value-directed belief state approximation for POMDPs. The value-directed model allows one to choose approximation methods for belief state monitoring that have a small impact on decision quality. Using a vector space analysis of the problem, we devise two new search procedures for selecting an approximation scheme that have much better computational properties than existing methods. Though these provide looser error bounds, we show empirically that they have a similar impact on decision quality in practice, and run up to two orders of magnitude more quickly.
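The vector-space view rests on expected value being linear in the belief, so any candidate belief approximation can be assessed through the value vectors. A toy numeric sketch (vectors and beliefs invented, and the bound shown is a crude generic one rather than those derived in the paper):

    import numpy as np

    alphas = np.array([[10.0, 0.0, 2.0],    # one value vector per policy choice
                       [ 1.0, 8.0, 3.0],
                       [ 4.0, 4.0, 4.0]])

    b        = np.array([0.5, 0.3, 0.2])    # exact belief
    b_approx = np.array([0.6, 0.2, 0.2])    # hypothetical cheap approximation

    exact_vals  = alphas @ b
    approx_vals = alphas @ b_approx

    # Acting on the approximation picks argmax of approx_vals; the decision-
    # quality loss is measured against the exact belief.
    loss  = exact_vals.max() - exact_vals[approx_vals.argmax()]
    bound = 2 * np.max(np.abs(alphas @ (b - b_approx)))
    print(loss, bound)                      # loss never exceeds this bound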
Value-Directed Sampling Methods for POMDPs
Poupart, Pascal, Ortiz, Luis E., Boutilier, Craig
We consider the problem of approximate belief-state monitoring using particle filtering for the purposes of implementing a policy for a partially-observable Markov decision process (POMDP). While particle filtering has become a widely used tool in AI for monitoring dynamical systems, rather scant attention has been paid to its use in the context of decision making. Assuming the existence of a value function, we derive error bounds on decision quality associated with filtering using importance sampling. We also describe an adaptive procedure that can be used to dynamically determine the number of samples required to meet specific error bounds. Empirical evidence is offered supporting this technique as a profitable means of directing sampling effort where it is needed to distinguish policies.
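A rough sketch of the flavor of such a procedure (the dynamics, observation model, value function, and stopping rule below are all invented): keep drawing weighted samples until the standard error of the estimated value of the belief falls under a target, so sampling effort is spent only where decisions could change:

    import random
    import statistics

    STATES = [0, 1]
    TRANS  = {0: [0.7, 0.3], 1: [0.4, 0.6]}  # hypothetical dynamics
    OBS_P  = {0: 0.9, 1: 0.2}                # hypothetical P(obs = 1 | state)
    VALUE  = {0: 0.0, 1: 10.0}               # assumed value function

    def step_particle(state, obs):
        nxt = random.choices(STATES, weights=TRANS[state])[0]
        w   = OBS_P[nxt] if obs == 1 else 1.0 - OBS_P[nxt]
        return nxt, w

    def adaptive_filter(particles, obs, target_se=0.25, batch=200, max_n=20000):
        samples = []
        while len(samples) < max_n:
            for _ in range(batch):
                samples.append(step_particle(random.choice(particles), obs))
            total_w = sum(w for _, w in samples)
            # self-normalized estimate of the belief's value and a rough SE
            vals = [VALUE[s] * w * len(samples) / total_w for s, w in samples]
            se = statistics.stdev(vals) / len(samples) ** 0.5
            if se < target_se:
                break
        return samples, se

    samples, se = adaptive_filter([0] * 50 + [1] * 50, obs=1)
    print(len(samples), round(se, 3))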
UCP-Networks: A Directed Graphical Representation of Conditional Utilities
Boutilier, Craig, Bacchus, Fahiem, Brafman, Ronen I.
We propose a new directed graphical representation of utility functions, called UCP-networks, that combines aspects of two existing graphical models: generalized additive models and CP-networks. The network decomposes a utility function into a number of additive factors, with the directionality of the arcs reflecting conditional dependence of preference statements - in the underlying (qualitative) preference ordering - under a ceteris paribus (all else being equal) interpretation. This representation is arguably natural in many settings. Furthermore, the strong CP-semantics ensures that computation of optimization and dominance queries is very efficient. We also demonstrate the value of this representation in decision making. Finally, we describe an interactive elicitation procedure that takes advantage of the linear nature of the constraints on "tradeoff weights" imposed by a UCP-network. This procedure allows the network to be refined until the regret of the decision with minimax regret (with respect to the incompletely specified utility function) falls below a specified threshold (e.g., the cost of further questioning).
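For illustration, the optimization query on such a structure can proceed by a single topological sweep, choosing each variable's best value given its already-chosen parents; the network, factors, and numbers below are mine, not the paper's:

    # Utility decomposes into additive factors u_X(X, parents(X)).
    parents = {"A": (), "B": ("A",)}
    order   = ["A", "B"]                     # a topological order of the network

    # factors[X][(parent values..., x)] = local utility contribution
    factors = {
        "A": {(True,): 5.0, (False,): 2.0},
        "B": {(True, True): 3.0, (True, False): 1.0,
              (False, True): 0.0, (False, False): 4.0},
    }

    def optimize(order, parents, factors, domain=(True, False)):
        """Sweep variables top-down, picking each best value given parents."""
        outcome = {}
        for var in order:
            ctx = tuple(outcome[p] for p in parents[var])
            outcome[var] = max(domain, key=lambda x: factors[var][ctx + (x,)])
        return outcome

    print(optimize(order, parents, factors))  # {'A': True, 'B': True}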
Active Collaborative Filtering
Boutilier, Craig, Zemel, Richard S., Marlin, Benjamin
Collaborative filtering (CF) allows the preferences of multiple users to be pooled to make recommendations regarding unseen products. We consider in this paper the problem of online and interactive CF: given the current ratings associated with a user, what queries (new ratings) would most improve the quality of the recommendations made? We cast this in terms of expected value of information (EVOI), but the online computational cost of computing optimal queries is prohibitive. We show how offline prototyping and computation of bounds on EVOI can be used to dramatically reduce the required online computation. The framework we develop is general, but we focus on derivations and empirical study in the specific case of the multiple-cause vector quantization model.
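In miniature, the EVOI of a single query compares the expected value of recommending after seeing the answer with the value of the best recommendation now; all numbers below are made up:

    # P(answer) and the recommendation values conditioned on each answer.
    p_answer = {"likes": 0.4, "dislikes": 0.6}
    values = {                       # values[item][answer] (hypothetical)
        "item1": {"likes": 8.0, "dislikes": 2.0},
        "item2": {"likes": 4.0, "dislikes": 5.0},
    }

    def expected_value(item):
        return sum(p * values[item][a] for a, p in p_answer.items())

    def evoi():
        """Expected gain from asking before committing to a recommendation."""
        best_now = max(expected_value(i) for i in values)
        after = sum(p * max(values[i][a] for i in values)
                    for a, p in p_answer.items())
        return after - best_now

    print(evoi())
    # after = 0.4*8 + 0.6*5 = 6.2, best_now = max(4.4, 4.6) = 4.6 -> about 1.6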