Planning & Scheduling
Resource-Constrained Planning: A Monte Carlo Random Walk Approach
Nakhost, Hootan (University of Alberta) | Hoffmann, Joerg (Saarland University) | Mueller, Martin (University of Alberta)
The need to economize limited resources, such as fuel or money, is aubiquitous feature of planning problems. If the resources cannot bereplenished, the planner must make do with the initial supply. It isthen of paramount importance how constrained the problem is,i.e., whether and to which extent the initial resource supply exceedsthe minimum need. While there is a large body of literature on numericplanning and planning with resources, such resource constrainednesshas only been scantily investigated. We herein start to address thisin more detail. We generalize the previous notion of resourceconstrainedness, characterized through a numeric problem feature Cโฅ1 , to the case of multiple resources. We implement an extendedbenchmark suite controlling C . We conduct a large-scale study of thecurrent state of the art as a function of C , highlighting whichtechniques contribute to success. We introduce two new techniques ontop of a recent Monte Carlo Random Walk method, resulting in a plannerthat, in these benchmarks, outperforms previous planners whenresources are scarce ( C close to 1 ). We investigate the parametersinfluencing the performance of that planner, and we show that one ofthe two new techniques works well also on the regular IPC benchmarks.
Improved Non-Deterministic Planning by Exploiting State Relevance
Muise, Christian James (University of Toronto) | McIlraith, Sheila A. (University of Toronto) | Beck, Christopher (University of Toronto)
We address the problem of computing a policy for fully observable non-deterministic (FOND) planning problems. By focusing on the relevant aspects of the state of the world, we introduce a series of improvements to the previous state of the art and extend the applicability of our planner, PRP, to work in an online setting. The use of state relevance allows our policy to be exponentially more succinct in representing a solution to a FOND problem for some domains. Through the introduction of new techniques for avoiding deadends and determining sufficient validity conditions, PRP has the potential to compute a policy up to several orders of magnitude faster than previous approaches. We also find dramatic improvements over the state of the art in online replanning when we treat suitable probabilistic domains as FOND domains.
Reverse Iterative Deepening for Finite-Horizon MDPs with Large Branching Factors
Kolobov, Andrey (University of Washington, Seattle) | Dai, Peng (Google Inc.) | Mausam, Mausam (University of Washington, Seattle) | Weld, Daniel S. (University of Washington, Seattle)
In contrast to previous competitions, where the problems were goal-based, the 2011 International Probabilistic Planning Competition (IPPC-2011) emphasized finite-horizon reward maximization problems with large branching factors. These MDPs modeled more realistic planning scenarios and presented challenges to the previous state-of-the-art planners (e.g., those from IPPC-2008), which were primarily based on domain determinization โ a technique more suited to goal-oriented MDPs with small branching factors. Moreover, large branching factors render the existing implementations of RTDP- and LAO-style algorithms inefficient as well. In this paper we present GLUTTON, our planner at IPPC-2011 that performed well on these challenging MDPs. The main algorithm used by GLUTTON is LR2TDP, an LRTDP-based optimal algorithm for finite-horizon problems centered around the novel idea of reverse iterative deepening. We detail LR2TDP itself as well as a series of optimizations included in GLUTTON that help LR2TDP achieve competitive performance on difficult problems with large branching factors -- subsampling the transition function, separating out natural dynamics, caching transition function samples, and others. Experiments show that GLUTTON and PROST, the IPPC-2011 winner, have complementary strengths, with GLUTTON demonstrating superior performance on problems with few high-reward terminal states.
Integrating Vehicle Routing and Motion Planning
Kiesel, Scott (University of New Hampshire) | Burns, Ethan (University of New Hampshire) | Wilt, Christopher (University of New Hampshire) | Ruml, Wheeler (University of New Hampshire)
There has been much interest recently in problems that com-bine high-level task planning with low-level motion planning.In this paper, we present a problem of this kind that arises inmulti-vehicle mission planning. It tightly integrates task al-location and scheduling, who will do what when, with pathplanning, how each task will actually be performed. It ex-tends classical vehicle routing in that the cost of executing aset of high-level tasks can vary significantly in time and costaccording to the low-level paths selected. It extends classi-cal motion planning in that each path must minimize costwhile also respecting temporal constraints, including thoseimposed by the agentโs other tasks and the tasks assigned toother agents. Furthermore, the problem is a subtask withinan interactive system and therefore must operate within se-vere time constraints. We present an approach to the problembased on a combination of tabu search, linear programming,and heuristic search. We evaluate our planner on represen-tative problem instances and find that its performance meetsthe demanding requirements of our application. These resultsdemonstrate how integrating multiple diverse techniques cansuccessfully solve challenging real-world planning problemsthat are beyond the reach of any single method.
Semi-Relaxed Plan Heuristics
Keyder, Emil Ragip (INRIA) | Hoffmann, Joerg (Saarland University) | Haslum, Patrik (The Australian National University and NICTA)
Heuristics based on the delete relaxation are at the forefront of modern domain-independent planning techniques. Here we introduce a principled and flexible technique for augmenting delete-relaxed tasks with a limited amount of delete information, by introducing special fluents that explicitly represent conjunctions of fluents in the original planning task. Differently from previous work in this direction, conditional effects are used to limit the growth of the task to be linear, rather than exponential, in the number of conjunctions that are introduced, making its use for obtaining heuristic functions feasible. We discuss how to obtain an informative set of conjunctions to be represented explicitly, and analyze and extend existing methods for relaxed planning in the presence of conditional effects. The resulting heuristics are empirically evaluated, and shown to be sometimes much more informative than standard delete-relaxation heuristics.
PROST: Probabilistic Planning Based on UCT
Keller, Thomas (University of Freiburg) | Eyerich, Patrick (University of Freiburg)
We present PROST, a probabilistic planning system that is based on the UCT algorithm by Kocsis and Szepesvari (2006), which has been applied successfully to many areas of planning and acting under uncertainty. The objective of this paper is to show the application of UCT to domain- independent probabilistic planning, an area it had not been applied to before. We furthermore present several enhance- ments to the algorithm, including a method that is able to drastically reduce the branching factor by identifying super- fluous actions. We show how search depth limitation leads to a more thoroughly investigated search space in parts that are influential on the quality of a policy, and present a sound and polynomially computable detection of reward locks, states that correspond to, e.g., dead ends or goals. We describe a general Q-value initialization for unvisited nodes in the search tree that circumvents the initial random walks inher- ent to UCT, and leads to a faster convergence on average. We demonstrate the significant influence of the enhancements by providing a comparison on the IPPC benchmark domains.
CP and MIP Methods for Ship Scheduling with Time-Varying Draft
Kelareva, Elena (Australian National University and NICTA) | Brand, Sebastian (University of Melbourne and NICTA) | Kilby, Philip (Australian National University and NICTA) | Thiebaux, Sylvie (Australian National University and NICTA) | Wallace, Mark (Monash University and NICTA)
Existing ship scheduling approaches either ignore constraints on ship draft (distance between the waterline and the keel), or model these in very simple ways, such as a constant draft limit that does not change with time. However, in most ports the draft restriction changes over time due to variation in environmental conditions. More accurate consideration of draft constraints would allow more cargo to be scheduled for transport on the same set of ships. We present constraint programming (CP) and mixed integer programming (MIP) models for the problem of scheduling ships at a port with time-varying draft constraints so as to optimise cargo throughput at the port. We also investigate the effect of several variations to the CP model, including a model containing sequence variables, and a model with ordered inputs. Our model allows us to solve realistic instances of the problem to optimality in a very short time, and produces better schedules than both scheduling with constant draft, and manual scheduling approaches used in practice at ports.
How to Relax a Bisimulation?
Katz, Michael (Saarland University) | Hoffmann, Joerg (Saarland University) | Helmert, Malte (University of Basel)
Merge-and-shrink abstraction (M&S) is an approach for constructing admissible heuristic functions for cost-optimal planning. It enables the targeted design of abstractions, by allowing to choose individual pairs of (abstract) states to aggregate into one. A key question is how to actually make these choices, so as to obtain an informed heuristic at reasonable computational cost. Recent work has addressed this via the well-known notion of bisimulation. When aggregating only bisimilar states -- essentially, states whose behavior is identical under every planning operator -- M&S yields a perfect heuristic. However, bisimulations are typically exponentially large. Thus we must relax the bisimulation criterion, so that it applies to more state pairs, and yields smaller abstractions. We herein devise a fine-grained method for doing so. We restrict the bisimulation criterion to consider only a subset K of the planning operators. We show that, if K is chosen appropriately, then M&S still yields a perfect heuristic, while abstraction size may decrease exponentially. Designing practical approximations for K, we obtain M&S heuristics that are competitive with the state of the art.
Risk-Variant Policy Switching to Exceed Reward Thresholds
Kane, Breelyn Melissa (Carnegie Mellon University) | Simmons, Reid (Carnegie Mellon University)
This paper presents a decision-theoretic planning approach for probabilistic environments where the agent's goal is to win, which we model as maximizing the probability of being above a given reward threshold. In competitive domains, second is as good as last, and it is often desirable to take risks if one is in danger of losing, even if the risk does not pay off very often. Our algorithm maximizes the probability of being above a particular reward threshold by dynamically switching between a suite of policies, each of which encodes a different level of risk. This method does not explicitly encode time or reward into the state space, and decides when to switch between policies during each execution step. We compare a risk-neutral policy to switching among different risk-sensitive policies, and show that our approach improves the agent's probability of winning.
Incremental Lower Bounds for Additive Cost Planning Problems
Haslum, Patrik (Australian National University and NICTA)
We present a novel method for computing increasing lower bounds on the cost of solving planning problems, based on repeatedly solving and strengthening the delete relaxation of the problem. Strengthening is done by compiling select conjunctions into new atoms, similar to the P m * construction. Because it does not rely on search in the state space, this method does not suffer some of the weaknesses of admissible search algorithms and therefore is able to prove higher lower bounds for many problems that are too hard for optimal planners to solve, thus narrowing the gap between lower bound and cost of the best known plan, providing better assurances of plan quality.