Improving Determinization in Hindsight for On-line Probabilistic Planning

AAAI Conferences

Recently, "determinization in hindsight" has enjoyed surprising success in on-line probabilistic planning. This technique evaluates the actions available in the current state by using non-probabilistic planning in deterministic approximations of the original domain. Although the approach has proven itself effective in many challenging domains, it is computationally very expensive. In this paper, we present three significant improvements to help mitigate this expense. First, we use a method for detecting potentially useful actions, allowing us to avoid estimating the values of unnecessary ones. Second, we exploit determinism in the domain by reusing relevant plans rather than computing new ones. Third, we improve action evaluation by increasing the chance that at least one deterministic plan reaches a goal. Taken together, these improvements allow determinization in hindsight to scale significantly better on large or mostly-deterministic problems.


The More, the Merrier: Combining Heuristic Estimators for Satisficing Planning

AAAI Conferences

We empirically examine several ways of exploiting the information of multiple heuristics in a satisficing best-first search algorithm, comparing their performance in terms of coverage, plan quality, speed, and search guidance. Our results indicate that using multiple heuristics for satisficing search is indeed useful. Among the combination methods we consider, the best results are obtained by the alternation method of the "Fast Diagonally Downward" planner.
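The alternation idea mentioned in this abstract can be illustrated with a minimal sketch: keep one open list per heuristic and take turns expanding the best state from each. This is only an illustration under stated assumptions, not the planner's actual implementation; the function name `alternation_gbfs` and its parameters (`start`, `is_goal`, `successors`, `heuristics`) are hypothetical.

```python
import heapq
from itertools import count


def alternation_gbfs(start, is_goal, successors, heuristics):
    """Greedy best-first search with one open list per heuristic.

    Hypothetical sketch: the open lists are served round-robin, and every
    newly generated state is inserted into all of them.
    """
    tie = count()  # tie-breaker so states themselves are never compared
    open_lists = [[] for _ in heuristics]
    parent = {start: None}
    closed = set()

    def push(state):
        # Evaluate the state with every heuristic and queue it everywhere.
        for queue, h in zip(open_lists, heuristics):
            heapq.heappush(queue, (h(state), next(tie), state))

    push(start)
    turn = 0
    while any(open_lists):
        queue = open_lists[turn % len(open_lists)]
        turn += 1
        if not queue:
            continue
        _, _, state = heapq.heappop(queue)
        if state in closed:
            continue  # already expanded via another open list
        closed.add(state)
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                push(nxt)
    return None  # no goal state is reachable
```

A toy usage example: states are the integers 0 to 10, each step adds 1 or 2, and the two heuristics are the remaining distance and half of it.

```python
path = alternation_gbfs(
    0,
    lambda s: s == 10,
    lambda s: [n for n in (s + 1, s + 2) if n <= 10],
    [lambda s: 10 - s, lambda s: (10 - s) // 2],
)
```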


Constraint Propagation in Propositional Planning

AAAI Conferences

Planning as Satisfiability is a most successful approach to optimal propositional planning. It draws its strength from the efficiency of state-of-the-art propositional satisfiability solvers, combined with the utilization of constraints that are inferred from the problem planning graph. One of the recent improvements of the framework is the addition of long-distance mutual exclusion (londex) constraints that relate facts and actions which refer to different time steps. In this paper we compare different encodings of planning as satisfiability with respect to the constraint propagation they achieve in a modern SAT solver. This analysis explains some of the differences observed in the performance of different encodings, and leads to some interesting conclusions. For instance, the Blackbox encoding achieves more propagation than that of Satplan06, and therefore is a stronger formulation of planning as satisfiability. Moreover, our investigation suggests a new, more compact and stronger model for the problem. We prove that in this new formulation many of the londex constraints are redundant in the sense that they do not add anything to the constraint propagation achieved by the model. Experimental results suggest that the theoretical results obtained are practically relevant.


Handling Goal Utility Dependencies in a Satisfiability Framework

AAAI Conferences

Goal utility dependencies arise when the utility of achieving a goal depends on the other goals that are achieved with it. This complicates the planning procedure because achieving a new goal can potentially alter the utilities of all the other goals currently achieved. In this paper, we present an encoding procedure that enables general-purpose Max-SAT solvers to be used to solve planning problems with goal utility dependencies. We compare this approach to one using integer programming via an empirical evaluation using benchmark problems from past international planning competitions. Our results indicate that this approach is competitive and sometimes more successful than an integer programming one -- solving two to three times more subproblems in some domains, while being outperformed by only a significantly smaller margin in others.


The Joy of Forgetting: Faster Anytime Search via Restarting

AAAI Conferences

Anytime search algorithms solve optimisation problems by quickly finding a (usually suboptimal) first solution and then finding improved solutions when given additional time. To deliver an initial solution quickly, they are typically greedy with respect to the heuristic cost-to-go estimate h. In this paper, we show that this low-h bias can cause poor performance if the greedy search makes early mistakes. Building on this observation, we present a new anytime approach that restarts the search from the initial state every time a new solution is found. We demonstrate the utility of our method via experiments in PDDL planning as well as other domains, and show that it is particularly useful for problems where the heuristic has systematic errors.


Incrementally Solving STNs by Enforcing Partial Path Consistency

AAAI Conferences

Efficient management and propagation of temporal constraints is important for temporal planning as well as for scheduling. During plan development, new events and temporal constraints are added and existing constraints may be tightened; the consistency of the whole temporal network is frequently checked; and results of constraint propagation guide further search. Recent work shows that enforcing partial path consistency provides an efficient means of propagating temporal information for the popular Simple Temporal Network (STN). We show that partial path consistency can be enforced incrementally, thus exploiting the similarities of the constraint network between subsequent edge tightenings. We prove that the worst-case time complexity of our algorithm can be bounded both by the number of edges in the chordal graph (which is better than the previous bound of the number of vertices squared), and by the degree of the chordal graph times the number of vertices incident on updated edges. We show that for many sparse graphs, the latter bound is better than that of the previously best-known approaches. In addition, our algorithm requires space only linear in the number of edges of the chordal graph, whereas earlier work uses space quadratic in the number of vertices. Finally, empirical results show that when incrementally solving sparse STNs, stemming from problems such as Hierarchical Task Network planning, our approach outperforms extant algorithms.
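For readers unfamiliar with STNs, the consistency check referred to here can be sketched with a plain all-pairs shortest-path computation on the distance graph. Note that this is the textbook Floyd-Warshall (full path consistency) approach, which is cubic in the number of events; it is not the incremental partial path consistency algorithm of the paper. The function name `stn_minimal_network` and the `(i, j, lo, hi)` constraint format are assumptions made for this illustration.

```python
from itertools import product


def stn_minimal_network(num_events, constraints):
    """Check a Simple Temporal Network for consistency.

    constraints: iterable of (i, j, lo, hi), meaning lo <= t_j - t_i <= hi.
    Returns the all-pairs shortest-path matrix of the distance graph,
    or None if the network is inconsistent (a negative cycle exists).
    """
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(num_events)]
         for i in range(num_events)]
    for i, j, lo, hi in constraints:
        d[i][j] = min(d[i][j], hi)    # edge i -> j encodes t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)   # edge j -> i encodes t_i - t_j <= -lo
    # Floyd-Warshall; product() varies k slowest, as the algorithm requires.
    for k, i, j in product(range(num_events), repeat=3):
        if d[i][k] + d[k][j] < d[i][j]:
            d[i][j] = d[i][k] + d[k][j]
    if any(d[i][i] < 0 for i in range(num_events)):
        return None
    return d
```

For example, with constraints `(0, 1, 10, 20)` and `(1, 2, 30, 40)` the network is consistent and event 2 must occur between 40 and 60 time units after event 0; adding `(0, 2, 0, 35)` makes it inconsistent.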


Partially Informed Depth-First Search for the Job Shop Problem

AAAI Conferences

We propose a partially informed depth-first search algorithm to cope with the Job Shop Scheduling Problem with makespan minimization. The algorithm is built from P. Brucker's well-known branch and bound algorithm. We improved the heuristic estimation of Brucker's algorithm by means of constraint propagation rules and so devised a more informed heuristic, which is proved to be monotonic. We conducted an experimental study across medium and large instances. The results show that the proposed algorithm reaches optimal solutions for medium instances in less time than branch and bound, and that for large instances it reaches much better lower and upper bounds when both algorithms are given the same amount of time.


Classical Planning in MDP Heuristics: with a Little Help from Generalization

AAAI Conferences

Computing a good policy in stochastic uncertain environments with unknown dynamics and reward model parameters is a challenging task. In a number of domains, ranging from space robotics to epilepsy management, it may be possible to have an initial training period when suboptimal performance is permitted. For such problems it is important to be able to identify when this training period is complete, and the computed policy can be used with high confidence in its future performance. A simple principled criterion for identifying when training has completed is when the error bounds on the value estimates of the current policy are sufficiently small that the optimal policy is fixed, with high probability. We present an upper bound on the amount of training data required to identify the optimal policy as a function of the unknown separation gap between the optimal and the next-best policy values. We illustrate with several small problems that by estimating this gap in an online manner, the number of training samples to provably reach optimality can be significantly lower than predicted offline using a Probably Approximately Correct framework that requires an input epsilon parameter.


Self-Taught Decision Theoretic Planning with First Order Decision Diagrams

AAAI Conferences

We present a new paradigm for planning by learning, where the planner is given a model of the world and a small set of states of interest, but no indication of optimal actions in these states. The additional information can help focus the planner on regions of the state space that are of interest and lead to improved performance. We demonstrate this idea by introducing novel model-checking reduction operations for First Order Decision Diagrams (FODD), a representation that has been used to implement decision-theoretic planning with Relational Markov Decision Processes (RMDP). Intuitively, these reductions modify the construction of the value function by removing any complex specifications that are irrelevant to the set of training examples, thereby focusing on the region of interest. We show that such training examples can be constructed on the fly from a description of the planning problem, so we can bootstrap to get a self-taught planning system. Additionally, we provide a new heuristic to embed universal and conjunctive goals within the framework of RMDP planners, expanding the scope and applicability of such systems. We show that these ideas lead to significant improvements in performance in terms of both speed and coverage of the planner, yielding state-of-the-art planning performance on problems from the International Planning Competition.


Towards Finding Robust Execution Strategies for RCPSP/max with Durational Uncertainty

AAAI Conferences

Resource Constrained Project Scheduling Problems with minimum and maximum time lags (RCPSP/max) have been studied extensively in the literature. However, the more realistic RCPSP/max problems, in which durations of activities are not known with certainty, have received scant interest and hence are the main focus of the paper. To address the significant computational complexity involved in tackling RCPSP/max with durational uncertainty, we employ a local search mechanism to generate robust schedules. In this regard, we make two key contributions: (a) introducing and studying the key properties of a new decision rule to specify start times of activities with respect to dynamic realizations of the duration uncertainty; and (b) deriving the fitness function that is used to guide the local search towards robust schedules. Experimental results show that the performance of local search is improved with the new fitness evaluation over the best known existing approach.