Lipovetzky, Nir (University of Melbourne) | Muise, Christian (Massachusetts Institute of Technology) | Geffner, Hector (Universitat Pompeu Fabra)

We consider the problem of deriving formulas that capture traps, invariants, and dead-ends in classical planning through polynomial forms of preprocessing. An invariant is a formula that is true in the initial state and in all reachable states. A trap is a conditional invariant: once a state is reached that makes the trap true, all the states that are reachable from it will satisfy the trap formula as well. Finally, dead-ends are formulas that are satisfied in states that make the goal unreachable. We introduce a preprocessing algorithm that computes traps in k-DNF form and is exponential in the parameter k, and show how the algorithm can be used to precompute invariants and dead-ends. We also report preliminary tests that illustrate the effectiveness of the preprocessing algorithm for identifying dead-end states, and compare it with the identification that follows from the use of the h1 and h2 heuristics, which cannot be preprocessed and must be computed at run time.
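
The definitions of trap and invariant given above can be illustrated with a minimal sketch over a tiny explicit transition system. The state encoding, the `successors` map, and the function names are assumptions made for illustration only; this checks the definitions by brute force and is not the k-DNF preprocessing algorithm of the paper.

```python
from typing import Callable, Dict, FrozenSet, List

State = FrozenSet[str]  # a state is the set of atoms that are true in it
Formula = Callable[[State], bool]


def is_trap(formula: Formula,
            states: List[State],
            successors: Dict[State, List[State]]) -> bool:
    """True if every successor of a state satisfying `formula` satisfies it too.

    Closure under one-step transitions implies closure under reachability.
    """
    return all(
        formula(t)
        for s in states
        if formula(s)
        for t in successors.get(s, [])
    )


def is_invariant(formula: Formula,
                 init: State,
                 states: List[State],
                 successors: Dict[State, List[State]]) -> bool:
    """A trap that already holds in the initial state holds in all reachable states."""
    return formula(init) and is_trap(formula, states, successors)


# Hypothetical two-state system: once "broken" becomes true, it stays true.
S0: State = frozenset()
S1: State = frozenset({"broken"})
STATES = [S0, S1]
SUCCESSORS = {S0: [S0, S1], S1: [S1]}


def broken(s: State) -> bool:
    return "broken" in s


def not_broken(s: State) -> bool:
    return "broken" not in s
```

In this toy system `broken` is a trap but not an invariant, since it is false in the initial state `S0`, while `not_broken` is neither, because `S0` has a successor where it fails.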

Bonet, Blai (Universidad Simon Bolivar) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)

We establish conditions under which memoryless policies and finite-state controllers that solve one partially observable non-deterministic problem (PONDP) generalize to other problems; namely, problems that have a similar structure and share the same action and observation space. This is relevant to generalized planning where plans that work for many problems are sought, and to transfer learning where knowledge gained in the solution of one problem is to be used on related problems. We use a logical setting where uncertainty is represented by sets of states and the goal is to be achieved with certainty. While this gives us crisp notions of solution policies and generalization, the account also applies to probabilistic PONDPs, i.e., Goal POMDPs.

Keyder, Emil, Geffner, Hector

Soft goals extend the classical model of planning with a simple model of preferences. The best plans are then not the ones with least cost but the ones with maximum utility, where the utility of a plan is the sum of the utilities of the soft goals achieved minus the plan cost. Finding plans with high utility appears to involve two linked problems: choosing a subset of soft goals to achieve and finding a low-cost plan to achieve them. New search algorithms and heuristics have been developed for planning with soft goals, and a new track has been introduced in the International Planning Competition (IPC) to test their performance. In this note, we show, however, that these extensions are not needed: soft goals do not increase the expressive power of the basic model of planning with action costs, as they can easily be compiled away. We apply this compilation to the problems of the net-benefit track of the most recent IPC, and show that optimal and satisficing cost-based planners do better on the compiled problems than optimal and satisficing net-benefit planners on the original problems with explicit soft goals. Furthermore, we show that penalties, or negative preferences expressing conditions to avoid, can also be compiled away using a similar idea.
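
One way to see why such a compilation is plausible is the following arithmetic identity: the utility of a plan equals the total utility of all soft goals minus the plan cost plus the utilities of the soft goals it leaves unachieved. Maximizing utility is therefore the same as minimizing a cost that charges for each soft goal forgone. The sketch below checks this on made-up numbers; the names `net_utility` and `compiled_cost`, the plans, and the utilities are hypothetical, and the charging scheme is an illustration rather than a statement of the paper's exact construction.

```python
from typing import Dict, FrozenSet, List, Tuple

Plan = Tuple[FrozenSet[str], int]  # (soft goals achieved, plan cost)


def net_utility(plan: Plan, utilities: Dict[str, int]) -> int:
    """Sum of the utilities of the achieved soft goals minus the plan cost."""
    achieved, cost = plan
    return sum(utilities[g] for g in achieved) - cost


def compiled_cost(plan: Plan, utilities: Dict[str, int]) -> int:
    """Plan cost plus a charge equal to the utility of each soft goal not achieved."""
    achieved, cost = plan
    return cost + sum(u for g, u in utilities.items() if g not in achieved)


# Hypothetical soft goals and candidate plans.
UTILITIES = {"a": 10, "b": 4}
PLANS: List[Plan] = [
    (frozenset(), 0),
    (frozenset({"a"}), 3),
    (frozenset({"a", "b"}), 9),
]

best_by_utility = max(PLANS, key=lambda p: net_utility(p, UTILITIES))
best_by_cost = min(PLANS, key=lambda p: compiled_cost(p, UTILITIES))
```

For every plan the two quantities sum to the same constant, the total utility of all soft goals, so the plan with maximum utility is also the plan with minimum compiled cost.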

Palacios, Hector, Geffner, Hector

Conformant planning is the problem of finding a sequence of actions for achieving a goal in the presence of uncertainty in the initial state or action effects. The problem has been approached as a path-finding problem in belief space where good belief representations and heuristics are critical for scaling up. In this work, a different formulation is introduced for conformant problems with deterministic actions where they are automatically converted into classical ones and solved by an off-the-shelf classical planner. The translation maps literals L and sets of assumptions t about the initial situation, into new literals KL/t that represent that L must be true if t is initially true. We lay out a general translation scheme that is sound and establish the conditions under which the translation is also complete. We show that the complexity of the complete translation is exponential in a parameter of the problem called the conformant width, which for most benchmarks is bounded. The planner based on this translation exhibits good performance in comparison with existing planners, and is the basis for T0, the best performing planner in the Conformant Track of the 2006 International Planning Competition.

Bonet, Blai (Universidad Simon Bolivar) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)

Belief tracking is a basic problem in planning with sensing. While the problem is intractable, it has recently been shown that for both deterministic and non-deterministic systems expressed in compact form, it can be done in time and space that are exponential in the problem width. The width measures the maximum number of state variables that are all relevant to a given precondition or goal. In this work, we extend this result both theoretically and practically. First, we introduce an alternative decomposition scheme and algorithm with the same time complexity but different completeness guarantees, whose space complexity is much smaller: exponential in the causal width of the problem that measures the number of state variables that are causally relevant to a given precondition, goal, or observable. Second, we introduce a fast, meaningful, and powerful approximation that trades completeness for speed, and is both time and space exponential in the problem causal width. It is then shown empirically that the algorithm combined with simple heuristics yields state-of-the-art real-time performance in domains with high widths but low causal widths such as Minesweeper, Battleship, and Wumpus.
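
For reference, the basic belief tracking problem can be sketched in its naive, flat form, where a belief is an explicit set of states. This is the baseline whose cost the factored algorithms in the abstract aim to reduce, not the decomposition scheme itself. The corridor domain, `transition`, and `observe` below are hypothetical.

```python
from typing import Callable, FrozenSet, Set

Belief = FrozenSet[int]  # a belief is the set of states deemed possible


def progress(belief: Belief,
             action: str,
             transition: Callable[[int, str], Set[int]]) -> Belief:
    """Belief after applying a non-deterministic action: union of all possible successors."""
    return frozenset(t for s in belief for t in transition(s, action))


def filter_by_observation(belief: Belief,
                          obs: bool,
                          observe: Callable[[int], bool]) -> Belief:
    """Keep only the states that are consistent with the observation received."""
    return frozenset(s for s in belief if observe(s) == obs)


# Hypothetical corridor with cells 0..3; "right" may move one cell or fail.
def transition(state: int, action: str) -> Set[int]:
    if action == "right":
        return {state, min(state + 1, 3)}
    return {state}


def observe(state: int) -> bool:
    """The agent only senses whether it is at the wall (cell 3)."""
    return state == 3
```

Starting from the belief `{2, 3}`, applying `right` and then observing the wall leaves the single state `3`. Since the number of states is exponential in the number of state variables, this flat representation does not scale.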

Patrizi, Fabio (Università di Roma "La Sapienza") | Lipovetzky, Nir (The University of Melbourne) | Geffner, Hector (Institucion Catalana de Recerca i Estudis Avançats and Universitat Pompeu Fabra)

We consider the problem of planning in environments where the state is fully observable, actions have non-deterministic effects, and plans must generate infinite state trajectories for achieving a large class of LTL goals. More formally, we focus on the control synthesis problem under the assumption that the LTL formula to be realized can be mapped into a deterministic Büchi automaton. We show that by assuming that action non-determinism is fair, namely that infinite executions of a non-deterministic action in the same state yield each possible successor state an infinite number of times, the (fair) synthesis problem can be reduced to a standard strong cyclic planning task over reachability goals. Since strong cyclic planners are built on top of efficient classical planners, the transformation reduces the non-deterministic, fully observable, temporally extended planning task into the solution of classical planning problems. A number of experiments are reported showing the potential benefits of this approach to synthesis in comparison with state-of-the-art symbolic methods.

Bonet, Blai, Geffner, Hector

We develop a qualitative model of decision making with two aims: to describe how people make simple decisions and to enable computer programs to do the same. Current approaches based on Planning or Decision Theory either ignore uncertainty and tradeoffs, or provide languages and algorithms that are too complex for this task. The proposed model provides a language based on rules, a semantics based on high probabilities and lexicographical preferences, and a transparent decision procedure where reasons for and against decisions interact. The model is no substitute for Decision Theory, yet for decisions that people find easy to explain it may provide an appealing alternative.

Srivastava, Siddharth (University of Massachusetts, Amherst) | Zilberstein, Shlomo (University of Massachusetts, Amherst) | Immerman, Neil (University of Massachusetts, Amherst) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)

We consider a new class of planning problems involving a set of non-negative real variables, and a set of non-deterministic actions that increase or decrease the values of these variables by some arbitrary amount. The formulas specifying the initial state, goal state, or action preconditions can only assert whether certain variables are equal to zero or not. Assuming that the state of the variables is fully observable, we obtain two results. First, the solution to the problem can be expressed as a policy mapping qualitative states into actions, where a qualitative state includes a Boolean variable for each original variable, indicating whether its value is zero or not. Second, whether any such policy, which may express nested loops of actions, is a solution to the problem can be determined in time that is polynomial in the size of the qualitative state space, which is much smaller than the original infinite state space. We also report experimental results using a simple generate-and-test planner to illustrate these findings.
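
The abstraction from real-valued states to qualitative states described above can be sketched as follows. The variable names, the action name `dec_x`, and the example policy are hypothetical; the sketch shows only the mapping and a policy lookup, not the polynomial test of whether a policy solves the problem.

```python
from typing import Dict, FrozenSet

QState = FrozenSet[str]  # the set of variables whose value is non-zero


def qualitative_state(values: Dict[str, float]) -> QState:
    """Abstract a state over non-negative reals to one Boolean per variable."""
    if any(v < 0 for v in values.values()):
        raise ValueError("variables must be non-negative")
    return frozenset(name for name, v in values.items() if v > 0)


# Hypothetical policy: keep decreasing x while it is non-zero.
POLICY: Dict[QState, str] = {
    frozenset({"x"}): "dec_x",
    frozenset({"x", "y"}): "dec_x",
}


def choose_action(values: Dict[str, float]) -> str:
    """Look up the action for the qualitative abstraction of a concrete state."""
    return POLICY.get(qualitative_state(values), "stop")
```

With n variables there are at most 2^n qualitative states, whereas the concrete states `{"x": 3.5, "y": 0.0}` and `{"x": 0.01, "y": 0.0}` both map to the same qualitative state and receive the same action.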

Ramírez, Miquel (Universitat Pompeu Fabra) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)

Plan recognition is the problem of inferring the goals and plans of an agent from partial observations of her behavior. Recently, it has been shown that the problem can be formulated and solved using planners, reducing plan recognition to plan generation. In this work, we extend this model-based approach to plan recognition to the POMDP setting, where actions are stochastic and states are partially observable. The task is to infer a probability distribution over the possible goals of an agent whose behavior results from a POMDP model. The POMDP model is shared between agent and observer except for the true goal of the agent that is hidden to the observer. The observations are action sequences O that may contain gaps as some or even most of the actions done by the agent may not be observed. We show that the posterior goal distribution P(G|O) can be computed from the value function V_G(b) over beliefs b generated by the POMDP planner for each possible goal G. Some extensions of the basic framework are discussed, and a number of experiments are reported.
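
A posterior over goals of the kind mentioned above follows Bayes' rule once a likelihood P(O|G) is available for each goal. The sketch below shows only that normalization step on made-up numbers; how the likelihoods are derived from the value function is not specified in the abstract and is not modeled here. The function name `goal_posterior`, the goal labels, and the numeric values are hypothetical.

```python
from typing import Dict


def goal_posterior(priors: Dict[str, float],
                   likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: P(G|O) is proportional to P(O|G) * P(G), normalized over goals."""
    joint = {g: priors[g] * likelihoods[g] for g in priors}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("observations have zero probability under every goal")
    return {g: p / total for g, p in joint.items()}


# Hypothetical numbers: two equally likely goals, observations favor G2.
PRIORS = {"G1": 0.5, "G2": 0.5}
LIKELIHOODS = {"G1": 0.2, "G2": 0.6}
POSTERIOR = goal_posterior(PRIORS, LIKELIHOODS)
```

With these inputs the observer assigns G2 three times the posterior probability of G1, since the priors are equal and the likelihoods are in a 3:1 ratio.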

Bonet, Blai (Universidad Simon Bolivar) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)

Planning with partial observability can be formulated as a non-deterministic search problem in belief space. The problem is harder than classical planning as keeping track of beliefs is harder than keeping track of states, and searching for action policies is harder than searching for action sequences. In this work, we develop a framework for partial observability that avoids these limitations and leads to a planner that scales up to larger problems. For this, the class of problems is restricted to those in which 1) the non-unary clauses representing the uncertainty about the initial situation are invariant, and 2) variables that are hidden in the initial situation do not appear in the body of conditional effects, which are all assumed to be deterministic. We show that such problems can be translated in linear time into equivalent fully observable non-deterministic planning problems, and that a slight extension of this translation renders the problem solvable by means of classical planners. The whole approach is sound and complete provided that, in addition, the state space is connected. Experiments are also reported.