Universidad Simon Bolivar
Flexible and Scalable Partially Observable Planning with Linear Translations
Bonet, Blai (Universidad Simon Bolivar) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)
The problem of on-line planning in partially observable settings involves two problems: keeping track of beliefs about the environment and selecting actions for achieving goals. While the two problems are computationally intractable in the worst case, significant progress has been achieved in recent years through the use of suitable reductions. In particular, the state-of-the-art CLG planner is based on a translation that maps deterministic partially observable problems into fully observable non-deterministic ones. The translation, which is quadratic in the number of problem fluents and gets rid of the belief tracking problem, is adequate for most benchmarks, and it is in fact complete for problems that have width 1. The more recent K-replanner uses translations that are linear, one for keeping track of beliefs and the other for selecting actions using off-the-shelf classical planners. As a result, the K-replanner scales up better but it is not as general. In this work, we combine the benefits of the two approaches - the scope of the CLG planner and the efficiency of the Kreplanner. The new planner, called LW1, is based on a translation that is linear but complete for width-1 problems. The scope and scalability of the new planner is evaluated experimentally by considering the existing benchmarks and new problems.
Action Selection for MDPs: Anytime AO* Versus UCT
Bonet, Blai (Universidad Simon Bolivar) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)
One of the natural approaches for selecting actions in very From this perspective, an algorithm like RTDP fails on two large state spaces is by performing a limited amount of grounds: first, RTDP does not appear to make best use of lookahead. In the contexts of discounted MDPs, Kearns, short time windows in large state spaces; second, and more Mansour, and Ng have shown that near to optimal actions importantly, RTDP can use admissible heuristics but not informed can be selected by considering a sampled lookahead tree that base policies. On the other hand, algorithms like Policy is sufficiently sparse, whose size depends on the discount Iteration (Howard 1971), deliver all of these features except factor and the suboptimality bound but not on the number of one: they are exhaustive, and thus even to get started, problem states (Kearns, Mansour, and Ng 1999). The UCT they need vectors with the size of the state space. At the algorithm (Kocsis and Szepesvári 2006) is a version of this same time, while there are non-exhaustive versions of (asynchronous) form of Monte Carlo planning, where the lookahead trees Value Iteration such as RTDP, there are no similar are not grown depth-first but'best-first', following a selection'focused' versions of Policy Iteration ensuring anytime optimality.
Width and Complexity of Belief Tracking in Non-Deterministic Conformant and Contingent Planning
Bonet, Blai (Universidad Simon Bolivar) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)
It has been shown recently that the complexity of belief tracking in deterministic conformant and contingent planning is exponential in a width parameter that is often bounded and small. In this work, we introduce a new width notion that applies to non-deterministic conformant and contingent problems as well. We also develop a belief tracking algorithm for non-deterministic problems that is exponential in the problem width, analyze the width of non-deterministic benchmarks, compare the new notion to the previous one over deterministic problems, and present experimental results.
A Complete Algorithm for Generating Landmarks
Bonet, Blai (Universidad Simon Bolivar) | Castillo, Julio (Universidad Simon Bolivar)
A collection of landmarks is complete if the cost of a minimum-cost hitting set equals h + and there is a minimum-cost hitting set that is an optimal relaxed plan. We present an algorithm for generating a complete collection of landmarks and we show that this algorithm can be extended into effective polytime heuristics for optimal and satisficing planning. The new admissible heuristics are compared with current state-of-the-art heuristics for optimal planning on benchmark problems from the IPC.
Automatic Polytime Reductions of NP Problems into a Fragment of STRIPS
Porco, Aldo (Universidad Simon Bolivar) | Machado, Alejandro (Universidad Simon Bolivar) | Bonet, Blai (Universidad Simon Bolivar)
We present a software tool that is able to automatically translate an NP problem into a STRIPS problem such that the former problem has a solution iff the latter has one, a solution for the latter can be transformed into a solution for the former, and all this can be done efficiently. Moreover, the tool is built such that it only produces problems that belong to a fragment of STRIPS that is solvable in non-deterministic polynomial time, a fact that guarantees that the whole approach is not an overkill. This tool has interesting applications. For example, with the advancement of planning technology, it can be used as an off-the-shelf method to solve general NP problems with the help of planners and to automatically generate benchmark problems of known complexity in a systematic and controlled manner. Another interesting contribution is related to the area of Knowledge Engineering in which one of the goals is to devise automatic methods for using the available planning technology to solve real-life problems.
Abstraction Heuristics Extended with Counting Abstractions
Bonet, Blai (Universidad Simon Bolivar)
State-of-the-art abstraction heuristics are those constructed by the merge-and-shrink approach in which an abstraction consists of a labeled transition system, and the composition of abstractions correspond to the synchronized product of transition systems. Merge-and-shrink heuristics build a composite abstraction from atomic abstractions that are directly associated with the variables of the planning problem. In this paper, we show that the framework of labeled transition systems is more general, and propose a new type of abstraction called the counting abstraction. Counting abstractions can be transparently combined with other type of abstractions to get more informative heuristics. We show how to effectively construct the counting abstractions and presents preliminary experiments over benchmark problems.
Automatic Derivation of Finite-State Machines for Behavior Control
Bonet, Blai (Universidad Simon Bolivar) | Palacios, Hector (Universidad Simon Bolivar) | Geffner, Hector (Universidad Pompeu Fabra &)
Finite-state controllers represent an effective action selection mechanisms widely used in domains such as video-games and mobile robotics. In contrast to the policies obtained from MDPs and POMDPs, finite-state controllers have two advantages: they are often extremely compact, and they are general, applying to many problems and not just one. A limitation of finite-state controllers, on the other hand, is that they are written by hand. In this paper, we address this limitation, presenting a method for deriving controllers automatically from models. The models represent a class of contingent problems where actions are deterministic and some fluents are observable. The problem of deriving a controller is converted into a conformant problem that is solved using classical planners, taking advantage of a complete translation into classical planning introduced recently. The controllers derived are ‘general’ in the sense that they do not solve the original problem only, but many variations as well, including changes in the size of the problem or in the uncertainty of the initial situation and action effects. Several experiments illustrating the automatic derivation of controllers are presented.