Seipp, Jendrik
NL2Plan: Robust LLM-Driven Planning from Minimal Text Descriptions
Gestrin, Elliot, Kuhlmann, Marco, Seipp, Jendrik
Today's classical planners are powerful, but modeling input tasks in formats such as PDDL is tedious and error-prone. In contrast, planning with Large Language Models (LLMs) allows for almost any input text, but offers no guarantees on plan quality or even soundness. In an attempt to merge the best of these two approaches, some work has begun to use LLMs to automate parts of the PDDL creation process. However, these methods still require various degrees of expert input. We present NL2Plan, the first domain-agnostic offline LLM-driven planning system. NL2Plan uses an LLM to incrementally extract the necessary information from a short text prompt before creating a complete PDDL description of both the domain and the problem, which is finally solved by a classical planner. We evaluate NL2Plan on four planning domains and find that it solves 10 out of 15 tasks - a clear improvement over a plain chain-of-thought reasoning LLM approach, which only solves 2 tasks. Moreover, in two out of the five failure cases, instead of returning an invalid plan, NL2Plan reports that it failed to solve the task. In addition to using NL2Plan in end-to-end mode, users can inspect and correct all of its intermediate results, such as the PDDL representation, increasing explainability and making it an assistive tool for PDDL creation.
Numeric Reward Machines
Levina, Kristina, Pappas, Nikolaos, Karapantelakis, Athanasios, Feljan, Aneta Vulgarakis, Seipp, Jendrik
Reward machines inform reinforcement learning agents about the reward structure of the environment and often drastically speed up the learning process. However, reward machines only accept Boolean features such as robot-reached-gold. Consequently, many inherently numeric tasks cannot profit from the guidance offered by reward machines. To address this gap, we aim to extend reward machines with numeric features such as distance-to-gold. For this, we present two types of reward machines: numeric-Boolean and numeric. In a numeric-Boolean reward machine, distance-to-gold is emulated by two Boolean features distance-to-gold-decreased and robot-reached-gold. In a numeric reward machine, distance-to-gold is used directly alongside the Boolean feature robot-reached-gold. We compare our new approaches to a baseline reward machine in the Craft domain, where the numeric feature is the agent-to-target distance. We use cross-product Q-learning, Q-learning with counter-factual experiences, and the options framework for learning. Our experimental results show that our new approaches significantly outperform the baseline approach. Extending reward machines with numeric features opens up new possibilities of using reward machines in inherently numeric tasks.
Consolidating LAMA with Best-First Width Search
Corrêa, Augusto B., Seipp, Jendrik
One key decision for heuristic search algorithms is how to balance exploration and exploitation. In classical planning, novelty search has come out as the most successful approach in this respect. The idea is to favor states that contain previously unseen facts when searching for a plan. This is done by maintaining a record of the tuples of facts observed in previous states. Then the novelty of a state is the size of the smallest previously unseen tuple. The most successful version of novelty search is best-first width search (BFWS), which combines novelty measures with heuristic estimates. An orthogonal approach to balance exploration-exploitation is to use several open-lists. These open-lists are ordered using different heuristic estimates, which diversify the information used in the search. The search algorithm then alternates between these open-lists, trying to exploit these different estimates. This is the approach used by LAMA, a classical planner that, a decade after its release, is still considered state-of-the-art in agile planning. In this paper, we study how to combine LAMA and BFWS. We show that simply adding the strongest open-list used in BFWS to LAMA harms performance. However, we show that combining only parts of each planner leads to a new state-of-the-art agile planner.
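The novelty measure described in this abstract can be illustrated with a minimal sketch. It assumes states are sets of fact strings and caps the tuple size at a hypothetical `max_width`. The function name and the convention of returning `max_width + 1` when nothing new is found are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def novelty(state, seen, max_width=2):
    """Return the size of the smallest fact tuple in `state` not seen before.

    `state` is an iterable of hashable facts and `seen` is a set of fact
    tuples observed in earlier states. All tuples of `state` up to size
    `max_width` are recorded in `seen` as a side effect. If every such tuple
    was already seen, the function returns `max_width + 1` (an assumed
    convention for "no novelty within the width bound").
    """
    facts = sorted(state)  # canonical order so each tuple has one representation
    result = max_width + 1
    for size in range(1, max_width + 1):
        for fact_tuple in combinations(facts, size):
            if fact_tuple not in seen:
                result = min(result, size)
                seen.add(fact_tuple)
    return result


seen_tuples = set()
first = novelty({"at-a", "holding-x"}, seen_tuples)   # new single facts
repeat = novelty({"at-a", "holding-x"}, seen_tuples)  # nothing new
```

A search that favors novel states would prefer lower return values, since a lower value means the state contains a smaller previously unseen tuple.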
Expressing and Exploiting the Common Subgoal Structure of Classical Planning Domains Using Sketches: Extended Version
Drexler, Dominik, Seipp, Jendrik, Geffner, Hector
Width-based planning methods exploit the use of conjunctive goals for decomposing problems into subproblems of low width. However, algorithms like SIW fail when the goal is not serializable. In this work, we address this limitation of SIW by using a simple but powerful language for expressing problem decompositions introduced recently by Bonet and Geffner, called policy sketches. A policy sketch R consists of a set of Boolean and numerical features and a set of sketch rules that express how the values of these features are supposed to change. Like general policies, policy sketches are domain general, but unlike policies, the changes captured by sketch rules do not need to be achieved in a single step. We show that many planning domains that cannot be solved by SIW are provably solvable in low polynomial time with the SIW_R algorithm, the version of SIW that employs user-provided policy sketches. Policy sketches are thus shown to be a powerful language for expressing domain-specific knowledge in a simple and compact way and a convenient alternative to languages such as HTNs or temporal logics. Furthermore, policy sketches make it easy to express general problem decompositions and prove key properties like their complexity and width.
Counterexample-Guided Cartesian Abstraction Refinement for Classical Planning
Seipp, Jendrik, Helmert, Malte
Counterexample-guided abstraction refinement (CEGAR) is a method for incrementally computing abstractions of transition systems. We propose a CEGAR algorithm for computing abstraction heuristics for optimal classical planning. Starting from a coarse abstraction of the planning task, we iteratively compute an optimal abstract solution, check if and why it fails for the concrete planning task and refine the abstraction so that the same failure cannot occur in future iterations. A key ingredient of our approach is a novel class of abstractions for classical planning tasks that admits efficient and very fine-grained refinement. Since a single abstraction usually cannot capture enough details of the planning task, we also introduce two methods for producing diverse sets of heuristics within this framework, one based on goal atoms, the other based on landmarks. In order to sum their heuristic estimates admissibly we introduce a new cost partitioning algorithm called saturated cost partitioning. We show that the resulting heuristics outperform other state-of-the-art abstraction heuristics in many benchmark domains.
Narrowing the Gap Between Saturated and Optimal Cost Partitioning for Classical Planning
Seipp, Jendrik (University of Basel) | Keller, Thomas (University of Basel) | Helmert, Malte (University of Basel)
In classical planning, cost partitioning is a method for admissibly combining a set of heuristic estimators by distributing operator costs among the heuristics. An optimal cost partitioning is often prohibitively expensive to compute. Saturated cost partitioning is an alternative that is much faster to compute and has been shown to offer high-quality heuristic guidance on Cartesian abstractions. However, its greedy nature makes it highly susceptible to the order in which the heuristics are considered. We show that searching in the space of orders leads to significantly better heuristic estimates than with previously considered orders. Moreover, using multiple orders leads to a heuristic that is significantly better informed than any single-order heuristic. In experiments with Cartesian abstractions, the resulting heuristic approximates the optimal cost partitioning very closely.
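The order sensitivity mentioned in this abstract can be seen in a small sketch of saturated cost partitioning over explicit abstract transition systems. The data layout (a tuple of state count, labeled transitions and goal states) and all names are assumptions made for illustration. The sketch uses the non-negative variant, in which each heuristic keeps only the cost it needs to preserve its goal distances and passes the rest on.

```python
import heapq
from math import inf


def goal_distances(num_states, transitions, goals, cost):
    """Backward Dijkstra: cheapest cost from each abstract state to a goal."""
    incoming = [[] for _ in range(num_states)]
    for src, op, dst in transitions:
        incoming[dst].append((src, op))
    dist = [inf] * num_states
    queue = []
    for goal in goals:
        dist[goal] = 0
        heapq.heappush(queue, (0, goal))
    while queue:
        d, state = heapq.heappop(queue)
        if d > dist[state]:
            continue  # stale queue entry
        for src, op in incoming[state]:
            new_dist = d + cost[op]
            if new_dist < dist[src]:
                dist[src] = new_dist
                heapq.heappush(queue, (new_dist, src))
    return dist


def saturated_cost_partitioning(abstractions, cost, order):
    """Return one goal-distance table per abstraction for the given order."""
    remaining = dict(cost)
    tables = {}
    for index in order:
        num_states, transitions, goals = abstractions[index]
        h = goal_distances(num_states, transitions, goals, remaining)
        # Saturated cost: the least cost per operator that keeps h unchanged.
        saturated = {op: 0 for op in remaining}
        for src, op, dst in transitions:
            if h[src] < inf and h[dst] < inf:
                saturated[op] = max(saturated[op], h[src] - h[dst])
        for op in remaining:
            remaining[op] -= saturated[op]
        tables[index] = h
    return tables


# Two hypothetical abstractions over operators "a" and "b" (cost 1 each).
chain = (3, [(0, "a", 1), (1, "b", 2)], [2])   # needs a, then b
either = (2, [(0, "a", 1), (0, "b", 1)], [1])  # needs a or b
costs = {"a": 1, "b": 1}

for order in ([0, 1], [1, 0]):
    tables = saturated_cost_partitioning([chain, either], costs, order)
    print(order, sum(tables[i][0] for i in order))  # estimate for state 0
```

In this toy instance, considering `chain` first gives a summed estimate of 2 for the initial state, while considering `either` first gives only 1, because `either` consumes the cost of both operators although it needs just one of them.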
Automatic Configuration of Sequential Planning Portfolios
Seipp, Jendrik (University of Basel) | Sievers, Silvan (University of Basel) | Helmert, Malte (University of Basel) | Hutter, Frank (University of Freiburg)
Sequential planning portfolios exploit the complementary strengths of different planners. Similarly, automated algorithm configuration tools can customize parameterized planning algorithms for a given type of tasks. Although some work has been done towards combining portfolios and algorithm configuration, the problem of automatically generating a sequential planning portfolio from a parameterized planner for a given type of tasks is still largely unsolved. Here, we present Cedalion, a conceptually simple approach for this problem that greedily searches for the pair of parameter configuration and runtime which, when appended to the current portfolio, maximizes portfolio improvement per additional runtime spent. We show theoretically that Cedalion yields portfolios provably within a constant factor of optimal for the training set distribution. We evaluate Cedalion empirically by applying it to construct sequential planning portfolios based on component planners from the highly parameterized Fast Downward (FD) framework. Results for a broad range of planning settings demonstrate that -- without any knowledge of planning or FD -- Cedalion constructs sequential FD portfolios that rival, and in some cases substantially outperform, manually-built FD portfolios.
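The greedy selection rule in this abstract can be sketched as follows. This is a simplified illustration, not Cedalion itself: it assumes a precomputed table of per-task solve times for a fixed set of configurations, whereas the abstract describes a search over parameter configurations, and it measures "portfolio improvement" as the number of newly solved training tasks. The names `greedy_portfolio`, `runtimes` and `budget` are hypothetical.

```python
from math import inf


def greedy_portfolio(runtimes, budget):
    """Greedily build a sequential portfolio of (config, time limit) pairs.

    `runtimes` maps a configuration name to a list of positive solve times,
    one per training task (`inf` means the task is never solved). In each
    step, the pair with the most newly solved tasks per unit of added
    runtime is appended, as long as the total stays within `budget`.
    """
    solved = set()
    portfolio = []
    used = 0
    while True:
        best = None  # (ratio, config, limit, newly solved tasks)
        for config, times in runtimes.items():
            # Only limits equal to some observed solve time can change coverage.
            for limit in sorted({t for t in times if t != inf}):
                if used + limit > budget:
                    break
                gained = {i for i, t in enumerate(times) if t <= limit} - solved
                if not gained:
                    continue
                ratio = len(gained) / limit
                if best is None or ratio > best[0]:
                    best = (ratio, config, limit, gained)
        if best is None:
            return portfolio
        _, config, limit, gained = best
        portfolio.append((config, limit))
        solved |= gained
        used += limit


# Hypothetical training data: two configurations, three tasks.
example = {"A": [1, 10, inf], "B": [inf, 3, 3]}
print(greedy_portfolio(example, budget=10))  # [('A', 1), ('B', 3)]
```

Here configuration A with a limit of 1 is chosen first because it solves one task per time unit, and B with a limit of 3 follows because it adds two tasks for three time units; running A for 10 would exceed the remaining budget.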
From Non-Negative to General Operator Cost Partitioning
Pommerening, Florian (University of Basel) | Helmert, Malte (University of Basel) | Röger, Gabriele (University of Basel) | Seipp, Jendrik (University of Basel)
Operator cost partitioning is a well-known technique to make admissible heuristics additive by distributing the operator costs among individual heuristics. Planning tasks are usually defined with non-negative operator costs and therefore it appears natural to demand the same for the distributed costs. We argue that this requirement is not necessary and demonstrate the benefit of using general cost partitioning. We show that LP heuristics for operator-counting constraints are cost-partitioned heuristics and that the state equation heuristic computes a cost partitioning over atomic projections. We also introduce a new family of potential heuristics and show their relationship to general cost partitioning.
Learning Portfolios of Automatically Tuned Planners
Seipp, Jendrik (Albert-Ludwigs-University Freiburg) | Braun, Manuel (Albert-Ludwigs-University Freiburg) | Garimort, Johannes (Albert-Ludwigs-University Freiburg) | Helmert, Malte (University of Basel)
Portfolio planners and parameter tuning are two ideas that have recently attracted significant attention in the domain-independent planning community. We combine these two ideas and present a portfolio planner that runs automatically configured planners. We let the automatic parameter tuning framework ParamILS find fast configurations of the Fast Downward planning system for a number of planning domains. Afterwards we learn a portfolio of those planner configurations. Evaluation of our portfolio planner on the IPC 2011 domains shows that it has a significantly higher IPC score than the winner of the sequential satisficing track.