The Scanalyzer Domain: Greenhouse Logistics as a Planning Problem

AAAI Conferences

We introduce the Scanalyzer planning domain, a domain for classical planning which models the problem of automatic greenhouse logistic management. At its mathematical core, the Scanalyzer domain is a permutation problem with striking similarities to common search benchmarks such as Rubik's Cube or TopSpin. At the same time, it is also a real application domain, and efficient algorithms for the problem are of considerable practical interest. The Scanalyzer domain was used as a benchmark for sequential planners at the last International Planning Competition. The competition results show that domain-independent automated planners can find solutions of comparable quality to those generated by specialized algorithms developed by domain experts, while being considerably more flexible.


Shopper: A System for Executing and Simulating Expressive Plans

AAAI Conferences

We present Shopper, a plan execution engine that facilitates experimental evaluation of plans and makes it easier for planning researchers to incorporate replanning. Shopper interprets the LTML plan language, which extends PDDL in two major ways: with more expressive control structures, and with support for semantic web services modeled on OWL-S. LTML's command structures include not only conventional ones such as branching, iteration, and procedure calls, but also features needed to handle HTN plans, such as precondition-filtered method choice. Unlike conventional programming languages, LTML supports interaction with the agent's belief store, so that its execution semantics line up with those assumed by planners. LTML actions extend PDDL actions in having outputs as well as effects, which means that they can support actions that sense the world; an important special case of this is semantic web services, which reveal information about a state hidden from the agent. To support experimentation as well as action in the real world, Shopper accommodates multiple, swappable implementations of its primitive action API. For example, one may interact with real web services through SOAP and WSDL, or with simulated web services through local procedure calls. We describe novel features of LTML, the interpretation strategy, swappable back-ends, and the implementation.


A PDDL+ Benchmark Problem: The Batch Chemical Plant

AAAI Conferences

The PDDL+ language was devised mainly to allow modelling of real-world systems with continuous, time-dependent dynamics. Several interesting case studies with these characteristics have also been proposed to test the expressiveness of the language and the capabilities of the support tools. However, most of these case studies have not been completely developed so far. In this paper we focus on the batch chemical plant case study, a very complex hybrid system with nonlinear dynamics that could represent a challenging benchmark problem for planning techniques and tools. We present a complete PDDL+ model for this system, and show an example application where the UPMurphi universal planner is used to generate a set of production policies for the plant.


When Policies Can Be Trusted: Analyzing a Criteria to Identify Optimal Policies in MDPs with Unknown Model Parameters

AAAI Conferences

Computing a good policy in stochastic, uncertain environments with unknown dynamics and reward model parameters is a challenging task. In a number of domains, ranging from space robotics to epilepsy management, it may be possible to have an initial training period when suboptimal performance is permitted. For such problems it is important to be able to identify when this training period is complete and the computed policy can be used with high confidence in its future performance. A simple, principled criterion for identifying when training has completed is that the error bounds on the value estimates of the current policy are sufficiently small that the optimal policy is fixed with high probability. We present an upper bound on the amount of training data required to identify the optimal policy as a function of the unknown separation gap between the optimal and the next-best policy values. We illustrate with several small problems that, by estimating this gap in an online manner, the number of training samples needed to provably reach optimality can be significantly lower than predicted offline using a Probably Approximately Correct framework that requires an input epsilon parameter.
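
The stopping criterion described in this abstract can be illustrated with a much simpler setting than the paper's. The sketch below is not the authors' bound: it assumes a finite set of candidate policies whose returns lie in [0, 1], uses Hoeffding confidence intervals with a union bound (both are assumptions made here), and declares training complete once the interval of the empirically best policy no longer overlaps any other. The function names are hypothetical.

```python
import math


def hoeffding_radius(n, delta, value_range=1.0):
    """Half-width of a two-sided Hoeffding interval after n samples.

    Assumption of this sketch: returns are bounded in an interval of
    width value_range.
    """
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def optimal_policy_identified(means, counts, delta):
    """Return (done, best_index).

    done is True when the lower bound of the empirically best policy
    exceeds the upper bound of every other policy, so the estimated gap
    is larger than the combined estimation error.
    """
    k = len(means)
    # Union bound: split the failure probability across the k policies.
    radii = [hoeffding_radius(n, delta / k) for n in counts]
    best = max(range(k), key=lambda i: means[i])
    lower_best = means[best] - radii[best]
    done = all(means[i] + radii[i] < lower_best for i in range(k) if i != best)
    return done, best


# Illustrative numbers only: the same empirical gap, different sample sizes.
few_samples = optimal_policy_identified([0.8, 0.5], [10, 10], delta=0.05)
many_samples = optimal_policy_identified([0.8, 0.5], [1000, 1000], delta=0.05)
```

With only 10 samples per policy the intervals overlap and the check fails; with 1000 samples the gap of 0.3 is resolved. The larger the true gap, the fewer samples the check needs, which is the intuition behind estimating the gap online instead of fixing an epsilon in advance.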


Iterative Learning of Weighted Rule Sets for Greedy Search

AAAI Conferences

Greedy search is commonly used in an attempt to generate solutions quickly at the expense of completeness and optimality. In this work, we consider learning sets of weighted action-selection rules for guiding greedy search with application to automated planning. We make two primary contributions over prior work on learning for greedy search. First, we introduce weighted sets of action-selection rules as a new form of control knowledge for greedy search. Prior work has shown the utility of action-selection rules for greedy search, but has treated the rules as hard constraints, resulting in brittleness. Our weighted rule sets allow multiple rules to vote, helping to improve robustness to noisy rules. Second, we give a new iterative learning algorithm for learning weighted rule sets based on RankBoost, an efficient boosting algorithm for ranking. Each iteration considers the actual performance of the current rule set and directs learning based on the observed search errors. This is in contrast to most prior approaches, which learn control knowledge independently of the search process. Our empirical results have shown significant promise for this approach in a number of domains.
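
To make the idea of voting rules concrete, here is a minimal sketch of greedy action selection with a weighted rule set. It covers only the scoring step, not the RankBoost-based learning of the weights, and the rule representation (a weight paired with a predicate over state and action), the toy state, and all names are assumptions made for illustration.

```python
def score(state, action, rules):
    """Sum the weights of all rules that fire for (state, action)."""
    return sum(weight for weight, fires in rules if fires(state, action))


def greedy_choice(state, actions, rules):
    """Pick the action with the highest total vote."""
    return max(actions, key=lambda a: score(state, a, rules))


# Hypothetical toy state and hand-written rules, for illustration only.
state = {"holding": None}
rules = [
    (1.5, lambda s, a: a == "pickup" and s["holding"] is None),
    (0.7, lambda s, a: a == "pickup"),
    (0.4, lambda s, a: a == "move"),
    (1.0, lambda s, a: a == "drop"),  # a "noisy" rule that fires wrongly here
]

choice = greedy_choice(state, ["move", "pickup", "drop"], rules)
```

In this example the noisy rule votes for "drop" with weight 1.0, but the two rules supporting "pickup" outvote it (2.2 in total). If the rules were applied as hard constraints, a single wrong rule could instead prune the correct action.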


Choosing Path Replanning Strategies for Unmanned Aircraft Systems

AAAI Conferences

Unmanned aircraft systems use a variety of techniques to plan collision-free flight paths given a map of obstacles and no-fly zones. However, maps are not perfect and obstacles may change over time or be detected during flight, which may invalidate paths that the aircraft is already following. Thus, dynamic in-flight replanning is required. Numerous strategies can be used for replanning, where the time requirements and the plan quality associated with each strategy depend on the environment around the original flight path. In this paper, we investigate the use of machine learning techniques, in particular support vector machines, to choose the best possible replanning strategy depending on the amount of time available. The system has been implemented, integrated and tested in hardware-in-the-loop simulation with a Yamaha RMAX helicopter platform.


Influence-Based Policy Abstraction for Weakly-Coupled Dec-POMDPs

AAAI Conferences

Decentralized POMDPs are powerful theoretical models for coordinating agents’ decisions in uncertain environments, but the generally-intractable complexity of optimal joint policy construction presents a significant obstacle in applying Dec-POMDPs to problems where many agents face many policy choices. Here, we argue that when most agent choices are independent of other agents’ choices, much of this complexity can be avoided: instead of coordinating full policies, agents need only coordinate policy abstractions that explicitly convey the essential interaction influences. To this end, we develop a novel framework for influence-based policy abstraction for weakly-coupled transition-dependent Dec-POMDP problems that subsumes several existing approaches. In addition to formally characterizing the space of transition-dependent influences, we provide a method for computing optimal and approximately-optimal joint policies. We present an initial empirical analysis, over problems with commonly-studied flavors of transition-dependent influences, that demonstrates the potential computational benefits of influence-based abstraction over state-of-the-art optimal policy search methods.


Simultaneously Searching with Multiple Settings: An Alternative to Parameter Tuning for Suboptimal Single-Agent Search Algorithms

AAAI Conferences

Many search algorithms have parameters that need to be tuned to get the best performance. Typically, the parameters are tuned offline, resulting in a generic setting that is supposed to be effective on all problem instances. For suboptimal single-agent search, problem-instance-specific parameter settings can result in substantially reduced search effort. We consider the use of dovetailing as a way to take advantage of this fact. Dovetailing is a procedure that performs search with multiple parameter settings simultaneously. Dovetailing is shown to improve the search speed of weighted IDA* by several orders of magnitude and to generally enhance the performance of weighted RBFS. This procedure is trivially parallelizable and is shown to be an effective form of parallelization for WA* and BULB. In particular, using WA* with parallel dovetailing yields good speedups in the sliding-tile puzzle domain, and increases the number of problems solved when used in an automated planning system.
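
Dovetailing as described above can be sketched as a round-robin over several search instances, each advanced by one node expansion per turn. The code below is an illustrative sketch, not the authors' implementation: it uses a plain weighted A* written as a generator, a toy number-line domain, and hypothetical function names.

```python
import heapq
from itertools import count


def weighted_astar_steps(start, goal, neighbors, h, w):
    """Weighted A* (f = g + w * h) as a generator.

    Yields None after each expansion and yields the path (a tuple of
    states) once the goal is popped from the open list.
    """
    tie = count()  # tie-breaker so the heap never compares states or paths
    open_list = [(w * h(start), next(tie), 0, start, (start,))]
    best_g = {start: 0}
    while open_list:
        _, _, g, node, path = heapq.heappop(open_list)
        if node == goal:
            yield path
            return
        if g > best_g.get(node, float("inf")):
            yield None  # stale entry, a cheaper path was already found
            continue
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(
                    open_list, (ng + w * h(nxt), next(tie), ng, nxt, path + (nxt,))
                )
        yield None


def dovetail(searches):
    """Advance each search by one step per round; return the first solution.

    searches maps a label (here, the weight) to a search generator.
    Returns (label, path), or None if every search is exhausted.
    """
    active = dict(searches)
    while active:
        for label in list(active):
            try:
                result = next(active[label])
            except StopIteration:
                del active[label]
                continue
            if result is not None:
                return label, result
    return None


# Toy domain (an assumption of this sketch): integers on a line, unit moves.
goal = 5
line_neighbors = lambda n: [(n - 1, 1), (n + 1, 1)]
line_h = lambda n: abs(goal - n)
weights = [1.0, 2.0, 5.0]
found = dovetail(
    {w: weighted_astar_steps(0, goal, line_neighbors, line_h, w) for w in weights}
)
```

Because each round gives every setting one expansion, the total work is at most the number of settings times the effort of the fastest setting on that instance. The rounds are independent, so each generator could equally be run in its own process, which is the parallel variant the abstract mentions.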


A New Approach to Conformant Planning Using CNF

AAAI Conferences

In this paper, we develop a heuristic, progression-based conformant planner, called CNF, which represents belief states by a special type of CNF formula, called a CNF-state. We define a transition function φ_CNF for computing the successor belief state resulting from the execution of an action in a belief state, and prove that it is sound and complete with respect to the complete semantics defined in the literature for conformant planning. We evaluate the performance of CNF against other state-of-the-art conformant planners and identify the classes of problems where CNF is comparable with them or scales up better. We also develop a technique called oneof relaxation, which helps boost the performance of CNF. We characterize the domains where this technique can be applied and validate the idea by proposing a new set of benchmarks that is very difficult for other planners yet easy for CNF.


Computing Applicability Conditions for Plans with Loops

AAAI Conferences

The utility of including loops in plans has long been recognized by the planning community. Loops in a plan help increase both its applicability and the compactness of its representation. However, progress in finding such plans has been limited, largely due to a lack of methods for reasoning about the correctness and safety properties of loops of actions. We present novel algorithms for determining the applicability of, and the progress made by, a general class of loops of actions. These methods can be used for directing the search for plans with loops towards greater applicability while guaranteeing termination, as well as in post-processing of computed plans to precisely characterize their applicability. Experimental results demonstrate the efficiency of these algorithms.