Goto

Collaborating Authors

 Country


A Complete Algorithm for Generating Landmarks

AAAI Conferences

A collection of landmarks is complete if the cost of a minimum-cost hitting set equals h + and there is a minimum-cost hitting set that is an optimal relaxed plan. We present an algorithm for generating a complete collection of landmarks and we show that this algorithm can be extended into effective polytime heuristics for optimal and satisficing planning. The new admissible heuristics are compared with current state-of-the-art heuristics for optimal planning on benchmark problems from the IPC.


Abstraction Heuristics Extended with Counting Abstractions

AAAI Conferences

State-of-the-art abstraction heuristics are those constructed by the merge-and-shrink approach in which an abstraction consists of a labeled transition system, and the composition of abstractions correspond to the synchronized product of transition systems. Merge-and-shrink heuristics build a composite abstraction from atomic abstractions that are directly associated with the variables of the planning problem. In this paper, we show that the framework of labeled transition systems is more general, and propose a new type of abstraction called the counting abstraction. Counting abstractions can be transparently combined with other type of abstractions to get more informative heuristics. We show how to effectively construct the counting abstractions and presents preliminary experiments over benchmark problems.


Cross-Domain Action-Model Acquisition for Planning via Web Search

AAAI Conferences

Applying learning techniques to acquire action models is an area of intense research interest. Most previous works in this area have assumed that there is a significant amount of training data available in a planning domain of interest, which we call target domain, where action models are to be learned. However, it is often difficult to acquire sufficient training data to ensure that the learned action models are of high quality. In this paper, we develop a novel approach to learning action models with limited training data in the target domain by transferring knowledge from related auxiliary or source domains. We assume that the action models in the source domains have already been created before, and seek to transfer as much of the the available information from the source domains as possible to help our learning task. We first exploit a Web searching method to bridge the target and source domains, such that transferrable knowledge from source domains is identified. We then encode the transferred knowledge together with the available data from the target domain as constraints in a maximum satisfiability problem, and solve these constraints using a weighted MAX-SAT solver. We finally transform the solutions thus obtained into high-quality target-domain action models. We empirically show that our transfer-learning based framework is effective in several domains, including the International Planning Competition (IPC) domains and some synthetic domains.


Dynamic State-Space Partitioning in External-Memory Graph Search

AAAI Conferences

The scalability of optimal sequential planning can be improved by using external-memory graph search. State-of-the-art external-memory graph search algorithms rely on a state-space projection function, or hash function, that partitions the stored nodes of the state-space search graph into groups of nodes that are stored as separate files on disk. Search performance depends on properties of the partition; whether the number of unique nodes in a file always fits in RAM, the number of files into which the nodes of the state-space graph are partitioned, and how well the partition captures local structure in the graph. Previous work relies on a static partition of the state space, but it can be difficult for a static partition to simultaneously satisfy all of these criteria. We introduce a method for dynamic partitioning and show that it leads to improved search performance in solving STRIPS planning problems.


Markov Decision Processes with Ordinal Rewards: Reference Point-Based Preferences

AAAI Conferences

In a standard Markov decision process (MDP), rewards are assumed to be precisely known and of quantitative nature. This can be a too strong hypothesis in some situations. When rewards can really be modeled numerically, specifying the reward function is often difficult as it is a cognitively-demanding and/or time-consuming task. Besides, rewards can sometimes be of qualitative nature as when they represent qualitative risk levels for instance. In those cases, it is problematic to use directly standard MDPs and we propose instead to resort to MDPs with ordinal rewards. Only a total order over rewards is assumed to be known. In this setting, we explain how an alternative way to define expressive and interpretable preferences using reference points can be exploited.


Planning and Acting in Incomplete Domains

AAAI Conferences

Engineering complete planning domain descriptions is often very costly because of human error or lack of domain knowl- edge. Learning complete domain descriptions is also very challenging because many features are irrelevant to achieving the goals and data may be scarce. We present a planner and agent that respectively plan and act in incomplete domains by i) synthesizing plans to avoid execution failure due to ignorance of the domain model, and ii) passively learning about the domain model during execution to improve later re-planning attempts. Our planner DeFault is the first to reason about a domain’s incompleteness to avoid potential plan failure. DeFault computes failure explanations for each action and state in the plan and counts the number of interpretations of the incomplete domain where failure will occur. We show that DeFault performs best by counting prime implicants (failure diagnoses) rather than propositional models. Our agent Goalie learns about the preconditions and effects of incompletely-specified actions while monitoring its state and, in conjunction with DeFault plan failure explanations, can diagnose past and future action failures. We show that by reasoning about incompleteness (as opposed to ignoring it) Goalie fails and re-plans less and executes fewer actions.


Planning to Perceive: Exploiting Mobility for Robust Object Detection

AAAI Conferences

Consider the task of a mobile robot autonomously navigating through an environment while detecting and mapping objects of interest using a noisy object detector. The robot must reach its destination in a timely manner, but is rewarded for correctly detecting recognizable objects to be added to the map, and penalized for false alarms. However, detector performance typically varies with vantage point, so the robot benefits from planning trajectories which maximize the efficacy of the recognition system. This work describes an online, any-time planning framework enabling the active exploration of possible detections provided by an off-the-shelf object detector. We present a probabilistic approach where vantage points are identified which provide a more informative view of a potential object. The agent then weighs the benefit of increasing its confidence against the cost of taking a detour to reach each identified vantage point. The system is demonstrated to significantly improve detection and trajectory length in both simulated and real robot experiments.


Learning Inadmissible Heuristics During Search

AAAI Conferences

Suboptimal search algorithms offer shorter solving times by sacrificing guaranteed solution optimality. While optimal searchalgorithms like A* and IDA* require admissible heuristics, suboptimalsearch algorithms need not constrain their guidance in this way. Previous work has explored using off-line training to transform admissible heuristics into more effective inadmissible ones. In this paper we demonstrate that this transformation can be performed on-line, during search. In addition to not requiring training instances and extensive pre-computation, an on-line approach allows the learned heuristic to be tailored to a specific problem instance. We evaluate our techniques in four different benchmark domains using both greedy best-first search and bounded suboptimal search. We find that heuristics learned on-line result in both faster search andbetter solutions while relying only on information readily available in any best-first search.


Exploiting the Computational Power of the Graphics Card: Optimal State Space Planning on the GPU

AAAI Conferences

In this paper optimal state space planning is parallelized by exploiting the processing power of a graphics card. The two exploration steps, namely selecting the actions to be applied and generating the successors, are performed on a graphics processing unit. Duplicate detection, however, is delayed to be executed on the central processing unit. Multiple cores are employed to bypass main memory latency. To increase processing speed for exact duplicate detection, the hash tables are lock-free. Moreover, a bucket-based representation enhances the concurrent distribution of frontier states. The planner supports cost-first exploration and is able to deal with a considerable fraction of current PDDL, including numerical state variables, complex objective functions, and goal preferences. It can maximize the net-benefit. Experimental findings show visible performance gains especially for larger benchmark problems.


Potential Search: A Bounded-Cost Search Algorithm

AAAI Conferences

In this paper we address the following search task: find a goal with cost smaller than or equal to a given fixed constant. This task is relevant in scenarios where a fixed budget is available to execute a plan and we would like to find such a plan with minimum search effort. We introduce an algorithm called Potential search (PTS) which is specifically designed to solve this problem. PTS is a best-first search that expands nodes according to the probability that they will be part of a plan whose cost is less than or equal to the given budget. We show that it is possible to implement PTS even without explicitly calculating these probabilities, when a heuristic function and knowledge about the error of this heuristic function are given. In addition, we also show that PTS can be modified to an anytime search algorithm. Experimental results show that PTS outperforms other relevant algorithms in most cases, and is more robust.