Search
Computational Aspects of Reordering Plans
This article studies the problem of modifying the action ordering of a plan in order to optimise the plan according to various criteria. One of these criteria is to make a plan less constrained and the other is to minimize its parallel execution time. Three candidate definitions are proposed for the first of these criteria, constituting a sequence of increasing optimality guarantees. Two of these are based on deordering plans, which means that ordering relations may only be removed, not added, while the third one uses reordering, where arbitrary modifications to the ordering are allowed. It is shown that only the weakest one of the three criteria is tractable to achieve, the other two being NP-hard and even difficult to approximate. Similarly, optimising the parallel execution time of a plan is studied both for deordering and reordering of plans. In the general case, both of these computations are NP-hard. However, it is shown that optimal deorderings can be computed in polynomial time for a class of planning languages based on the notions of producers, consumers and threats, which includes most of the commonly used planning languages. Computing optimal reorderings can potentially lead to even faster parallel executions, but this problem remains NP-hard and difficult to approximate even under quite severe restrictions.
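As a rough illustration of the second criterion, the parallel execution time of a partially ordered plan can be read as the length of its longest chain of ordered actions. The sketch below assumes each action has a fixed duration and that unordered actions may always run concurrently. The function name `parallel_execution_time` and the input encoding are hypothetical and not taken from the article. It only evaluates a given ordering; it does not check that a deordering is still a valid plan.

```python
from collections import defaultdict, deque


def parallel_execution_time(durations, ordering):
    """Return the makespan of a partially ordered plan.

    durations: dict mapping each action to its duration.
    ordering:  iterable of (a, b) pairs meaning a must finish before b starts.
    """
    successors = defaultdict(list)
    indegree = {action: 0 for action in durations}
    for before, after in ordering:
        successors[before].append(after)
        indegree[after] += 1

    # Earliest finish time of each action, refined in topological order.
    finish = {action: durations[action] for action in durations}
    ready = deque(a for a, d in indegree.items() if d == 0)
    processed = 0
    while ready:
        action = ready.popleft()
        processed += 1
        for nxt in successors[action]:
            finish[nxt] = max(finish[nxt], finish[action] + durations[nxt])
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if processed != len(durations):
        raise ValueError("ordering contains a cycle")
    return max(finish.values(), default=0)


unit = {"a": 1, "b": 1, "c": 1, "d": 1}
total_order = [("a", "b"), ("b", "c"), ("c", "d")]
deordered = [("a", "c"), ("b", "c"), ("c", "d")]  # a and b left unordered
print(parallel_execution_time(unit, total_order))  # 4
print(parallel_execution_time(unit, deordered))    # 3
```

Dropping the ordering between `a` and `b` shortens the longest chain from four actions to three, which is the kind of gain a deordering aims for.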
Squeaky Wheel Optimization
Clements, D. P., Joslin, D. E.
We describe a general approach to optimization which we term `Squeaky Wheel' Optimization (SWO). In SWO, a greedy algorithm is used to construct a solution which is then analyzed to find the trouble spots, i.e., those elements, that, if improved, are likely to improve the objective function score. The results of the analysis are used to generate new priorities that determine the order in which the greedy algorithm constructs the next solution. This Construct/Analyze/Prioritize cycle continues until some limit is reached, or an acceptable solution is found. SWO can be viewed as operating on two search spaces: solutions and prioritizations. Successive solutions are only indirectly related, via the re-prioritization that results from analyzing the prior solution. Similarly, successive prioritizations are generated by constructing and analyzing solutions. This `coupled search' has some interesting properties, which we discuss. We report encouraging experimental results on two domains, scheduling problems that arise in fiber-optic cable manufacturing, and graph coloring problems. The fact that these domains are very different supports our claim that SWO is a general technique for optimization.
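The Construct/Analyze/Prioritize cycle can be sketched on a toy graph coloring instance. This is a minimal illustration, not the authors' implementation: the blame rule (each node is blamed in proportion to its color index) and the names `construct` and `squeaky_wheel_coloring` are assumptions made for the example.

```python
def construct(graph, order):
    """Construct: greedily give each node the lowest color unused by its neighbors."""
    coloring = {}
    for node in order:
        used = {coloring[n] for n in graph[node] if n in coloring}
        color = 0
        while color in used:
            color += 1
        coloring[node] = color
    return coloring


def num_colors(coloring):
    return len(set(coloring.values()))


def squeaky_wheel_coloring(graph, max_iterations=10):
    """Return the best coloring seen over repeated construct/analyze/prioritize cycles."""
    priority = {node: 0 for node in graph}
    order = sorted(graph)  # deterministic initial priority order
    best = None
    for _ in range(max_iterations):
        coloring = construct(graph, order)
        if best is None or num_colors(coloring) < num_colors(best):
            best = coloring
        # Analyze (assumed rule): nodes pushed to high colors are the trouble spots.
        for node, color in coloring.items():
            priority[node] += color
        # Prioritize: blamed nodes move toward the front; the sort is stable.
        order = sorted(order, key=lambda n: -priority[n])
    return best


# A path a - c - d - b. The initial order a, b, c, d needs three colors.
path = {"a": ["c"], "b": ["d"], "c": ["a", "d"], "d": ["c", "b"]}
print(num_colors(construct(path, sorted(path))))  # 3
print(num_colors(squeaky_wheel_coloring(path)))   # 2
```

On this instance the first greedy pass uses three colors, and a single re-prioritization moves the blamed nodes `d` and `c` to the front, after which the greedy pass finds a two-coloring. Other instances may need a different blame rule to make progress.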
Minimax Policies for Combinatorial Prediction Games
Audibert, Jean-Yves, Bubeck, Sebastien, Lugosi, Gabor
We address the online linear optimization problem when the actions of the forecaster are represented by binary vectors. Our goal is to understand the magnitude of the minimax regret for the worst possible set of actions. We study the problem under three different assumptions for the feedback: full information, and the partial information models of the so-called "semi-bandit", and "bandit" problems. We consider both $L_\infty$-, and $L_2$-type of restrictions for the losses assigned by the adversary. We formulate a general strategy using Bregman projections on top of a potential-based gradient descent, which generalizes the ones studied in the series of papers Gyorgy et al. (2007), Dani et al. (2008), Abernethy et al. (2008), Cesa-Bianchi and Lugosi (2009), Helmbold and Warmuth (2009), Koolen et al. (2010), Uchiya et al. (2010), Kale et al. (2010) and Audibert and Bubeck (2010). We provide simple proofs that recover most of the previous results. We propose new upper bounds for the semi-bandit game. Moreover we derive lower bounds for all three feedback assumptions. With the only exception of the bandit game, the upper and lower bounds are tight, up to a constant factor. Finally, we answer a question asked by Koolen et al. (2010) by showing that the exponentially weighted average forecaster is suboptimal against $L_{\infty}$ adversaries.
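For reference, the exponentially weighted average forecaster mentioned at the end of the abstract can be sketched in the full-information setting as follows. This is a plain textbook version over an explicitly enumerated action set, not the Bregman-projection strategy of the paper. The function name `exponential_weights` and the learning rate `eta` are assumptions made for the example.

```python
import math


def exponential_weights(actions, loss_vectors, eta):
    """Run the exponentially weighted average forecaster with full information.

    actions:      list of binary tuples (the forecaster's action set).
    loss_vectors: one loss vector per round, chosen by the adversary.
    Returns (expected cumulative loss of the forecaster,
             cumulative loss of each action).
    """
    cumulative = [0.0] * len(actions)
    expected_total = 0.0
    for loss in loss_vectors:
        # Subtract the minimum before exponentiating for numerical stability.
        low = min(cumulative)
        weights = [math.exp(-eta * (c - low)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Linear loss of each action: inner product with the loss vector.
        action_losses = [sum(a_i * l_i for a_i, l_i in zip(a, loss)) for a in actions]
        expected_total += sum(p * l for p, l in zip(probs, action_losses))
        cumulative = [c + l for c, l in zip(cumulative, action_losses)]
    return expected_total, cumulative


actions = [(1, 0), (0, 1)]
losses = [(1, 0), (0, 1)] * 10  # adversary alternates which coordinate is penalized
expected, cumulative = exponential_weights(actions, losses, eta=0.5)
print(expected - min(cumulative))  # regret against the best fixed action
```

Enumerating the action set is only practical for small examples; combinatorial action sets are exponentially large in general, which is part of what motivates the strategies studied in the paper.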
Contingent Planning as AND/OR Forward Search with Disjunctive Representation
To, Son Thanh (New Mexico State University) | Son, Tran Cao (New Mexico State University) | Pontelli, Enrico (New Mexico State University)
This paper introduces a highly competitive contingent planner that uses the novel idea of encoding belief states as disjunctive normal form formulae (To et al. 2009) to search for solutions in the belief state space. In (To et al. 2009), a complete transition function for computing successor belief states in the presence of incomplete information was defined. This work extends the function to handle non-deterministic and sensing actions in the AND/OR forward search paradigm for contingent planning solutions. The function allows one, under reasonable assumptions, to compute successor belief states efficiently, i.e., in polynomial time. The paper also presents a novel variant of an AND/OR search algorithm, called PrAO (Pruning AND/OR search), which allows the planner to significantly prune the search space; furthermore, by the time a solution is found, the remaining search graph is also the solution tree for the contingent planning problem. The strength of these techniques is confirmed by the empirical results obtained from a large set of benchmarks available in the literature.
Fast Subgoaling for Pathfinding via Real-Time Search
Hernandez, Carlos (Universidad Católica de la Santísima Concepción) | Baier, Jorge A. (Pontificia Universidad Católica de Chile)
Real-time heuristic search is a standard approach to pathfinding when agents are required to make decisions in a bounded, very short period of time. An assumption usually made in the development and evaluation of real-time algorithms is that the environment is unknown. Nevertheless, in many interesting applications such as pathfinding for autonomous characters in video games, the environment is known in advance. Recent real-time search algorithms such as D LRTA* and kNN LRTA* exploit knowledge about the environment while pathfinding under real-time constraints. Key to those algorithms is the computation of subgoals in a preprocessing step. Subgoals are subsequently used in the online planning phase to obtain high-quality solutions. Preprocessing in those algorithms, however, requires significant computation. In this paper we propose a novel preprocessing algorithm that generates subgoals using a series of backward search episodes carried out from potential goals. The result of a single backward search episode is a tree of subgoals that we then use while planning online. We show the advantages of our approach over state-of-the-art algorithms by carrying out experiments on standard real-time search benchmarks.
Dynamic State-Space Partitioning in External-Memory Graph Search
Zhou, Rong (Palo Alto Research Center) | Hansen, Eric A. (Mississippi State University)
The scalability of optimal sequential planning can be improved by using external-memory graph search. State-of-the-art external-memory graph search algorithms rely on a state-space projection function, or hash function, that partitions the stored nodes of the state-space search graph into groups of nodes that are stored as separate files on disk. Search performance depends on three properties of the partition: whether the number of unique nodes in a file always fits in RAM, the number of files into which the nodes of the state-space graph are partitioned, and how well the partition captures local structure in the graph. Previous work relies on a static partition of the state space, but it can be difficult for a static partition to simultaneously satisfy all of these criteria. We introduce a method for dynamic partitioning and show that it leads to improved search performance in solving STRIPS planning problems.
Planning and Acting in Incomplete Domains
Weber, Christopher (Utah State University) | Bryce, Daniel (Utah State University)
Engineering complete planning domain descriptions is often very costly because of human error or lack of domain knowledge. Learning complete domain descriptions is also very challenging because many features are irrelevant to achieving the goals and data may be scarce. We present a planner and agent that respectively plan and act in incomplete domains by i) synthesizing plans to avoid execution failure due to ignorance of the domain model, and ii) passively learning about the domain model during execution to improve later re-planning attempts. Our planner DeFault is the first to reason about a domain’s incompleteness to avoid potential plan failure. DeFault computes failure explanations for each action and state in the plan and counts the number of interpretations of the incomplete domain where failure will occur. We show that DeFault performs best by counting prime implicants (failure diagnoses) rather than propositional models. Our agent Goalie learns about the preconditions and effects of incompletely-specified actions while monitoring its state and, in conjunction with DeFault plan failure explanations, can diagnose past and future action failures. We show that by reasoning about incompleteness (as opposed to ignoring it) Goalie fails and re-plans less and executes fewer actions.
Learning Inadmissible Heuristics During Search
Thayer, Jordan Tyler (University of New Hampshire) | Dionne, Austin (University of New Hampshire) | Ruml, Wheeler (University of New Hampshire)
Suboptimal search algorithms offer shorter solving times by sacrificing guaranteed solution optimality. While optimal search algorithms like A* and IDA* require admissible heuristics, suboptimal search algorithms need not constrain their guidance in this way. Previous work has explored using off-line training to transform admissible heuristics into more effective inadmissible ones. In this paper we demonstrate that this transformation can be performed on-line, during search. In addition to not requiring training instances and extensive pre-computation, an on-line approach allows the learned heuristic to be tailored to a specific problem instance. We evaluate our techniques in four different benchmark domains using both greedy best-first search and bounded suboptimal search. We find that heuristics learned on-line result in both faster search and better solutions while relying only on information readily available in any best-first search.
Exploiting the Computational Power of the Graphics Card: Optimal State Space Planning on the GPU
Sulewski, Damian (TZI, Universität Bremen) | Edelkamp, Stefan (TZI, Universität Bremen) | Kissmann, Peter (TZI, Universität Bremen)
In this paper optimal state space planning is parallelized by exploiting the processing power of a graphics card. The two exploration steps, namely selecting the actions to be applied and generating the successors, are performed on a graphics processing unit. Duplicate detection, however, is delayed to be executed on the central processing unit. Multiple cores are employed to bypass main memory latency. To increase processing speed for exact duplicate detection, the hash tables are lock-free. Moreover, a bucket-based representation enhances the concurrent distribution of frontier states. The planner supports cost-first exploration and is able to deal with a considerable fraction of current PDDL, including numerical state variables, complex objective functions, and goal preferences. It can maximize the net-benefit. Experimental findings show visible performance gains especially for larger benchmark problems.
Potential Search: A Bounded-Cost Search Algorithm
Stern, Roni Tzvi (Ben Gurion University of the Negev) | Puzis, Rami (Ben Gurion University of the Negev) | Felner, Ariel (Ben Gurion University of the Negev)
In this paper we address the following search task: find a goal with cost smaller than or equal to a given fixed constant. This task is relevant in scenarios where a fixed budget is available to execute a plan and we would like to find such a plan with minimum search effort. We introduce an algorithm called Potential search (PTS) which is specifically designed to solve this problem. PTS is a best-first search that expands nodes according to the probability that they will be part of a plan whose cost is less than or equal to the given budget. We show that it is possible to implement PTS even without explicitly calculating these probabilities, when a heuristic function and knowledge about the error of this heuristic function are given. In addition, we also show that PTS can be modified to an anytime search algorithm. Experimental results show that PTS outperforms other relevant algorithms in most cases, and is more robust.
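The bounded-cost task can be sketched as a best-first search that stops at the first goal whose cost fits the budget. The abstract does not state the ordering PTS uses, so the ranking below, `h / (budget - g)`, is an assumption standing in for "most likely to lead to a plan within budget". Pruning on `g + h > budget` additionally assumes an admissible heuristic. The name `bounded_cost_search` and the callback signatures are hypothetical.

```python
import heapq
from itertools import count


def bounded_cost_search(start, is_goal, successors, h, budget):
    """Return (cost, path) for some goal with cost <= budget, or None.

    successors(state) yields (next_state, edge_cost) pairs.
    h(state) is assumed to be a non-negative admissible estimate.
    """
    def rank(g, h_value):
        # Assumed proxy: a small estimate relative to the remaining budget is promising.
        return 0.0 if h_value == 0 else h_value / (budget - g)

    if h(start) > budget:
        return None
    tie = count()  # tie-breaker so the heap never compares states
    best_g = {start: 0}
    frontier = [(rank(0, h(start)), next(tie), 0, start, (start,))]
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if g > best_g.get(state, float("inf")):
            continue  # stale entry superseded by a cheaper path
        if is_goal(state):
            return g, list(path)
        for nxt, cost in successors(state):
            new_g = g + cost
            h_value = h(nxt)
            if new_g + h_value > budget:
                continue  # cannot reach a goal within budget through nxt
            if new_g >= best_g.get(nxt, float("inf")):
                continue
            best_g[nxt] = new_g
            entry = (rank(new_g, h_value), next(tie), new_g, nxt, path + (nxt,))
            heapq.heappush(frontier, entry)
    return None


edges = {"S": [("A", 1), ("B", 1)], "A": [("G", 1)], "B": [("G", 5)], "G": []}
result = bounded_cost_search("S", lambda s: s == "G", lambda s: edges[s],
                             lambda s: 0, budget=3)
print(result)
```

Unlike A*, the search makes no attempt to return a cheapest plan; any goal reached within the budget ends the search.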