Goto

Collaborating Authors

 greedy best-first search


Parallel Greedy Best-First Search with a Bound on the Number of Expansions Relative to Sequential Search

arXiv.org Artificial Intelligence

Parallelization of non-admissible search algorithms such as GBFS poses a challenge because straightforward parallelization can result in search behavior which significantly deviates from sequential search. Previous work proposed PUHF, a parallel search algorithm which is constrained to only expand states that can be expanded by some tie-breaking strategy for GBFS. We show that despite this constraint, the number of states expanded by PUHF is not bounded by a constant multiple of the number of states expanded by sequential GBFS with the worst-case tie-breaking strategy. We propose and experimentally evaluate One Bench At a Time (OBAT), a parallel greedy search which guarantees that the number of states expanded is within a constant factor of the number of states expanded by sequential GBFS with some tie-breaking policy.


Scale-Adaptive Balancing of Exploration and Exploitation in Classical Planning

arXiv.org Artificial Intelligence

Balancing exploration and exploitation has been an important problem in both game tree search and automated planning. However, while the problem has been extensively analyzed within the Multi-Armed Bandit (MAB) literature, the planning community has had limited success when attempting to apply those results. We show that a more detailed theoretical understanding of MAB literature helps improve existing planning algorithms that are based on Monte Carlo Tree Search (MCTS) / Trial Based Heuristic Tree Search (THTS). In particular, THTS uses UCB1 MAB algorithms in an ad hoc manner, as UCB1's theoretical requirement of fixed bounded support reward distributions is not satisfied within heuristic search for classical planning. The core issue lies in UCB1's lack of adaptations to the different scales of the rewards. We propose GreedyUCT-Normal, a MCTS/THTS algorithm with UCB1-Normal bandit for agile classical planning, which handles distributions with different scales by taking the reward variance into consideration, and resulted in an improved algorithmic performance (more plans found with less node expansions) that outperforms Greedy Best First Search and existing MCTS/THTS-based algorithms (GreedyUCT,GreedyUCT*).


Learning Heuristic Selection with Dynamic Algorithm Configuration

arXiv.org Artificial Intelligence

A key challenge in satisfying planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most helpful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches in terms of coverage.


Improving Greedy Best-First Search by Removing Unintended Search Bias (Extended Abstract)

AAAI Conferences

Recent enhancements to greedy best-first search (GBFS) improve performance by occasionally adopting a non-greedy node expansion policy, resulting in more exploratory behavior. However, previous exploratory mechanisms do not address exploration within the space sharing the same heuristic estimate (plateau) and the search bias in a breadth direction. In this abstract, we briefly describe two modes of exploration (diversification), which work inter-(across) and intra-(within) plateau, and also introduce IP-diversification, a method combining Minimum Spanning Tree and randomization, which addresses “breadth”-bias instead of the “depth”-bias addressed by the existing methods.


Effective Heuristics for Suboptimal Best-First Search

Journal of Artificial Intelligence Research

Suboptimal heuristic search algorithms such as weighted A* and greedy best-first search are widely used to solve problems for which guaranteed optimal solutions are too expensive to obtain. These algorithms crucially rely on a heuristic function to guide their search. However, most research on building heuristics addresses optimal solving. In this paper, we illustrate how established wisdom for constructing heuristics for optimal search can fail when considering suboptimal search. We consider the behavior of greedy best-first search in detail and we test several hypotheses for predicting when a heuristic will be effective for it. Our results suggest that a predictive characteristic is a heuristic's goal distance rank correlation (GDRC), a robust measure of whether it orders nodes according to distance to a goal. We demonstrate that GDRC can be used to automatically construct abstraction-based heuristics for greedy best-first search that are more effective than those built by methods oriented toward optimal search. These results reinforce the point that suboptimal search deserves sustained attention and specialized methods of its own.


Speedy versus Greedy Search

AAAI Conferences

When an optimal solution is not required, satisficing search methods such as greedy best-first search are often used to find solutions quickly. In work on satisficing search, there has been substantial attention devoted to how to solve problems associated with local minima or plateaus in the heuristic function. One technique that has been shown to be quite promising is using an alternative heuristic function that does not estimate cost-to-go, but rather estimates distance-to-go. There is currently little beyond intuition to explain its superiority. We begin by empirically showing that the success of the distance-to-go heuristic appears related to its having smaller local minima. We then discuss a reasonable theoretical model of heuristics and show that, under this model, the expected size of local minima is higher for a cost-to-go heuristic than a distance-to-go heuristic, offering a possible explanation as to why distance-to-go heuristics tend to outperform cost-to-go heuristics.


Building a Heuristic for Greedy Search

AAAI Conferences

Suboptimal heuristic search algorithms such as greedy best-first search allow us to find solutions when constraints of either time, memory, or both prevent the application of optimal algorithms such as A*. Guidelines for building an effective heuristic for A* are well established in the literature, but we show that if those rules are applied for greedy best-first search, performance can actually degrade. Observing what went wrong for greedy best-first search leads us to a quantitative metric appropriate for greedy heuristics, called Goal Distance Rank Correlation (GDRC). We demonstrate that GDRC can be used to build effective heuristics for greedy best-first search automatically.


A Novel Technique for Avoiding Plateaus of Greedy Best-First Search in Satisficing Planning

AAAI Conferences

Heuristic functions play an important role in drastically improving performance of satisficing planners based on greedy best-first search (GBFS). While automatic generation of heuristic functions (e.g., (Hoffmann and Nebel 2001; Helmert 2006)) enables state-of-the-art satisficing planners to solve very complicated planning problems including benchmarks in the International Planning Competitions, accurate evaluations of nodes still remain as a challenging task. Although GBFS is fundamental and powerful in planning, it has an essential drawback when heuristic functions return inaccurate estimates. Assume that a heuristic function underestimates the difficulties of unpromising nodes. Then, since GBFS must expand nodes with small heuristic values first, it spends most of time in searching only unpromising areas and delays moving to the promising part. Previous work tackles this issue by adding a diversity to search, which is an ability in simultaneously exploring different parts of the search space to bypass large errors in heuristic functions. Several algorithms combined with diversity (e.g., K-best-first search (KBFS) in (Felner, Kraus, and Korf 2003)) are empirically shown to be superior to naive best-first search algorithms. However, they still have limited diversity, since they do not immediately expand nodes mistakenly evaluated as very unpromising ones. This paper presents a new technique called diverse best-first search (DBFS) , which incorporates a diversity into search in a different way than previous search-based approaches. We show empirical results clearly showing that DBFS is effective in satisficing planning.


Learning Inadmissible Heuristics During Search

AAAI Conferences

Suboptimal search algorithms offer shorter solving times by sacrificing guaranteed solution optimality. While optimal searchalgorithms like A* and IDA* require admissible heuristics, suboptimalsearch algorithms need not constrain their guidance in this way. Previous work has explored using off-line training to transform admissible heuristics into more effective inadmissible ones. In this paper we demonstrate that this transformation can be performed on-line, during search. In addition to not requiring training instances and extensive pre-computation, an on-line approach allows the learned heuristic to be tailored to a specific problem instance. We evaluate our techniques in four different benchmark domains using both greedy best-first search and bounded suboptimal search. We find that heuristics learned on-line result in both faster search andbetter solutions while relying only on information readily available in any best-first search.