best-first search

Learning Heuristic Selection with Dynamic Algorithm Configuration

Artificial Intelligence

A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most helpful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches in terms of coverage.
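Choosing a heuristic at every expansion step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a greedy best-first search that keeps one open list per heuristic, and a hypothetical `policy(step, open_sizes)` callback that stands in for a learned policy. All function and parameter names are invented for the example.

```python
import heapq
from itertools import count


def gbfs_with_heuristic_selection(start, is_goal, successors, heuristics, policy):
    """Greedy best-first search with one open list per heuristic.

    policy(step, open_sizes) returns the index of the heuristic whose
    open list supplies the next state to expand (hypothetical interface).
    """
    tie = count()  # tie-breaker so heap entries never compare states
    open_lists = [[] for _ in heuristics]
    parent = {start: None}
    closed = set()

    def push(state):
        # Every generated state is queued once per heuristic.
        for h, queue in zip(heuristics, open_lists):
            heapq.heappush(queue, (h(state), next(tie), state))

    push(start)
    step = 0
    while any(open_lists):
        i = policy(step, [len(q) for q in open_lists])
        step += 1
        if not open_lists[i]:
            # Fall back to the first non-empty open list.
            i = next(k for k, q in enumerate(open_lists) if q)
        _, _, state = heapq.heappop(open_lists[i])
        if state in closed:
            continue  # already expanded via another open list
        closed.add(state)
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                push(nxt)
    return None
```

A round-robin policy such as `lambda step, sizes: step % len(sizes)` reproduces plain alternation between heuristics; a learned policy would instead base its choice on features of the current search state.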

Learning Neural Search Policies for Classical Planning

Artificial Intelligence

Heuristic forward search is currently the dominant paradigm in classical planning. Forward search algorithms typically rely on a single, relatively simple variation of best-first search and remain fixed throughout the process of solving a planning problem. Existing work combining multiple search techniques usually aims at supporting best-first search with an additional exploratory mechanism, triggered using a handcrafted criterion. A notable exception is very recent work which combines various search techniques using a trainable policy. It is, however, confined to a discrete action space comprising several fixed subroutines. In this paper, we introduce a parametrized search algorithm template which combines various search techniques within a single routine. The template's parameter space defines an infinite space of search algorithms, including, among others, BFS, local and random search. We further introduce a neural architecture for designating the values of the search parameters given the state of the search. This enables expressing neural search policies that change the values of the parameters as the search progresses. The policies can be learned automatically, with the objective of maximizing the planner's performance on a given distribution of planning problems. We consider a training setting based on a stochastic optimization algorithm known as the cross-entropy method (CEM). Experimental evaluation of our approach shows that it is capable of finding effective distribution-specific search policies, outperforming the relevant baselines.
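For readers unfamiliar with the cross-entropy method, the loop below is a generic sketch of it, assuming an independent Gaussian over each parameter. The `score` function is a placeholder: in the setting described above it would measure planner performance under the sampled search parameters, whereas here it is any function of a parameter vector. Population size, elite count and initial spread are illustrative choices, not values from the paper.

```python
import random
import statistics


def cross_entropy_method(score, dim, iterations=20, population=100,
                         elite=20, seed=0):
    """Maximize score(params) with a diagonal-Gaussian cross-entropy method."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [5.0] * dim  # illustrative initial spread
    for _ in range(iterations):
        # Sample a population of candidate parameter vectors.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(population)]
        # Keep the best-scoring candidates (the elite set).
        samples.sort(key=score, reverse=True)
        best = samples[:elite]
        # Refit the sampling distribution to the elite set.
        for d in range(dim):
            column = [x[d] for x in best]
            mean[d] = statistics.fmean(column)
            std[d] = max(statistics.pstdev(column), 1e-3)  # avoid collapse to 0
    return mean
```

The method needs only scores of sampled candidates, so it applies even when the objective is not differentiable with respect to the parameters.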

Revisiting Suboptimal Search

AAAI Conferences

Suboptimal search algorithms can often solve much larger problems than optimal search algorithms, and thus have broad practical use. This paper returns to early algorithms like WA*, A*_ε and Optimistic search. It studies the commonalities between these approaches in order to build a new bounded-suboptimal algorithm. Combined with recent research on avoiding node re-expansions in bounded-optimal search, a new solution quality bound is developed, which often provides proof of the solution bound much earlier during the search. Put together, these ideas provide a new state-of-the-art in bounded-optimal search.
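As background for the algorithms named above, here is a minimal sketch of weighted A* (WA*), which orders nodes by f = g + w * h. It is a textbook-style illustration with re-expansions allowed, not the new algorithm from the paper; the function signature and the `neighbors` interface (yielding `(state, step_cost)` pairs) are assumptions made for the example.

```python
import heapq
from itertools import count


def weighted_a_star(start, goal, neighbors, h, w=1.5):
    """Weighted A*: expand nodes in order of g + w * h.

    With an admissible h and w >= 1, the returned cost is at most
    w times the optimal cost.
    """
    tie = count()  # tie-breaker so states are never compared
    g = {start: 0.0}
    parent = {start: None}
    open_list = [(w * h(start), next(tie), start)]
    closed = set()
    while open_list:
        _, _, state = heapq.heappop(open_list)
        if state in closed:
            continue  # stale queue entry
        if state == goal:
            cost = g[state]
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return cost, path[::-1]
        closed.add(state)
        for nxt, step_cost in neighbors(state):
            new_g = g[state] + step_cost
            if nxt not in g or new_g < g[nxt]:
                g[nxt] = new_g
                parent[nxt] = state
                closed.discard(nxt)  # re-open if a cheaper path was found
                heapq.heappush(open_list, (new_g + w * h(nxt), next(tie), nxt))
    return None
```

Setting `w=1.0` recovers plain A*; larger weights typically trade solution quality for fewer expansions.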

A Learning-Based Framework for Memory-Bounded Heuristic Search: First Results

AAAI Conferences

Many existing boundedly-suboptimal heuristic search algorithms are variants of best-first search. Due to memory limitations, these algorithms are unable to solve problems with extremely large search spaces. In this paper, we present a framework that allows best-first search algorithms to solve problems with such large search spaces given a (reasonable) memory bound while also preserving optimality guarantees in tree-structured search spaces. In our framework, a given algorithm is run several times. In each search episode, the algorithm expands up to a user-defined number of states. After each episode, unless the goal has been found, the heuristic values of the generated states are updated using a linear-time algorithm that preserves consistency in tree-structured search spaces. In subsequent search episodes, only the heuristic values of the states generated in the previous episode need to be kept in memory. We present experimental results where we plug A*, GBFS, and wA* into our framework to solve traveling salesman problems and compare them against benchmark linear-memory algorithms like DFBnB and wDFBnB.

SPoC: Search-based Pseudocode to Code

Machine Learning

We consider the task of mapping pseudocode to long programs that are functionally correct. Given test cases as a mechanism to validate programs, we search over the space of possible translations of the pseudocode to find a program that passes the validation. However, without proper credit assignment to localize the sources of program failures, it is difficult to guide search toward more promising programs. We propose to perform credit assignment based on signals from compilation errors, which constitute 88.7% of program failures. Concretely, we treat the translation of each pseudocode line as a discrete portion of the program, and whenever a synthesized program fails to compile, an error localization method tries to identify the portion of the program responsible for the failure. We then focus search over alternative translations of the pseudocode for those portions. For evaluation, we collected the SPoC dataset (Search-based Pseudocode to Code) containing 18,356 programs with human-authored pseudocode and test cases. Under a budget of 100 program compilations, performing search improves the synthesis success rate over using the top-one translation of the pseudocode from 25.6% to 44.7%.
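The interplay of budgeted search and error localization can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the SPoC system: `candidates[i]` is assumed to hold `(code_line, probability)` pairs for pseudocode line `i`, and the hypothetical `check(program)` callback stands in for compilation plus error localization, returning `None` on success or the index of the line it blames.

```python
import heapq
import math


def search_translations(candidates, check, budget=100):
    """Best-first search over per-line translation choices.

    Returns (program_lines, checks_used), or (None, checks_used) if the
    budget is exhausted or every combination has been ruled out.
    """
    def cost(idx):
        # Negative log-probability of the combined translation.
        return -sum(math.log(candidates[i][j][1]) for i, j in enumerate(idx))

    start = tuple(0 for _ in candidates)
    heap = [(cost(start), start)]
    seen = {start}
    banned = set()  # (line, candidate) pairs blamed for a failure
    used = 0
    while heap and used < budget:
        _, idx = heapq.heappop(heap)
        if not any((i, j) in banned for i, j in enumerate(idx)):
            program = [candidates[i][j][0] for i, j in enumerate(idx)]
            used += 1
            failed_line = check(program)
            if failed_line is None:
                return program, used
            banned.add((failed_line, idx[failed_line]))
        # Generate neighbours that swap one line for its next candidate.
        for i in range(len(idx)):
            if idx[i] + 1 < len(candidates[i]):
                nxt = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (cost(nxt), nxt))
    return None, used
```

Combinations that contain a blamed line translation are skipped without spending budget, which is the sense in which the failure signal focuses the search on alternative translations.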

Front-to-Front Bidirectional Best-First Search Reconsidered

AAAI Conferences

We present several new algorithms for bidirectional best-first search that employ a front-to-front strategy of estimating distances from newly-generated frontier nodes in one search direction to existing frontier nodes in the other search direction, rather than estimating distances to terminal nodes in both searches. Unlike previous front-to-front strategies that use a shared priority queue to manage both frontiers, we use a separate data structure for each search, and choose that data structure to minimize the amount of computational effort required by the best-first search algorithm it supports. We demonstrate several results. First, we show that Bidirectional Front-to-Front Greedy (BFFG) is able to quickly find sub-optimal solutions to very large state-space problems and with a small fraction of nodes expanded (and stored) compared to other unidirectional and bidirectional greedy techniques. Secondly, we show that Bidirectional Front-to-Front A* (BFFA*) similarly outperforms both Unidirectional A* and Bidirectional Front-to-End A* (BFEA*) in terms of node expansions when searching for optimal solutions. Finally, we describe three improvements to BFFA*, each of which reduces the overall runtime by limiting the number of opposing frontier nodes that need be considered while preserving the optimality criterion.
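A front-to-front estimate can be illustrated with the small function below. It is a generic sketch rather than the paper's data structures: `opposite_frontier` is assumed to map each frontier node of the other search to its g-value, and `distance` is any pairwise heuristic. Including the g-value is the usual form for A*-style search; dropping it for a greedy variant is an assumption of this example, not a statement about BFFG.

```python
def front_to_front_estimate(node, opposite_frontier, distance):
    """Estimate the remaining cost from node via the opposite frontier.

    Takes the minimum, over all opposite frontier nodes, of the pairwise
    distance estimate plus that node's g-value in the other search.
    Raises ValueError if the opposite frontier is empty.
    """
    return min(distance(node, other) + g_other
               for other, g_other in opposite_frontier.items())
```

The linear scan over the whole opposite frontier is what makes the naive version expensive, which is why limiting the set of opposing nodes considered matters for runtime.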

Learning Classical Planning Strategies with Policy Gradient

Artificial Intelligence

A common paradigm in classical planning is heuristic forward search. Forward search planners often rely on a relatively simple best-first search algorithm, which remains fixed throughout the search process. In this paper, we introduce a novel search framework capable of alternating between several forward search approaches while solving a particular planning problem. Selection of the approach is performed using a trainable stochastic policy. This enables tailoring the search strategy to a particular distribution of planning problems and a selected performance metric, such as the IPC score or running time. We construct a strategy space using five search algorithms and a two-dimensional representation of the planner's state. Strategies are then trained on randomly generated planning problems using policy gradient. Experimental results show that the learner is able to discover domain-specific search strategies, thus improving the planner's performance with respect to the chosen metric.

AND/OR Search for Marginal MAP

Journal of Artificial Intelligence Research

Mixed inference such as the marginal MAP query (some variables marginalized by summation and others by maximization) is key to many prediction and decision models. It is known to be extremely hard; the problem is NP^PP-complete while the decision problem for MAP is only NP-complete and the summation problem is #P-complete. Consequently, approximation anytime schemes are essential. In this paper, we show that the framework of heuristic AND/OR search, which exploits conditional independence in the graphical model, coupled with variational-based mini-bucket heuristics can be extended to this task and yield powerful state-of-the-art schemes. Specifically, we explore the complementary properties of best-first search for reducing the number of conditional sums and providing time-improving upper bounds, with depth-first search for rapidly generating and improving solutions and lower bounds. We show empirically that a class of solvers that interleaves depth-first with best-first schemes emerges as the most competitive anytime scheme.
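For reference, the marginal MAP query can be written in its standard form as below. The notation is an assumption of this note rather than taken from the abstract: the variables are split into MAP variables X_M and sum variables X_S, and the model is a product of factors ψ_α over variable subsets x_α.

```latex
x_M^{*} \;=\; \arg\max_{x_M} \; \sum_{x_S} \; \prod_{\alpha} \psi_{\alpha}(x_{\alpha})
```

When X_S is empty this reduces to pure MAP (maximization only), and when X_M is empty it reduces to the summation problem. Because the sum sits inside the max, the two operators do not commute in general, which is one way to see why the mixed query is harder than either special case.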

Anytime Anyspace AND/OR Best-First Search for Bounding Marginal MAP

AAAI Conferences

Marginal MAP is a key task in Bayesian inference and decision-making. It is known to be very difficult in general, particularly because the evaluation of each MAP assignment requires solving an internal summation problem. In this paper, we propose a best-first search algorithm that provides anytime upper bounds for marginal MAP in graphical models. It folds the computation of external maximization and internal summation into an AND/OR tree search framework, and solves them simultaneously using a unified best-first search algorithm. The algorithm avoids some unnecessary computation of summation sub-problems associated with MAP assignments, and thus yields significant time savings. Furthermore, our algorithm is able to operate within limited memory. Empirical evaluation on three challenging benchmarks demonstrates that our unified best-first search algorithm using pre-compiled variational heuristics often provides tighter anytime upper bounds than state-of-the-art baselines.

A Polynomial Planning Algorithm That Beats LAMA and FF

AAAI Conferences

It has been shown recently that heuristic and width-based search can be combined to produce planning algorithms with a performance that goes beyond the state-of-the-art. Such algorithms are based on best-first width search (BFWS), a plain best-first search set with evaluation functions combined lexicographically to break ties, some of which express novelty-based preferences. In BFWS(f5), for example, the evaluation function f5 weights nodes by a novelty measure, breaking ties by the number of non-achieved goals. BFWS(f5) is a best-first algorithm, and hence, it is complete but not polynomial, and its performance doesn’t match the state of the art. In this work we show, however, that incomplete versions of BFWS(f5) where nodes with novelty greater than k are pruned, are not only polynomial but have an empirical performance that is better than both BFWS(f5) and state-of-the-art planners. This is shown by considering all the international planning competition instances. This is the first time that polynomial algorithms with meaningful bounds are shown to achieve state-of-the-art performance in planning. Practical and theoretical implications of this empirical finding are briefly sketched.
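Novelty pruning for the case k = 1 can be sketched as follows. This is a simplified illustration, not the planner from the paper: states are assumed to be frozensets of atoms, a state counts as novel if it contains an atom not yet seen among generated states with the same number of unachieved goals, and all names are invented for the example.

```python
import heapq
from itertools import count


def pruned_bfws(start, goal_atoms, successors):
    """Best-first search that prunes every state with novelty greater than 1.

    Surviving states are ordered by their number of unachieved goals.
    Returns the list of states from start to a goal state, or None.
    """
    seen_atoms = {}  # unachieved-goal count -> atoms seen in that partition

    def unachieved(state):
        return len(goal_atoms - state)

    def is_novel(state):
        seen = seen_atoms.setdefault(unachieved(state), set())
        new_atoms = state - seen
        seen |= state
        return bool(new_atoms)

    tie = count()  # tie-breaker so states are never compared
    is_novel(start)  # record the atoms of the initial state
    open_list = [(unachieved(start), next(tie), start, [start])]
    while open_list:
        remaining, _, state, path = heapq.heappop(open_list)
        if remaining == 0:
            return path
        for nxt in successors(state):
            if is_novel(nxt):  # states with novelty > 1 are dropped here
                heapq.heappush(
                    open_list,
                    (unachieved(nxt), next(tie), nxt, path + [nxt]))
    return None
```

In this sketch every state that survives pruning contributes at least one new atom to its partition, so the number of states kept is at most the number of atoms times the number of possible unachieved-goal counts. That bound is the source of the polynomial behaviour, and the pruning is also why the search is incomplete.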