Search
Active Stratified Sampling with Clustering-Based Type Systems for Predicting the Search Tree Size of Problems with Real-Valued Heuristics
Lelis, Levi H. S. (University of Alberta)
In this paper we advance the line of research launched by Knuth and later improved by Chen for predicting the size of the search tree expanded by heuristic search algorithms such as IDA*. Chen's Stratified Sampling (SS) uses a partition of the nodes in the search tree, called a type system, to guide its sampling. Recent work has shown that SS using type systems based on integer-valued heuristic functions can be quite effective. However, type systems based on real-valued heuristic functions are often too large to be practical. We use the k-means clustering algorithm to create effective type systems for domains with real-valued heuristics. Orthogonal to the type systems, another contribution of this paper is the introduction of an algorithm called Active SS. SS allocates the same number of samples to each type. Active SS applies the idea of active sampling to search trees: it allocates more samples to the types with higher uncertainty. Our empirical results show that (i) SS using clustering-based type systems tends to produce better predictions than competing schemes that do not use a type system, and that (ii) Active SS can produce better predictions than the regular version of SS.
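As background for the abstract above, the following is a minimal sketch of Knuth's original random-probing estimator, the starting point of this line of research. It is not SS or Active SS as described in the paper: it uses no type system and follows a single random child per level. The names `knuth_probe`, `estimate_tree_size`, and `binary_children` are illustrative assumptions, not taken from the paper.

```python
import random

def knuth_probe(root, children, rng):
    """One random root-to-leaf probe; returns an unbiased estimate of the node count."""
    estimate = 0
    weight = 1  # number of nodes at the current depth that this node stands for
    node = root
    while True:
        estimate += weight
        kids = children(node)
        if not kids:
            return estimate
        weight *= len(kids)      # each child represents len(kids) times more nodes
        node = rng.choice(kids)  # follow one child uniformly at random

def estimate_tree_size(root, children, probes=1000, seed=0):
    """Average several independent probes to reduce the variance of the estimate."""
    rng = random.Random(seed)
    return sum(knuth_probe(root, children, rng) for _ in range(probes)) / probes

def binary_children(depth):
    """Hypothetical toy tree: a complete binary tree of depth 3, nodes named by depth."""
    return [depth + 1, depth + 1] if depth < 3 else []

size = estimate_tree_size(0, binary_children, probes=10)
```

On a perfectly uniform tree such as this toy example every probe returns the exact size (1 + 2 + 4 + 8 nodes). On unbalanced trees the variance of single probes can be large, which is the weakness that stratification by type is meant to reduce.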
Bidirectional Preference-Based Search for State Space Graph Problems
Galand, Lucie (University Paris Dauphine) | Ismaili, Anisse (University Pierre et Marie Curie) | Perny, Patrice (University Pierre et Marie Curie) | Spanjaard, Olivier (University Pierre et Marie Curie)
In multiobjective state space graph problems, each solution-path is evaluated by a cost vector. These cost vectors can be partially or completely ordered using a preference relation compatible with Pareto dominance. In this context, multiobjective preference-based search (MOPBS) aims at computing the preferred feasible solutions according to a predefined preference model, these preferred solutions being a subset (possibly the entire set) of Pareto optima. Standard algorithms for MOPBS perform a unidirectional search developing the search tree forward from the initial state to a goal state. Instead, in this paper, we focus on bidirectional search algorithms developing simultaneously one forward and one backward search tree. Although bidirectional search has been tested in various single objective problems, its efficiency in a multiobjective setting has never been studied. We present several implementations of bidirectional preference-based search suited to the multiobjective case and investigate their efficiency.
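To make the notion of Pareto dominance in the abstract above concrete, here is a small sketch that filters a set of cost vectors down to its Pareto optima. It assumes every objective is minimized and uses a quadratic pairwise comparison; the function names are illustrative and the code is not the search algorithm from the paper.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(costs):
    """Keep the cost vectors that no other vector dominates (minimization assumed)."""
    return [c for c in costs if not any(dominates(other, c) for other in costs)]

front = pareto_front([(1, 5), (2, 2), (3, 3), (5, 1)])
# (3, 3) is dropped because (2, 2) dominates it.
```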
Throwing Darts: Random Sampling Helps Tree Search when the Number of Short Certificates Is Moderate
Dickerson, John Paul (Carnegie Mellon University) | Sandholm, Tuomas (Carnegie Mellon University)
One typically proves infeasibility in satisfiability/constraint satisfaction (or optimality in integer programming) by constructing a tree certificate. However, deciding how to branch in the search tree is hard, and impacts search time drastically. We explore the power of a simple paradigm, that of throwing random darts into the assignment space and then using information gathered by that dart to guide what to do next. Such guidance is easy to incorporate into state-of-the-art solvers. This method seems to work well when the number of short certificates of infeasibility is moderate, suggesting the overhead of throwing darts can be countered by the information gained by these darts. We explore results supporting this suggestion both on instances from a new generator where the size and number of short certificates can be controlled, and on industrial instances from the annual SAT competition.
Experimental Real-Time Heuristic Search Results in a Video Game
Burns, Ethan (University of New Hampshire) | Kiesel, Scott (University of New Hampshire) | Ruml, Wheeler (University of New Hampshire)
In real-time domains such as video games, a planning algorithm has a strictly bounded time before it must return the next action for the agent to execute. We introduce a realistic video game benchmark domain that is useful for evaluating real-time heuristic search algorithms. Unlike previous benchmarks such as grid pathfinding and the sliding tile puzzle, this new domain includes dynamics and induces a directed graph. Using both the previous and new domains, we investigate several enhancements to a leading real-time search algorithm, LSS-LRTA*. We show experimentally that 1) it is not difficult to outperform A* when optimizing goal achievement time, 2) it is better to plan after each action than to commit to multiple actions or to use a dynamically sized lookahead, 3) A*-based lookahead can cause undesirable actions to be selected, and 4) on-line de-biasing of the heuristic can lead to improved performance. We hope that this new domain and results will stimulate further research on applying real-time search to dynamic real-time domains.
Parallelising the k-Medoids Clustering Problem Using Space-Partitioning
Arbelaez, Alejandro (JFLI / University of Tokyo) | Quesada, Luis (University College Cork)
The k-medoids problem is a combinatorial optimisation problem with multiple applications in Resource Allocation, Mobile Computing, Sensor Networks and Telecommunications. Real instances of this problem involve hundreds of thousands of points and thousands of medoids. Despite the proliferation of parallel architectures, this problem has been mostly tackled using sequential approaches. In this paper, we study the impact of space-partitioning techniques on the performance of parallel local search algorithms to tackle the k-medoids clustering problem, and compare these results with the ones obtained using sampling. Our experiments suggest that approaches relying on partitioning scale better while preserving the quality of the solution.
Anytime Truncated D*: Anytime Replanning with Truncation
Aine, Sandip (Carnegie Mellon University) | Likhachev, Maxim (Carnegie Mellon University)
Incremental heuristic searches reuse their previous search efforts to speed up the current search. Anytime search algorithms iteratively tune the solutions based on available search time. Anytime D* (AD*) is an incremental anytime search algorithm that combines these two approaches. AD* uses an inflated heuristic to produce bounded suboptimal solutions and improves the solution by iteratively decreasing the inflation factor. If the environment changes, AD* recomputes a new solution by propagating the new costs. Recently, a different approach to speed up replanning (TLPA*/TD* Lite) was proposed that relies on selective truncation of cost propagations instead of heuristic inflation. In this work, we present an algorithm called Anytime Truncated D* (ATD*) that combines heuristic inflation with truncation in an anytime fashion. We develop truncation rules that can work with an inflated heuristic without violating the completeness/suboptimality guarantees, and show how these rules can be applied in conjunction with heuristic inflation to iteratively refine the replanning solutions with minimal reexpansions. We explain ATD*, discuss its analytical properties and present experimental results for 2D and 3D (x, y, heading) path planning demonstrating its efficacy for anytime replanning.
Seven Challenges in Parallel SAT Solving
Hamadi, Youssef (Microsoft Research, 7 JJ Thomson Avenue, Cambridge CB3 0FB, United Kingdom) | Wintersteiger, Christoph (Microsoft Research, 7 JJ Thomson Avenue, Cambridge CB3 0FB, United Kingdom)
This paper provides a broad overview of the situation in Parallel SAT Solving. A set of challenges to researchers is presented which, we believe, must be met to ensure the practical applicability of Parallel SAT Solvers in the future. All these challenges are described informally, but put into perspective with related research results, and a (subjective) grading of difficulty for each of them is provided.
Solving Weighted Voting Game Design Problems Optimally: Representations, Synthesis, and Enumeration
de Keijzer, Bart, Klos, Tomas B., Zhang, Yingqian
We study the inverse power index problem for weighted voting games: the problem of finding a weighted voting game in which the power of the players is as close as possible to a certain target distribution. Our goal is to find algorithms that solve this problem exactly. To that end, we study various subclasses of simple games, and their associated representation methods. We survey algorithms and impossibility results for the synthesis problem, i.e., converting a representation of a simple game into another representation. We contribute to the synthesis problem by showing that it is impossible to compute in polynomial time the list of ceiling coalitions (also known as shift-maximal losing coalitions) of a game from its list of roof coalitions (also known as shift-minimal winning coalitions), and vice versa. Then, we proceed by studying the problem of enumerating the set of weighted voting games. We first present a naive algorithm for this, running in doubly exponential time. Using our knowledge of the synthesis problem, we then improve on this naive algorithm, and we obtain an enumeration algorithm that runs in quadratic exponential time (that is, O(2^(n^2) p(n)) for a polynomial p). Moreover, we show that this algorithm runs in output-polynomial time, making it the best possible enumeration algorithm up to a polynomial factor. Finally, we propose an exact anytime algorithm for the inverse power index problem that runs in exponential time. This algorithm is straightforward and general: it computes the error for each game enumerated, and outputs the game that minimizes this error. By the genericity of our approach, our algorithm can be used to find a weighted voting game that optimizes any exponential time computable function. We implement our algorithm for the case of the normalized Banzhaf index, and we perform experiments in order to study performance and error convergence.
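The normalized Banzhaf index mentioned in the abstract above can be illustrated with a brute-force sketch. This computes the index for one given weighted voting game by enumerating all coalitions (exponential in the number of players); it is not the enumeration or inverse-problem algorithm from the paper. The function name and the example game are assumptions chosen for illustration, and the code assumes at least one winning coalition exists.

```python
from fractions import Fraction
from itertools import combinations

def normalized_banzhaf(weights, quota):
    """Share of swings per player; a swing is a winning coalition that loses without the player."""
    n = len(weights)
    swings = [0] * n
    for size in range(n + 1):
        for coalition in combinations(range(n), size):
            total = sum(weights[i] for i in coalition)
            if total < quota:
                continue  # losing coalition: nobody is critical in it
            for i in coalition:
                if total - weights[i] < quota:
                    swings[i] += 1  # player i is critical here
    all_swings = sum(swings)  # assumed non-zero, i.e. the quota is reachable
    return [Fraction(s, all_swings) for s in swings]

index = normalized_banzhaf([2, 1, 1], quota=3)
# The player with weight 2 holds 3 of the 5 swings, the other two hold 1 each.
```

An inverse power index method would evaluate a distance between such an index vector and the target distribution for each candidate game and keep the closest one.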
Breaking Symmetry with Different Orderings
We can break symmetry by eliminating solutions within each symmetry class. For instance, the Lex-Leader method eliminates all but the smallest solution in the lexicographical ordering. Unfortunately, the Lex-Leader method is intractable in general. We prove that, under modest assumptions, we cannot reduce the worst case complexity of breaking symmetry by using other orderings on solutions. We also prove that a common type of symmetry, where rows and columns in a matrix of decision variables are interchangeable, is intractable to break when we use two promising alternatives to the lexicographical ordering: the Gray code ordering (which uses a different ordering on solutions), and the Snake-Lex ordering (which is a variant of the lexicographical ordering that re-orders the variables). Nevertheless, we show experimentally that using other orderings like the Gray code to break symmetry can be beneficial in practice as they may better align with the objective function and branching heuristic.
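The Lex-Leader idea from the abstract above can be sketched for the simple case of variable symmetries given explicitly as index permutations. The helper names are hypothetical, and the check loops over every symmetry, which hints at why the method becomes impractical when the symmetry group is large.

```python
from itertools import permutations, product

def apply_symmetry(assignment, perm):
    """Image of an assignment when variables are re-indexed by perm."""
    return tuple(assignment[i] for i in perm)

def is_lex_leader(assignment, symmetries):
    """True if the assignment is lexicographically smallest within its symmetry class."""
    return all(assignment <= apply_symmetry(assignment, perm) for perm in symmetries)

def lex_leaders(n_vars, domain, symmetries):
    """Enumerate all assignments and keep one representative per symmetry class."""
    return [a for a in product(domain, repeat=n_vars) if is_lex_leader(a, symmetries)]

# Toy case: three fully interchangeable 0/1 variables.
leaders = lex_leaders(3, (0, 1), list(permutations(range(3))))
```

With three interchangeable binary variables, the eight assignments collapse to four representatives, one per number of variables set to 1. Swapping the tuple comparison for another total order on solutions gives the kind of alternative ordering the abstract discusses.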
The Arcade Learning Environment: An Evaluation Platform for General Agents
Bellemare, Marc G., Naddaf, Yavar, Veness, Joel, Bowling, Michael
In this article we introduce the Arcade Learning Environment (ALE): both a challenge problem and a platform and methodology for evaluating the development of general, domain-independent AI technology. ALE provides an interface to hundreds of Atari 2600 game environments, each one different, interesting, and designed to be a challenge for human players. ALE presents significant research challenges for reinforcement learning, model learning, model-based planning, imitation learning, transfer learning, and intrinsic motivation. Most importantly, it provides a rigorous testbed for evaluating and comparing approaches to these problems. We illustrate the promise of ALE by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning. In doing so, we also propose an evaluation methodology made possible by ALE, reporting empirical results on over 55 different games. All of the software, including the benchmark agents, is publicly available.