Country
Scalable Distributed Monte-Carlo Tree Search
Yoshizoe, Kazuki (The University of Tokyo) | Kishimoto, Akihiro (Tokyo Institute of Technology and Japan Science and Technology Agency) | Kaneko, Tomoyuki (The University of Tokyo) | Yoshimoto, Haruhiro (The University of Tokyo) | Ishikawa, Yutaka (The University of Tokyo)
Monte-Carlo Tree Search (MCTS) is remarkably successful in two-player games, but parallelizing MCTS has been notoriously difficult to scale well, especially in distributed environments. For a distributed parallel search, transposition-table driven scheduling (TDS) is known to be efficient in several domains. We present a massively parallel MCTS algorithm, that applies the TDS parallelism to the Upper Confidence bound Applied to Trees (UCT) algorithm, which is the most representative MCTS algorithm. To drastically decrease communication overhead, we introduce a reformulation of UCT called Depth-First UCT. The parallel performance of the algorithm is evaluated on clusters using up to 1,200 cores in artificial game-trees. We show that this approach scales well, achieving 740-fold speedups in the best case.
Abstract: Block A* and Any-Angle Path-Planning
Yap, Peter Kai Yue (University of Alberta) | Burch, Neil (University of Alberta) | Holte, Robert C. (University of Alberta) | Schaeffer, Jonathan (University of Alberta)
We present three new ideas for grid-based path-planning algorithms that improve the search speed and quality of the paths found. First, we introduce a new type of database, the Local Distance Database (LDDB), that contains distances between boundary points of a local neighborhood. Second, an LDDB-based algorithm is introduced, called Block A*, that calculates the optimal path between start and goal locations given the local distances stored in the LDDB. Third, our experimental results for any-angle path planning in a wide varietyof test domains, including real game maps, show that Block A* is faster than both A* and the previously best grid-based any-angle search algorithm, Theta*.
On Improving the Quality of Solutions in Large-Scale Cooperative Multi-Agent Pathfinding
Wang, Ko-Hsin Cindy (The Australian National University and NICTA) | Botea, Adi (NICTA and The Australian National University) | Kilby, Philip (NICTA and The Australian National University)
Scaling up the number of simultaneously moving units in pathfinding problems to hundreds, or even thousands, is well beyond the capability of theoretically optimal algorithms in practice, which is consistent with existing intractability results. Significant scalability can be achieved by trading off solution optimality, which motivates evaluating the quality of suboptimal solutions, especially in instances much larger than can be handled by optimal algorithms. We consider pathfinding in uniform cost grid maps, and we study the solution quality using the three most common quality criteria, total travel distance , sum of actions , and makespan . We focus on MAPP, which has been shown as state-of-the-art in terms of scalability and success ratio (i.e., percentage of solved units) on realistic game grid maps. Until now, the quality of MAPP's solutions had not been as extensively analyzed. We introduced enhancements that significantly improve MAPP's solution quality. For example, its sum of actions is cut to half on average. MAPP becomes competitive in terms of solution quality with FAR and WHCA*, two successful algorithms from the literature. To evaluate the quality of suboptimal solutions in instances beyond the capability of optimal algorithms, we use lower bounds of optimal values to show our solutions have a reasonable quality. For example, MAPP's average total travel distance is 19 percent longer than the lower bound.
Distance Learning in Agent-Centered Heuristic Search
Sturtevant, Nathan R. (University of Denver)
Real-time agent-centric algorithms have been used for learning and solving problems since the introduction of the LRTA* algorithm in 1990. In this time period, numerous variants have been produced, however, they have generally followed the same approach in varying parameters to learn a heuristic which estimates the remaining cost to arrive at a goal state. This short paper discusses the history and implications of learning g-costs, both alone and in conjunction with learning h-costs as an introduction to the new f-LRTA* algorithm which learns both.
Graduated Fidelity Motion Planning
Pivtoraiko, Mihail (Carnegie Mellon University) | Kelly, Alonzo (Carnegie Mellon University)
This paper presents an approach to differentially constrained robot motion planning and efficient re-planning. Satisfaction of differential constraints is guaranteed by the search space which consists of motions that satisfy the constraints by construction. Any systematic replanning algorithm, e.g. D*, can be utilized to search the state lattice to find a motion plan that satisfies the differential constraints, and to repair it efficiently in the event of a change in the environment. Further efficiency is obtained by varying the fidelity of representation of the planning problem. High fidelity is utilized where it matters most, while it is lowered in the areas that do not affect the quality of the plan significantly. The paper presents a method of modifying the fidelity between replans, thereby enabling dynamic flexibility of the search space, while maintaining its compatibility with replanning algorithms. The approach is especially suited for mobile robotics applications in unknown challenging environments. We successfully applied the motion planner on a real robot: the planner featured 10Hz average replan rate on minimal computing hardware, while satisfying the car-like differential constraints.
Planning in Domains with Cost Function Dependent Actions
Phillips, Mike (Carnegie Mellon University) | Likhachev, Maxim (Carnegie Mellon University)
In a number of graph search-based planning problems, the value of the cost function that is being minimized also affects the set of possible actions at some or all the states in the graph. In such planning problems, the cost function typically becomes one of the state variables thereby increasing the dimensionality of the planning problem, and consequently the size of the graph that represents the problem. In this paper, we show how to avoid this increase in the dimensionality for weighted search (with bounded suboptimality) whenever the availability of the actions is monotonically non-increasing with the increase in the cost function.
Efficient and Complete Centralized Multi-Robot Path Planning
Luna, Ryan (University of Nevada, Reno) | Bekris, Kostas E. (University of Nevada, Reno)
Multi-robot path planning is abstracted as the problem of computing a set of non-colliding paths on a graph for multiple robots. A naive search of the composite search space, although complete, has exponential complexity and becomes computationally prohibitive for problems with just a few robots. This work proposes an efficient and complete algorithm for solving a general class of multi-robot path planning problems, specifically those where there are at most n-2 robots in a connected graph of n vertices. The algorithm employs two primitives: a "push" operation where a robot moves toward its goal until no further progress can be made, and a "swap" operation that allows two robots to swap positions without altering the configuration of any other robot. Simulated experiments compare the proposed approach with several other centralized and decoupled planners, and show that the proposed technique has highly competitive computation time and easily scales to problems involving 100s of robots, solving them in under 5 seconds.
Planning for Landing Site Selection in the Aerial Supply Delivery
Kushleyev, Aleksandr (University of Pennsylvania) | MacAllister, Brian (University of Pennsylvania) | Likhachev, Maxim (Carnegie Mellon University)
In the aerial supply delivery problem, an un-manned aircraft needs to deliver supplies as close as possible to the desired goal location. This involves choosing and landing at a landing site that is closest to or most accessible from the desired goal location. The problem is complicated by the fact that the status of candidate landing sites is unknown before the mission begins, and instead the aircraft needs to compute a sequence according to which it flies and senses the candidate landing sites in order to land as quickly as possible. The problem of computing this sequence corresponds to planning under uncertainty about environment. In this paper, we show how it can be solved efficiently via a recently developed probabilistic planning framework, called Probabilistic Planning with Clear Preferences (PPCP). We show that the problem satisfies the Clear Preferences assumption required by PPCP,and therefore all the theoretical guarantees continue to hold. The experimental results in simulation show that our approachcan solve large-scale problems in real-time while experiments on a physical quad-rotor provide proof of concept.
A Novel Technique for Avoiding Plateaus of Greedy Best-First Search in Satisficing Planning
Imai, Tatsuya (Tokyo Institute of Technology) | Kishimoto, Akihiro (Tokyo Institute of Technology)
Heuristic functions play an important role in drastically improving performance of satisficing planners based on greedy best-first search (GBFS). While automatic generation of heuristic functions (e.g., (Hoffmann and Nebel 2001; Helmert 2006)) enables state-of-the-art satisficing planners to solve very complicated planning problems including benchmarks in the International Planning Competitions, accurate evaluations of nodes still remain as a challenging task. Although GBFS is fundamental and powerful in planning, it has an essential drawback when heuristic functions return inaccurate estimates. Assume that a heuristic function underestimates the difficulties of unpromising nodes. Then, since GBFS must expand nodes with small heuristic values first, it spends most of time in searching only unpromising areas and delays moving to the promising part. Previous work tackles this issue by adding a diversity to search, which is an ability in simultaneously exploring different parts of the search space to bypass large errors in heuristic functions. Several algorithms combined with diversity (e.g., K-best-first search (KBFS) in (Felner, Kraus, and Korf 2003)) are empirically shown to be superior to naive best-first search algorithms. However, they still have limited diversity, since they do not immediately expand nodes mistakenly evaluated as very unpromising ones. This paper presents a new technique called diverse best-first search (DBFS) , which incorporates a diversity into search in a different way than previous search-based approaches. We show empirical results clearly showing that DBFS is effective in satisficing planning.
Optimal Packing of High-Precision Rectangles
Huang, Eric (Palo Alto Research Center) | Korf, Richard E. (University of California, Los Angeles)
The rectangle-packing problem consists of finding an enclosing rectangle of smallest area that can contain a given set of rectangles without overlap. Our new benchmark includes rectangles of successively higher precision, a problem for the previous state-of-the-art, which enumerates all locations for placing rectangles. We instead limit these locations and bounding box dimensions to the set of subset sums of the rectangles' dimensions, allowing us to test 4,500 times fewer bounding boxes and solve N=9 over two orders of magnitude faster. Finally, on the open problem of the feasibility of packing a specific infinite series of rectangles into the unit square, we pack the first 50,000 such rectangles and conjecture that the entire infinite series can fit.