Goto

Collaborating Authors

 Planning & Scheduling


Mining Expert Play to Guide Monte Carlo Search in the Opening Moves of Go

AAAI Conferences

We propose a method to guide a Monte Carlo search in the initial moves of the game of Go. Our method matches the current state of a Go board against clusters of board configurations that are derived from a large number of games played by experts. The main advantage of this method is that it does not require an exact match of the current board, and hence is effective for a longer sequence of moves compared to traditional opening books. We apply this method to two different open-source Go-playing programs. Our experiments show that this method, through its filtering or biasing the choice of a next move to a small subset of possible moves, improves play effectively in the initial moves of a game.


Efficient Search with an Ensemble of Heuristics

AAAI Conferences

Recently, a number of papers have shown that for many domains, using multiple heuristics in independent searches performs better than combining them into a single heuristic. Furthermore, using a large number of โ€œweakโ€ heuristics could potentially eliminate the need for the careful design of a few. The standard approach to distribute computation in these multi-heuristic searches is to rotate through the heuristics in a round-robin fashion. However, this strategy can be inefficient especially in the case when only a few of the heuristics are leading to progress. In this paper, we present two principled methods to adaptively distribute computation time among the different searches of the Multi- Heuristic A* algorithm. The first method, Meta-A*, constructs and searches a meta-graph, which represents the problem of finding the best heuristic as the problem of minimizing the total number of expansions. The second treats the scheduling of searches with different heuristics as a multi-armed bandit problem. It applies Dynamic Thompson Sampling (DTS) to keep track of what searches are making progress the most and continuously re-computes the schedule of searches based on this information. We provide a theoretical analysis and compare our new strategies with the round-robin method on a 12-DOF full-body motion planning problem and on sliding tile puzzle problems. In these experiments, we used up to 20 heuristics and observed a several times speedup without loss in solution quality.


Interplanetary Trajectory Planning with Monte Carlo Tree Search

AAAI Conferences

Planning an interplanetary trajectory is a very complex task, traditionally accomplished by domain experts using computer-aided design tools. Recent advances in trajectory optimization allow automation of part of the trajectory design but have yet to provide an efficient way to select promising planetary encounter sequences. In this work, we present a heuristic-free approach to automated trajectory planning (including the encounter sequence planning) based on Monte Carlo Tree Search (MCTS). We discuss a number of modifications to traditional MCTS unique to the domain of interplanetary trajectory planning and provide results on the Rosetta and Cassini-Huygens interplanetary mission design problems. The resulting heuristic-free method is found to be orders of magnitude more efficient with respect to a standard tree search with heuristic-based pruning which is the current state-of-the art in this domain.


A Fast Goal Recognition Technique Based on Interaction Estimates

AAAI Conferences

Goal Recognition is the task of inferring an actor's goals given some or all of the actor's observed actions. There is considerable interest in Goal Recognition for use in intelligent personal assistants, smart environments, intelligent tutoring systems, and monitoring user's needs. In much of this work, the actor's observed actions are compared against a generated library of plans. Recent work by Ramirez and Geffner makes use of AI planning to determine how closely a sequence of observed actions matches plans for each possible goal. For each goal, this is done by comparing the cost of a plan for that goal with the cost of a plan for that goal that includes the observed actions. This approach yields useful rankings, but is impractical for real-time goal recognition in large domains because of the computational expense of constructing plans for each possible goal. In this paper, we introduce an approach that propagates cost and interaction information in a plan graph, and uses this information to estimate goal probabilities. We show that this approach is much faster, but still yields high quality results.


Generalized Rapid Action Value Estimation

AAAI Conferences

Monte Carlo Tree Search (MCTS) is the state of the art algorithm for many games including the game of Go and General Game Playing (GGP). The standard algorithm for MCTS is Upper Confidence bounds applied to Trees (UCT). For games such as Go a big improvement over UCT is the Rapid Action Value Estimation (RAVE) heuristic. We propose to generalize the RAVE heuristic so as to have more accurate estimates near the leaves. We test the resulting algorithm named GRAVE for Atarigo, Knighthrough, Domineering and Go.


Smooth UCT Search in Computer Poker

AAAI Conferences

They concluded that UCT quickly finds Self-play Monte Carlo Tree Search (MCTS) has a good but suboptimal policy, while Outcome Sampling initially been successful in many perfect-information twoplayer learns more slowly but converges to the optimal policy games. Although these methods have been over time. In this paper, we address the question whether the extended to imperfect-information games, so far inability of UCT to converge to a Nash equilibrium can be they have not achieved the same level of practical overcome while retaining UCT's fast initial learning rate. We success or theoretical convergence guarantees focus on the full-game MCTS setting, which is an important as competing methods. In this paper we step towards developing sound variants of online MCTS in introduce Smooth UCT, a variant of the established imperfect-information games. Upper Confidence Bounds Applied to Trees In particular, we introduce Smooth UCT, which combines (UCT) algorithm.


Agile Planning for Real-World Disaster Response

AAAI Conferences

However, as pointed out by [Moran et al., 2013], such We consider a setting where an agent-based planner assumptions simply do not hold in reality. The environment instructs teams of human emergency responders to is typically prone to significant uncertainties and humans may perform tasks in the real world. Due to uncertainty reject plans suggested by a software agent if they are tired or in the environment and the inability of the planner prefer to work with specific partners. Now, a naรฏve solution to consider all human preferences and all attributes to this would involve re-planning every time a rejection is of the real-world, humans may reject plans received. However, this may instead result in a high computational computed by the agent. A naรฏve solution that replans cost (as a whole new plan needs to be computed for given a rejection is inefficient and does not the whole team), may generate a plan that is still not acceptable, guarantee the new plan will be acceptable. Hence, and, following multiple rejection/replanning cycles (as we propose a new model re-planning problem using all individual team members need to accept the new plan), a Multi-agent Markov Decision Process that may lead the teams to suboptimal solutions.


The Grid-Based Path Planning Competition: 2014 Entries and Results

AAAI Conferences

The Grid-Based Path Planning Competition has just completed its third iteration. The entriesused in the competition have improved significantly during this time, changing the view ofthe state of the art of grid-based pathfinding. Furthermore, the entries from the competition have beenmade publicly available, improving the ability of researchers to compare their work. Thispaper summarizes the entries to the 2014 competition, presents the 2014 competition results,and talks about what has been learned and where there is room for improvement.


Moving Target Search with Subgoal Graphs

AAAI Conferences

Moving Target Search (MTS) is a dynamic path planning problem, where an agent is trying to reach a moving entity with a minimum path cost. Problems of this nature can be found in video games and dynamic robotics, which require fast processing time (real time). In this work, we introduce a new algorithm for this problem - the Moving Target Search with Subgoal Graphs (MTSub). MTSub is based on environment abstraction and uses Subgoal Graphs to speed up the searches for a minimal cost route to the target. The algorithm is optimal with respect to the knowledge that the agent has during the search. Experimental results show that MTSub can be used in real-time applications (e.g., applications requiring 5 microseconds response time per step). The experiments compared MTSub to G-FRA*, which is the best known dynamic algorithm so far, showing that MTSub is up to 29 times faster in average time per step, and up to 186 times faster in maximum time per step.


Tight Bounds for HTN Planning with Task Insertion (Extended Abstract)

AAAI Conferences

Hierarchical Task Network (HTN) planning with task insertion (TIHTN planning) is a variant of HTN planning. In HTN planning, the only means to alter task networks is to decompose compound tasks. In TIHTN planning, tasks may also be inserted directly. In this paper we provide tight complexity bounds for TIHTN planning along two axis: whether variables are allowed and whether methods must be totally ordered.