gbf
Optimize Planning Heuristics to Rank, not to Estimate Cost-to-Goal
Figure 1: Problem instance where a perfect heuristic is not strictly optimally efficient with GBFS. However, the path (A, C, D, E) has cost 10 instead of 11. Then h is a perfect ranking for GBFS on Γ. Proof. We carry out the proof by induction with respect to the number of expanded states. Let's now make the induction step and assume the theorem holds for the first ... Figure 2: Problem instance where an optimally efficient heuristic does not exist for GBFS.
- Europe > Czechia > Prague (0.05)
- Europe > Slovenia > Central Slovenia > Municipality of Komenda > Komenda (0.05)
- Europe > Czechia > Prague (0.04)
- Europe > Slovenia > Central Slovenia > Municipality of Komenda > Komenda (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- (2 more...)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Parallel Greedy Best-First Search with a Bound on the Number of Expansions Relative to Sequential Search
Shimoda, Takumi, Fukunaga, Alex
Parallelization of non-admissible search algorithms such as GBFS poses a challenge because straightforward parallelization can result in search behavior which significantly deviates from sequential search. Previous work proposed PUHF, a parallel search algorithm which is constrained to only expand states that can be expanded by some tie-breaking strategy for GBFS. We show that despite this constraint, the number of states expanded by PUHF is not bounded by a constant multiple of the number of states expanded by sequential GBFS with the worst-case tie-breaking strategy. We propose and experimentally evaluate One Bench At a Time (OBAT), a parallel greedy search which guarantees that the number of states expanded is within a constant factor of the number of states expanded by sequential GBFS with some tie-breaking policy.
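The role of tie-breaking mentioned in the abstract above can be illustrated with a minimal sketch of sequential GBFS. This is not PUHF or OBAT, and nothing here is taken from the paper: the toy graph `GRAPH`, the heuristic table `H`, and the `tie_break` parameter are all hypothetical, chosen only so that two open states share the same heuristic value.

```python
import heapq
from itertools import count

# Hypothetical toy instance (an assumption, not from the paper):
# S has two children, A and B, with the same heuristic value.
GRAPH = {"S": ["A", "B"], "A": ["A1"], "A1": ["A2"], "A2": [], "B": ["G"], "G": []}
H = {"S": 2, "A": 1, "B": 1, "A1": 1, "A2": 1, "G": 0}


def gbfs(start, goal, tie_break="fifo"):
    """Return the states expanded by sequential GBFS, in expansion order."""
    order = count()
    # FIFO prefers the oldest state among equal h-values, LIFO the newest.
    sign = 1 if tie_break == "fifo" else -1
    open_list = [(H[start], sign * next(order), start)]
    seen = {start}
    expanded = []
    while open_list:
        _, _, state = heapq.heappop(open_list)
        expanded.append(state)
        if state == goal:
            break
        for nxt in GRAPH[state]:
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(open_list, (H[nxt], sign * next(order), nxt))
    return expanded


FIFO_ORDER = gbfs("S", "G", "fifo")
LIFO_ORDER = gbfs("S", "G", "lifo")
print(len(FIFO_ORDER), len(LIFO_ORDER))
```

On this toy instance the two policies expand different numbers of states under the same heuristic, which is why a bound on a parallel search has to be stated relative to some tie-breaking policy.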
Extreme Value Monte Carlo Tree Search
Asai, Masataro, Wissow, Stephen
Despite being successful in board games and reinforcement learning (RL), UCT, a Monte-Carlo Tree Search (MCTS) combined with the UCB1 Multi-Armed Bandit (MAB), has had limited success in domain-independent planning until recently. Previous work showed that UCB1, designed for $[0,1]$-bounded rewards, is not appropriate for estimating distances-to-go that are potentially unbounded in $\mathbb{R}$, such as the heuristic functions used in classical planning, and then proposed combining MCTS with MABs designed for Gaussian reward distributions, which successfully improved performance. In this paper, we further sharpen our understanding of ideal bandits for planning tasks. Existing work has two issues: First, while Gaussian MABs no longer over-specify the distances as $h\in [0,1]$, they under-specify them as $h\in [-\infty,\infty]$, although the distances are non-negative and can be further bounded in some cases. Second, there is no theoretical justification for Full-Bellman backup (Schulte & Keller, 2014), which backpropagates the minimum/maximum of samples. We identify \emph{extreme value} statistics as a theoretical framework that resolves both issues at once, propose two bandits, UCB1-Uniform/Power, and apply them to MCTS for classical planning. We formally prove their regret bounds and empirically demonstrate their performance in classical planning.
- North America > United States > New Hampshire (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
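For context on the UCB1 bandit named in the Extreme Value Monte Carlo Tree Search abstract, the following is a minimal sketch of the textbook UCB1 selection rule for rewards in $[0,1]$. It is not the UCB1-Uniform or UCB1-Power bandit proposed in that paper, and the function names `ucb1_index` and `select_arm` are hypothetical.

```python
import math


def ucb1_index(mean_reward, pulls, total_pulls):
    """Textbook UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        # An arm that was never tried is always selected first.
        return math.inf
    return mean_reward + math.sqrt(2.0 * math.log(total_pulls) / pulls)


def select_arm(means, pulls):
    """Return the index of the arm with the largest UCB1 index."""
    total = sum(pulls)
    scores = [ucb1_index(m, n, total) for m, n in zip(means, pulls)]
    return max(range(len(scores)), key=scores.__getitem__)


# A rarely tried arm can win despite a low empirical mean.
print(select_arm([0.9, 0.1], [100, 1]))
```

The exploration bonus is calibrated to rewards bounded in $[0,1]$, which is the assumption the abstract says does not hold for heuristic distance estimates.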