Single-Agent Policy Tree Search With Guarantees

Laurent Orseau, Levi Lelis, Tor Lattimore, Theophane Weber

Oct-7-2024, 10:26:41 GMT–Neural Information Processing Systems

We introduce two novel tree search algorithms that use a policy to guide search. The first algorithm is a best-first enumeration whose cost function allows us to prove an upper bound on the number of nodes to be expanded before reaching a goal state. We show that this best-first algorithm is particularly well suited for "needle-in-a-haystack" problems. The second algorithm is based on sampling, and we prove an upper bound on the expected number of nodes it expands before reaching a set of goal states. We show that this algorithm is better suited for problems where many paths lead to a goal. We validate these tree search algorithms on 1,000 computer-generated levels of Sokoban, where the policy used to guide the search comes from a neural network trained using A3C. Our results show that the policy tree search algorithms we introduce are competitive with a state-of-the-art domain-independent planner that uses heuristic search.
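To make the idea of a policy-guided best-first enumeration concrete, here is a minimal sketch in Python. The abstract does not state the cost function, so the sketch assumes one for illustration: a node's depth divided by the probability the policy assigns to the path reaching it. The function names (`policy_best_first`, `children`, `uniform_policy`, `is_goal`) and the toy binary-tree problem are hypothetical and are not taken from the paper.

```python
import heapq
from itertools import count


def policy_best_first(root, children, policy, is_goal, max_expansions=10_000):
    """Expand nodes in increasing order of an ASSUMED cost: depth / path probability.

    policy(state) returns a dict mapping each child state to its probability.
    Returns (path_to_goal, number_of_expansions), or (None, expansions) on failure.
    """
    tie = count()  # tie-breaker so the heap never has to compare states
    # Frontier entries: (cost, tie, state, depth, path_probability, path)
    frontier = [(1.0, next(tie), root, 1, 1.0, [root])]
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, state, depth, prob, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, expansions
        expansions += 1
        child_probs = policy(state)
        for child in children(state):
            p = prob * child_probs.get(child, 0.0)
            if p <= 0.0:
                continue  # the policy rules this child out entirely
            cost = (depth + 1) / p  # assumed cost, see lead-in
            heapq.heappush(
                frontier, (cost, next(tie), child, depth + 1, p, path + [child])
            )
    return None, expansions


# Hypothetical toy problem: states are strings of 'L'/'R' moves in a binary tree.
def children(state):
    return [state + "L", state + "R"]


def uniform_policy(state):
    # Stand-in for a learned policy; here both moves are equally likely.
    return {state + "L": 0.5, state + "R": 0.5}


def is_goal(state):
    return state == "RLR"


path, expansions = policy_best_first("", children, uniform_policy, is_goal)
print(path, expansions)
```

With a uniform policy this ordering behaves like breadth-first search. A policy that puts more probability on the moves leading to the goal lowers the goal's cost and so moves it earlier in the expansion order, which is the intuition behind guiding the search with a policy.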

artificial intelligence, machine learning, node

Country:
- North America > Canada (0.46)

Genre:
- Research Report > New Finding (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.46)
  - Representation & Reasoning > Search (1.00)
