Improving Exploration in UCT Using Local Manifolds

Srinivasan, Sriram (University of Alberta) | Talvitie, Erik (Franklin and Marshall College) | Bowling, Michael (University of Alberta)

Mar-6-2015–AAAI Conferences

Monte-Carlo planning has been proven successful in many sequential decision-making settings, but it suffers from poor exploration when the rewards are sparse. In this paper, we improve exploration in UCT by generalizing across similar states using a given distance metric. We show that this algorithm, like UCT, converges asymptotically to the optimal action. When the state space does not have a natural distance metric, we show how we can learn a local manifold from the transition graph of states in the near future to obtain a distance metric. On domains inspired by video games, empirical evidence shows that our algorithm is more sample efficient than UCT, particularly when rewards are sparse.
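
The idea of generalizing across similar states can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a Gaussian kernel over the given distance metric, kernel-weighted visit counts and returns, and a UCB1-style bonus. The names `gaussian_kernel`, `select_action`, `stats`, `bandwidth`, and `c` are all hypothetical and chosen only for illustration.

```python
import math


def gaussian_kernel(dist, bandwidth=1.0):
    # Assumed similarity function: weight 1 at distance 0, decaying with distance.
    return math.exp(-(dist / bandwidth) ** 2)


def select_action(state, actions, stats, distance, c=1.4, bandwidth=1.0):
    """Pick an action with a UCB-style rule over kernel-smoothed statistics.

    stats maps each visited state to {action: (visit_count, total_return)}.
    distance(s1, s2) is the given metric between two states.
    """
    counts = {}
    values = {}
    for a in actions:
        n_hat = 0.0  # similarity-weighted visit count
        w_hat = 0.0  # similarity-weighted total return
        for other, per_action in stats.items():
            n, total = per_action.get(a, (0, 0.0))
            k = gaussian_kernel(distance(state, other), bandwidth)
            n_hat += k * n
            w_hat += k * total
        counts[a] = n_hat
        values[a] = w_hat / n_hat if n_hat > 0 else 0.0

    # Try any action that has no (weighted) experience near this state first.
    for a in actions:
        if counts[a] == 0:
            return a

    total_n = sum(counts.values())

    def score(a):
        # log(1 + N) keeps the bonus well defined when weighted counts are below 1
        # (an assumption of this sketch, not part of standard UCB1).
        return values[a] + c * math.sqrt(math.log(1.0 + total_n) / counts[a])

    return max(actions, key=score)
```

With numeric states and `distance = lambda x, y: abs(x - y)`, a state that has never been visited can still borrow statistics from a close neighbor, which is the intuition behind sharing information when rewards are sparse.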

artificial intelligence, manifold, planning & scheduling

Country:
- North America > Canada > Alberta (0.14)

Industry:
- Leisure & Entertainment > Games (0.89)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Representation & Reasoning
    - Agents (0.93)
    - Planning & Scheduling (1.00)
