Discovering Options for Exploration by Minimizing Cover Time
Jinnai, Yuu, Park, Jee Won, Abel, David, Konidaris, George
arXiv.org Artificial Intelligence
One of the main challenges in reinforcement learning is solving tasks with sparse reward. We show that the difficulty of discovering a distant rewarding state in an MDP is bounded by the expected cover time of a random walk over the graph induced by the MDP's transition dynamics. We therefore propose to accelerate exploration by constructing options that minimize cover time. The …

Finding a set of edges that minimizes expected cover time is an extremely hard combinatorial optimization problem (Braess, 1968; Braess et al., 2005). Thus, our algorithm instead seeks to minimize the upper bound of the expected cover time, given as a function of the algebraic connectivity of the graph Laplacian (Fiedler, 1973; Broder & Karlin, 1989; Chung, 1996), using the heuristic method by Ghosh & Boyd (2006) that improves the upper bound of the expected cover time of a uniform random walk.
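The passage above can be illustrated with a small sketch. The Ghosh & Boyd (2006) heuristic scores each absent edge (i, j) by (v_i − v_j)², where v is the Fiedler vector (the eigenvector for the second-smallest Laplacian eigenvalue, the algebraic connectivity); adding the top-scoring edge tends to raise algebraic connectivity the most, which tightens the cover-time bound. The code below is a minimal NumPy sketch of that one greedy step, not the paper's full covering-options algorithm; all function names are illustrative.

```python
import numpy as np

def laplacian(adj):
    """Graph Laplacian L = D - A for a symmetric 0/1 adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj

def fiedler(adj):
    """Algebraic connectivity (second-smallest Laplacian eigenvalue)
    and the corresponding eigenvector (the Fiedler vector)."""
    vals, vecs = np.linalg.eigh(laplacian(adj))
    return vals[1], vecs[:, 1]

def best_edge(adj):
    """Greedy step in the spirit of Ghosh & Boyd (2006): among absent
    edges, pick the one maximizing (v_i - v_j)^2 for the Fiedler
    vector v -- a cheap proxy for the edge that most increases
    algebraic connectivity."""
    _, v = fiedler(adj)
    n = adj.shape[0]
    best, score = None, -1.0
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i, j] == 0 and (v[i] - v[j]) ** 2 > score:
                best, score = (i, j), (v[i] - v[j]) ** 2
    return best

# Example: a path graph 0-1-2-3. The Fiedler vector is monotone along
# the path, so the heuristic connects the two endpoints.
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1
lam2_before, _ = fiedler(A)
i, j = best_edge(A)
A[i, j] = A[j, i] = 1
lam2_after, _ = fiedler(A)
```

On this example the chosen edge is (0, 3), turning the path into a cycle and strictly increasing the algebraic connectivity, which by the bounds cited above shrinks the expected cover time of a uniform random walk.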
Mar-16-2019