When Simple Exploration is Sample Efficient: Identifying Sufficient Conditions for Random Exploration to Yield PAC RL Algorithms

May-23-2018–arXiv.org Artificial Intelligence

Efficient exploration is one of the key challenges for reinforcement learning (RL) algorithms. Most traditional sample efficiency bounds require strategic exploration. Recently, many deep RL algorithms with simple heuristic exploration strategies that have few formal guarantees have achieved surprising success in many domains. These results raise an important question about how to understand exploration strategies such as $\epsilon$-greedy, as well as what characterizes the difficulty of exploration in MDPs. In this work we propose problem-specific sample complexity bounds for $Q$-learning with random walk exploration that rely on several structural properties. We also link our theoretical results to some empirical benchmark domains, to illustrate whether our bound gives polynomial sample complexity in these domains and how that relates to the empirical performance there.
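To make the setting concrete, the following is a minimal sketch of tabular $Q$-learning driven by a uniformly random behaviour policy (a random walk). It is not the paper's algorithm or analysis: the chain MDP, the function names, and the values of the step size, discount, and step budget are all assumptions chosen for illustration.

```python
import random


def q_learning_random_walk(n_states=5, n_steps=20000, gamma=0.9, alpha=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain MDP with random walk exploration.

    Assumed environment (not from the paper): states 0..n_states-1 in a line,
    action 0 moves left, action 1 moves right, and the only reward (1.0) is
    given on reaching the rightmost state, which ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    state = 0
    for _ in range(n_steps):
        # Random walk exploration: the action never depends on the Q estimates.
        action = rng.randrange(2)
        if action == 0:
            next_state = max(state - 1, 0)
        else:
            next_state = min(state + 1, n_states - 1)
        done = next_state == n_states - 1
        reward = 1.0 if done else 0.0
        # Standard off-policy Q-learning target; no bootstrapping at episode end.
        target = reward if done else reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = 0 if done else next_state
    return q


def greedy_policy(q):
    """Return the greedy action in each state under the learned Q table."""
    return [max(range(len(q_s)), key=lambda a: q_s[a]) for q_s in q]


if __name__ == "__main__":
    q_table = q_learning_random_walk()
    print(greedy_policy(q_table))
```

On a short chain like this one, a random walk reaches the rewarding state often, so the greedy policy extracted from the learned table moves right in every non-terminal state. How the number of samples needed grows with the structure of the MDP is the kind of question the abstract's bounds address.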



