PAC Reinforcement Learning without Real-World Feedback
Yuren Zhong, Aniket Anand Deshmukh, Clayton Scott
This work studies reinforcement learning in the Sim-to-Real setting, in which an agent is first trained on a number of simulators before being deployed in the real world, with the aim of reducing the real-world sample complexity requirement. Using a dynamics model known as a rich observation Markov decision process (ROMDP), we formulate a theoretical framework for Sim-to-Real in the situation where feedback in the real world is unavailable. We establish real-world sample complexity guarantees that are smaller than what is currently known for learning a ROMDP directly (i.e., without access to simulators) with feedback.
Sep-24-2019