Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces

Sinclair, Sean R., Banerjee, Siddhartha, Yu, Christina Lee

Oct-17-2019–arXiv.org Machine Learning

We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel $Q$-learning policy with adaptive, data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions that are frequently visited in historical trajectories and that have higher payoff estimates. We demonstrate how our adaptive partitions take advantage of the shape of the optimal $Q$-function and the joint space, without sacrificing worst-case performance. In particular, we recover the regret guarantees of prior algorithms for continuous state-action spaces, which additionally require an optimal discretization as input, access to a simulation oracle, or both. Moreover, experiments demonstrate how our algorithm automatically adapts to the underlying structure of the problem, resulting in much better performance than both heuristics and $Q$-learning with uniform discretization.
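
The central idea of refining the partition where visits concentrate can be illustrated with a minimal sketch. This is not the authors' algorithm: it collapses the state-action space to the interval [0, 1], and the names `Cell` and `AdaptivePartition`, the learning rate, the optimistic initial value, and the split threshold (split once the visit count reaches (1 / width)^2) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Cell:
    """One region of the partition (hypothetical helper, not from the paper)."""
    lo: float
    hi: float
    visits: int = 0
    q_estimate: float = 1.0  # optimistic initial value (assumption)

    @property
    def width(self) -> float:
        return self.hi - self.lo


class AdaptivePartition:
    """Partition of [0, 1] that splits cells once they are visited often enough."""

    def __init__(self) -> None:
        self.cells = [Cell(0.0, 1.0)]

    def find(self, x: float) -> Cell:
        # Return the first cell whose closed interval contains x.
        for cell in self.cells:
            if cell.lo <= x <= cell.hi:
                return cell
        raise ValueError(f"{x} is outside [0, 1]")

    def update(self, x: float, target: float, lr: float = 0.5) -> None:
        cell = self.find(x)
        cell.visits += 1
        # Move the cell's estimate toward the observed target.
        cell.q_estimate += lr * (target - cell.q_estimate)
        # Assumed split rule: narrower cells need more visits before splitting.
        if cell.visits >= (1.0 / cell.width) ** 2:
            self._split(cell)

    def _split(self, cell: Cell) -> None:
        mid = (cell.lo + cell.hi) / 2
        i = self.cells.index(cell)
        # Children inherit the parent's visit count and estimate.
        self.cells[i:i + 1] = [
            Cell(cell.lo, mid, cell.visits, cell.q_estimate),
            Cell(mid, cell.hi, cell.visits, cell.q_estimate),
        ]
```

Repeatedly calling `update` at the same point refines only the cells around that point, while rarely visited regions stay coarse. In a full agent, greedy selection on `q_estimate` would steer visits toward regions with higher payoff estimates, which is how those regions end up with the finest partition.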
