A Tractable Algorithm For Finite-Horizon Continuous Reinforcement Learning

Phanideep Gampa, Sairam Satwik Kondamudi, Lakshmanan Kailasam

arXiv.org Artificial Intelligence 

We consider the finite-horizon continuous reinforcement learning problem. Our contribution is three-fold. First, we give a tractable algorithm based on optimistic value iteration for the problem. Next, we give a lower bound on regret of order $\Omega(T^{2/3})$ for any algorithm that discretizes the state space, improving the previous regret bound of $\Omega(T^{1/2})$ of Ortner and Ryabko \cite{contrl} for the same problem. Next, under the assumption that the rewards and transitions are H\"{o}lder continuous, we show that the upper bound on the discretization error is $\mathrm{const.}\,L n^{-\alpha} T$. Finally, we present some simple experiments to validate our propositions.
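The abstract does not spell out the algorithm's details, but the following is a minimal sketch of finite-horizon optimistic value iteration over a discretized state space, in the spirit of the approach described above. The function name, the Hoeffding-style bonus, and all constants are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

def optimistic_value_iteration(R_hat, P_hat, counts, H, delta=0.05):
    """Finite-horizon optimistic value iteration over n aggregated states.

    R_hat:  (n, A) empirical mean rewards, assumed to lie in [0, 1]
    P_hat:  (n, A, n) empirical transition kernel over the discretization
    counts: (n, A) visit counts used for the confidence bonuses
    H:      horizon length
    Returns a greedy optimistic policy of shape (H, n).
    """
    n, A = R_hat.shape
    V = np.zeros(n)                       # terminal value at step H
    policy = np.zeros((H, n), dtype=int)
    for h in range(H - 1, -1, -1):        # backward induction over the horizon
        # Hoeffding-style exploration bonus; the exact form is an assumption.
        bonus = np.sqrt(np.log(2 * n * A * H / delta) / np.maximum(counts, 1))
        Q = R_hat + bonus + P_hat @ V     # optimistic Q-values, shape (n, A)
        Q = np.minimum(Q, H - h)          # clip to the largest achievable return
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy

# Toy usage on a random MDP over n = 8 aggregated states and 2 actions.
rng = np.random.default_rng(0)
n, A, H = 8, 2, 5
P = rng.dirichlet(np.ones(n), size=(n, A))
R = rng.uniform(size=(n, A))
counts = rng.integers(1, 50, size=(n, A)).astype(float)
print(optimistic_value_iteration(R, P, counts, H))
```

In this sketch the continuous state space is assumed to have been aggregated into $n$ cells beforehand; under the Hölder continuity assumption stated above, each cell's reward and transition estimates misrepresent the true continuous values by at most $L n^{-\alpha}$ per step, which is the source of the $\mathrm{const.}\,L n^{-\alpha} T$ discretization error term.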
