Eligibility Propagation to Speed up Time Hopping for Reinforcement Learning

Kormushev, Petar, Nomoto, Kohei, Dong, Fangyan, Hirota, Kaoru

arXiv.org Artificial Intelligence 

General reinforcement learning (RL) algorithms like Q-learning [17], SARSA and TD(λ) [15] have been proven to converge to the globally optimal solution (under certain assumptions) [1][17]. They are very flexible because they do not require a model of the environment, and they have been shown to be effective in solving a variety of RL tasks. This flexibility, however, comes at a cost: these algorithms require extremely long training on problems with large state spaces. Many different approaches have been proposed for speeding up the RL process. One possible technique is function approximation [8], which reduces the effect of the "curse of dimensionality".
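
To make the model-free property concrete, the following is a minimal sketch of a single tabular Q-learning update, not the implementation used in this work. The function name `q_learning_update`, the dictionary-based table, and the values of the learning rate `alpha` and discount factor `gamma` are assumptions chosen for illustration. The update uses only an observed transition (state, action, reward, next state) and never queries a model of the environment. The table holds one entry per state-action pair, which is why training time grows quickly as the state space grows.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    alpha (learning rate) and gamma (discount factor) are illustrative
    defaults, not values taken from the paper.
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target built from the observed transition only.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; unseen state-action pairs default to 0.0.
q = defaultdict(float)
actions = ["left", "right"]
new_value = q_learning_update(q, state=0, action="right", reward=1.0,
                              next_state=1, actions=actions)
```

Function approximation, as mentioned above, would replace the explicit table `q` with a parameterized estimator so that similar states share information.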
