Eligibility Propagation to Speed up Time Hopping for Reinforcement Learning
Kormushev, Petar; Nomoto, Kohei; Dong, Fangyan; Hirota, Kaoru
arXiv.org Artificial Intelligence
General RL algorithms such as Q-learning [17], SARSA, and TD(λ) [15] have been proved to converge to the globally optimal solution under certain assumptions [1][17]. They are very flexible because they do not require a model of the environment, and they have been shown to be effective on a variety of RL tasks. This flexibility comes at a cost, however: these algorithms require extremely long training on problems with large state spaces. Many approaches have been proposed for speeding up the RL process. One is to use function approximation [8] to reduce the effect of the "curse of dimensionality".
Apr-3-2009
- Country:
  - North America > United States
    - Massachusetts > Middlesex County > Cambridge (0.04)
  - Europe > United Kingdom
    - England > Cambridgeshire > Cambridge (0.04)
  - Asia > Japan
    - Honshū > Kantō
      - Tokyo Metropolis Prefecture > Tokyo (0.14)
      - Kanagawa Prefecture > Yokohama (0.04)
- Genre:
- Research Report (0.64)
- Technology: