Time manipulation technique for speeding up reinforcement learning in simulations
Kormushev, Petar, Nomoto, Kohei, Dong, Fangyan, Hirota, Kaoru
arXiv.org Artificial Intelligence
A technique for speeding up reinforcement learning algorithms by using time manipulation is proposed. It is applicable to failure-avoidance control problems that run in a computer simulation. Turning the simulation time backwards on failure events is shown to speed up learning by 260% and improve state space exploration by 12% on the cart-pole balancing task, compared with the conventional Q-learning and Actor-Critic algorithms.
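The idea of turning simulation time backwards on failure can be illustrated with a minimal sketch. This is not the authors' implementation or their cart-pole setup: the `ToySim` task, the `q_learning_with_rewind` function, the `rewind` depth, and all hyperparameters are hypothetical choices made for illustration. The sketch assumes only that the simulator can save and restore its state, and that on a failure event the learner resumes from a snapshot taken a few steps earlier instead of restarting the episode.

```python
import random
from collections import deque


class ToySim:
    """Hypothetical 1-D failure-avoidance task: keep pos strictly inside (0, size)."""

    def __init__(self, size=10):
        self.size = size
        self.pos = size // 2

    def reset(self):
        self.pos = self.size // 2
        return self.pos

    def snapshot(self):
        # The whole simulator state is a single integer in this toy task.
        return self.pos

    def restore(self, snap):
        self.pos = snap
        return self.pos

    def step(self, action, rng):
        # Action 0 pushes left, action 1 pushes right; a random disturbance is added.
        self.pos += (1 if action == 1 else -1) + rng.choice((-1, 0, 1))
        failed = self.pos <= 0 or self.pos >= self.size
        self.pos = max(0, min(self.size, self.pos))
        return self.pos, (-1.0 if failed else 0.0), failed


def q_learning_with_rewind(steps=5000, rewind=3, alpha=0.1, gamma=0.95,
                           eps=0.1, seed=0):
    """Tabular Q-learning that rewinds the simulation on failure (illustrative)."""
    rng = random.Random(seed)
    sim = ToySim()
    q = [[0.0, 0.0] for _ in range(sim.size + 1)]
    history = deque(maxlen=rewind)  # most recent simulator snapshots
    state = sim.reset()
    failures = 0
    for _ in range(steps):
        history.append(sim.snapshot())
        if rng.random() < eps:
            action = rng.randrange(2)  # explore
        else:
            action = 0 if q[state][0] >= q[state][1] else 1  # exploit
        nxt, reward, failed = sim.step(action, rng)
        # Standard Q-learning update; a failure is treated as terminal.
        target = reward if failed else reward + gamma * max(q[nxt])
        q[state][action] += alpha * (target - q[state][action])
        if failed:
            failures += 1
            # Time manipulation: go back to the oldest stored snapshot
            # (at most `rewind` steps before the failure), not to the start.
            state = sim.restore(history[0])
            history.clear()
        else:
            state = nxt
    return q, failures


if __name__ == "__main__":
    q_table, n_failures = q_learning_with_rewind()
    print("failures:", n_failures)
```

Resuming near the failure keeps the learner in the region of the state space where its value estimates are still poor, which is one plausible reading of why rewinding could help. The sketch makes no attempt to reproduce the reported 260% and 12% figures.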
Mar-27-2009
- Country:
  - North America > United States
    - Pennsylvania > Allegheny County > Pittsburgh (0.04)
    - Massachusetts > Middlesex County > Cambridge (0.04)
  - Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
  - Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Genre:
  - Research Report (0.50)
- Technology: