Sample-efficient Cross-Entropy Method for Real-time Planning

Pinneri, Cristina; Sawant, Shambhuraj; Blaes, Sebastian; Achterhold, Jan; Stueckler, Joerg; Rolinek, Michal; Martius, Georg

Aug-14-2020–arXiv.org Machine Learning

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
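
The abstract names two additions, temporally correlated actions and memory, without giving details. The following is a minimal sketch of how such a planner could look, not the authors' implementation. It rests on several assumptions: AR(1) noise stands in as one simple way to correlate actions over time, "memory" is interpreted as carrying a few elites between CEM iterations and reusing a shifted mean between planning steps, actions are bounded in [-1, 1], and every name and default (`correlated_noise`, `cem_plan`, `shift_plan`, `rho`, `keep`) is hypothetical.

```python
import numpy as np


def correlated_noise(rng, n_samples, horizon, act_dim, rho=0.8):
    """AR(1) noise along the time axis with unit marginal variance.

    Assumption: rho is an illustrative correlation knob; rho=0 gives the
    i.i.d. Gaussian noise of plain CEM.
    """
    white = rng.standard_normal((n_samples, horizon, act_dim))
    noise = np.empty_like(white)
    noise[:, 0] = white[:, 0]
    scale = np.sqrt(1.0 - rho ** 2)
    for t in range(1, horizon):
        noise[:, t] = rho * noise[:, t - 1] + scale * white[:, t]
    return noise


def cem_plan(cost_fn, horizon, act_dim, rng, n_samples=64, n_elites=8,
             n_iters=5, rho=0.8, keep=3, init_mean=None):
    """Run one planning step; return (best action sequence, final mean)."""
    mean = np.zeros((horizon, act_dim)) if init_mean is None else init_mean
    mean = np.clip(mean, -1.0, 1.0)
    std = np.ones((horizon, act_dim))
    # Memory: candidates re-evaluated next iteration. It starts with the
    # initial mean, so the result is never worse than the warm start.
    memory = mean[None].copy()
    best, best_cost = mean.copy(), np.inf
    for _ in range(n_iters):
        noise = correlated_noise(rng, n_samples, horizon, act_dim, rho)
        samples = np.clip(mean + std * noise, -1.0, 1.0)
        samples = np.concatenate([samples, memory], axis=0)
        costs = np.array([cost_fn(s) for s in samples])
        order = np.argsort(costs)[:n_elites]
        elites = samples[order]
        if costs[order[0]] < best_cost:
            best, best_cost = elites[0].copy(), costs[order[0]]
        # Refit the sampling distribution to the elites.
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
        memory = elites[:keep]  # carry the top elites forward
    return best, mean


def shift_plan(mean):
    """Warm start for the next planning step: drop the executed action
    and repeat the last one to keep the horizon length."""
    return np.concatenate([mean[1:], mean[-1:]], axis=0)


# Toy usage: track a constant target action over a 10-step horizon.
rng = np.random.default_rng(0)
target = np.full((10, 1), 0.5)


def toy_cost(actions):
    return float(np.sum((actions - target) ** 2))


plan, mean = cem_plan(toy_cost, horizon=10, act_dim=1, rng=rng)
next_plan, _ = cem_plan(toy_cost, horizon=10, act_dim=1, rng=rng,
                        init_mean=shift_plan(mean))
```

In a receding-horizon loop, only the first action of `plan` would be executed before replanning from the shifted mean. In this sketch, that reuse is what lets each planning step get by with fewer samples and iterations.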

real-time planning, sample-efficient cross-entropy method

Genre:
- Research Report (0.40)

Technology:
- Information Technology
  - Architecture > Real Time Systems (0.60)
  - Artificial Intelligence > Machine Learning (0.89)