Export Reviews, Discussions, Author Feedback and Meta-Reviews
–Neural Information Processing Systems
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. Summary: The paper presents a sample-efficient policy search algorithm for large, continuous reinforcement learning problems. In contrast to existing model-based policy search algorithms, the approach presented in this paper tries to learn local models in form of linear Gaussian controllers. Given the information (rollouts) from these linear local models, a global, nonlinear policy can then be learned using an arbitrary parametrization scheme. The so-called Guided Policy Search approach alternates between (local) trajectory optimization and (global) policy search in an iterative fashion. In their experiments, the authors show that the approach outperforms various state-of-the-art Policy Search methods, e.g., REPS, PILCO etc. Experiments where conducted in (mostly 2D) dynamics simulations involving the continuous control of multi-linked agents.
Neural Information Processing Systems
Oct-3-2025, 03:27:02 GMT