Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems 

Summary

The paper presents a new method for path integral (PI) control. The proposed method leverages the fact that the system control matrix G is often known, and that the uncontrolled dynamics (or the dynamics under a reference controller) can be learned using a Gaussian process. The constraint on the reward function takes a more general form than in previous PI approaches, which means, among other things, that noise and controls can act in different subspaces. The authors also show how their framework can be used to generalize from known tasks to new ones, and they evaluate the method on three simulated robotics problems, where it compares favourably to state-of-the-art reinforcement learning and control methods.

Quality

Generally, the derivations seem correct and principled (although I'm unsure about the task generalization, see below).