Neural Information Processing Systems 

Direct policy search is one of the workhorses of modern reinforcement learning (RL), and its application to continuous control tasks has recently attracted increasing attention. In this work, we investigate the convergence theory of policy gradient (PG) methods for learning linear risk-sensitive and robust controllers.
