Reviews: Towards Generalization and Simplicity in Continuous Control

Oct-8-2024, 05:11:29 GMT–Neural Information Processing Systems

The paper evaluates the natural policy gradient algorithm with simple linear policies on a wide range of "challenging" problems from the OpenAI MuJoCo environments, and shows that these shallow policy networks can learn effective policies in most domains, sometimes faster than NN policies. It further explores learning robust and more global policies by modifying existing domains, e.g. ... The first part of the paper, while not proposing new approaches, offers interesting insights into the performance of linear policies, given the plethora of prior work that applies NN policies by default to these problems. This part could be further strengthened by an ablation study on the RL optimizer, specifically GAE, sigma vs. alpha in Eq. 5, and small vs. large trajectory batches (SGD vs. batch optimization).
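For readers unfamiliar with one of the suggested ablation axes, the following is a minimal sketch of generalized advantage estimation, assuming that is what "GAE" refers to here. The function name, argument layout, and default values of `gamma` and `lam` are illustrative assumptions and are not taken from the paper under review.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Sketch of generalized advantage estimation for one trajectory.

    Assumptions (hypothetical interface, not the paper's code):
    - rewards has T entries, one per time step.
    - values has T + 1 entries; the last one is the bootstrap value
      of the state reached after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of later residuals.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference residual.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

With `lam=0` the estimate reduces to the one-step TD residual, and with `lam=1` it becomes the discounted return minus the value baseline, so sweeping `lam` would be one plausible way to run the GAE ablation the review asks for.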

continuous control, generalization and simplicity, linear policy, (3 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.40)