Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
