Landscape of Policy Optimization for Finite Horizon MDPs with General State and Action
