Imitate Optimal Policy: Prevail and Induce Action Collapse in Policy Gradient
