Imitate Optimal Policy: Prevail and Induce Action Collapse in Policy Gradient