Dynamic Regret of Policy Optimization in Non-Stationary Environments