Proximal Policy Gradient: PPO with Policy Gradient