Beyond the Boundaries of Proximal Policy Optimization