Safe Policy Exploration Improvement via Subgoals