LPPG-RL: Lexicographically Projected Policy Gradient Reinforcement Learning with Subproblem Exploration