PAG: Multi-Turn Reinforced LLM Self-Correction with Policy as Generative Verifier