PrLM: Learning Explicit Reasoning for Personalized RAG via Contrastive Reward Optimization
