SRMIR: Shadow Reward Models Based on Introspective Reasoning for LLM Alignment
