SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
