Chasing Moving Targets with Online Self-Play Reinforcement Learning for Safer Language Models
