PRM-Free Security Alignment of Large Models via Red Teaming and Adversarial Training
