STAR-1: Safer Alignment of Reasoning LLMs with 1K Data
