Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Neural Information Processing Systems 

Data are drawn from four sources (see Section 3.1) to maximize coverage, with a careful balance of prompt
