Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

Neural Information Processing Systems 

Current methods for identifying adversarial prompts aimed at "attacking" LLMs and eliciting undesirable outputs are limited by several factors.
