Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Neural Information Processing Systems
As large language models (LLMs) become increasingly prevalent across many real-world applications, understanding and enhancing their robustness to adversarial attacks is of paramount importance. Existing methods for identifying adversarial prompts tend to focus on specific domains, lack diversity, or require extensive human annotations.
May-30-2025, 12:37:53 GMT
- Country:
- Europe (1.00)
- North America > Canada (0.28)
- North America > United States (0.28)
- Genre:
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (1.00)
- Industry:
- Government > Military (1.00)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (1.00)
- Technology: