Query-Based Adversarial Prompt Generation
Neural Information Processing Systems
These attacks allow an adversary to cause an otherwise "aligned" model (one that typically refuses requests such as "how do I build a bomb?" or "swear at me!") to comply with such requests, or
Oct-10-2025, 20:00:40 GMT
- Genre:
- Research Report > Experimental Study (1.00)
- Industry:
- Information Technology > Security & Privacy (0.68)