Protecting Your LLMs with Information Bottleneck
Neural Information Processing Systems
The advent of large language models (LLMs) has revolutionized the field of natural language processing, yet these models can be attacked to produce harmful content. Despite efforts to ethically align LLMs, such alignments are often fragile and can be circumvented by jailbreaking attacks through optimized or manually crafted adversarial prompts.