Vaccine: Perturbation-aware Alignment for Large Language Models against Harmful Fine-tuning Attack
Neural Information Processing Systems
Inspired by our findings, we propose Vaccine, a perturbation-aware alignment technique to mitigate the security risk of user fine-tuning.
Oct-10-2025, 08:30:48 GMT
- Country:
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)
- Industry:
- Health & Medicine > Therapeutic Area
- Immunology (0.40)
- Psychiatry/Psychology (0.46)
- Vaccines (0.50)
- Technology: