Backdoor Cleaning without External Guidance in MLLM Fine-tuning
Neural Information Processing Systems
Multimodal Large Language Models (MLLMs) are increasingly deployed in fine-tuning-as-a-service (FTaaS) settings, where user-submitted datasets adapt general-purpose models to downstream tasks. This flexibility, however, introduces serious security risks, as malicious fine-tuning can implant backdoors into MLLMs with minimal effort. In this paper, we observe that backdoor triggers systematically disrupt cross-modal processing by causing abnormal attention concentration on non-semantic regions, a phenomenon we term attention collapse. Based on this insight, we propose Believe Your Eyes (BYE), a data filtering framework that leverages attention entropy patterns as self-supervised signals to identify and filter backdoor samples. BYE operates via a three-stage pipeline: (1) extracting attention maps using the fine-tuned model, (2) computing entropy scores and profiling sensitive layers via bimodal separation, and (3) performing unsupervised clustering to remove suspicious samples. Unlike prior defenses, BYE requires no clean supervision, auxiliary labels, or model modifications. Extensive experiments across various datasets, models, and diverse trigger types validate BYE's effectiveness: it achieves near-zero attack success rates while maintaining clean-task performance, offering a robust and generalizable solution against backdoor threats in MLLMs.
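The entropy-scoring and clustering stages can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `attention_entropy` and `flag_low_entropy` are hypothetical, the attention maps are assumed to be already extracted and flattened to one row per sample, the layer-profiling step is omitted, and a plain one-dimensional 2-means split stands in for whatever clustering method BYE actually uses.

```python
import numpy as np


def attention_entropy(attn, eps=1e-12):
    """Shannon entropy (in nats) of each attention map.

    attn: array of shape (n_samples, n_patches) with non-negative weights.
    A map concentrated on a few patches yields low entropy.
    """
    p = attn / attn.sum(axis=1, keepdims=True)  # normalize rows to distributions
    return -(p * np.log(p + eps)).sum(axis=1)


def flag_low_entropy(entropy, iters=50):
    """Split entropy scores into two groups with 1-D 2-means (an assumed
    stand-in for the clustering stage) and return a boolean mask that
    marks the low-entropy group as suspicious."""
    lo, hi = float(entropy.min()), float(entropy.max())
    if lo == hi:
        # All scores identical: nothing to separate, flag nothing.
        return np.zeros(entropy.shape, dtype=bool)
    for _ in range(iters):
        low = np.abs(entropy - lo) <= np.abs(entropy - hi)
        lo, hi = entropy[low].mean(), entropy[~low].mean()
    return np.abs(entropy - lo) <= np.abs(entropy - hi)


if __name__ == "__main__":
    # Synthetic stand-ins: diffuse maps for clean samples, collapsed maps
    # (most mass on one patch) for samples carrying a trigger.
    clean = np.full((90, 64), 1.0 / 64)
    collapsed = np.full((10, 64), 0.05 / 63)
    collapsed[:, 0] = 0.95
    scores = attention_entropy(np.vstack([clean, collapsed]))
    suspicious = flag_low_entropy(scores)
    print("flagged samples:", int(suspicious.sum()))
```

In this toy setting the two groups are far apart by construction. On real fine-tuning data the separation depends on which layer's attention is scored, which is the role the abstract assigns to the layer-profiling stage.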
Jun-15-2026, 16:30:10 GMT
- Country:
  - North America (0.67)
  - Asia > China (0.28)
- Genre:
  - Research Report
    - New Finding (1.00)
    - Experimental Study (1.00)
- Industry:
  - Information Technology > Security & Privacy (1.00)
  - Health & Medicine (0.93)
- Technology:
  - Information Technology > Artificial Intelligence
    - Vision (1.00)
    - Natural Language > Large Language Model (1.00)
    - Representation & Reasoning (0.92)
    - Machine Learning
      - Neural Networks > Deep Learning (1.00)
      - Statistical Learning > Clustering (0.66)