LoRA-Leak: Membership Inference Attacks Against LoRA Fine-tuned Language Models

Ran, Delong, He, Xinlei, Cong, Tianshuo, Wang, Anyu, Li, Qi, Wang, Xiaoyun

Jul-25-2025–arXiv.org Artificial Intelligence

Language Models (LMs) typically adhere to a "pre-training and fine-tuning" paradigm, where a universal pre-trained model can be fine-tuned to cater to various specialized domains. Low-Rank Adaptation (LoRA) has gained the most widespread use in LM fine-tuning due to its lightweight computational cost and remarkable performance. Because the proportion of parameters tuned by LoRA is relatively small, there might be a misleading impression that the LoRA fine-tuning data is invulnerable to Membership Inference Attacks (MIAs). However, we identify that utilizing the pre-trained model can induce more information leakage, which is neglected by existing MIAs. Therefore, we introduce LoRA-Leak, a holistic evaluation framework for MIAs against the fine-tuning datasets of LMs. LoRA-Leak incorporates fifteen membership inference attacks, including ten existing MIAs, and five improved MIAs that leverage the pre-trained model as a reference. In experiments, we apply LoRA-Leak to three advanced LMs across three popular natural language processing tasks, demonstrating that LoRA-based fine-tuned LMs are still vulnerable to MIAs (e.g., 0.775 AUC under conservative fine-tuning settings). We also applied LoRA-Leak to different fine-tuning settings to understand the resulting privacy risks. We further explore four defenses and find that only dropout and excluding specific LM layers during fine-tuning effectively mitigate MIA risks while maintaining utility. We highlight that under the "pre-training and fine-tuning" paradigm, the existence of the pre-trained model makes MIA a more severe risk for LoRA-based LMs. We hope that our findings can provide guidance on data privacy protection for specialized LM providers.

large language model, machine learning, pre-trained model, (20 more...)

arXiv.org Artificial Intelligence

Jul-25-2025

arXiv.org PDF

Add feedback

Country:
- Asia
  - Bangladesh (0.04)
  - China
    - Guangdong Province > Guangzhou (0.04)
    - Hong Kong (0.04)
  - Indonesia > Bali (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - Singapore (0.04)
  - Thailand > Bangkok
    - Bangkok (0.04)
- North America
  - Canada > Ontario
    - Toronto (0.04)
  - Dominican Republic (0.04)
  - United States (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Performance Analysis > Accuracy (0.93)
  - Natural Language
    - Chatbot (1.00)
    - Large Language Model (1.00)