PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models
Guangwei Li, Yuansen Zhang, Yinggui Wang, Shoumeng Yan, Lei Wang, Tao Wei
–arXiv.org Artificial Intelligence
The rapid development of large language models (LLMs) is redefining the landscape of human-computer interaction, and their integration into user-facing service applications is becoming increasingly prevalent. However, transmitting user data to cloud-based LLMs presents significant risks of data breaches and unauthorized access to personally identifiable information. In this paper, we propose a privacy-preservation pipeline for protecting privacy and sensitive information during interactions between users and LLMs in practical usage scenarios. We construct SensitiveQA, the first open-ended question-answering dataset focused on privacy. It comprises 57k interactions in Chinese and English, covering a diverse range of user-sensitive information within the conversations. Our proposed solution employs a multi-stage strategy that preemptively secures user information while preserving the response quality of cloud-based LLMs. Experimental validation underscores our method's efficacy in balancing privacy protection with robust interaction quality. The code and dataset are available at https://github.com/ligw1998/PRIV-QA.
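To make the general idea concrete, the following is a minimal sketch of a hide-and-restore flow of the kind the abstract describes: sensitive spans are masked locally before the query leaves the device, the cloud LLM sees only placeholders, and the original values are restored in the returned answer. All names here (mask_sensitive, restore_sensitive, private_query, the toy regex detector) are illustrative assumptions, not the paper's actual multi-stage method.

```python
# Hypothetical hide-and-restore pipeline for querying a cloud LLM.
# The detector, placeholders, and function names are illustrative only.
import re
from typing import Callable, Dict, Tuple

# Toy patterns standing in for a real sensitive-information detector.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.]+@[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}-\d{4}-\d{4}\b"),
}

def mask_sensitive(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace detected sensitive spans with placeholders and remember the mapping."""
    mapping: Dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            placeholder = f"<{label}_{i}>"
            mapping[placeholder] = match
            text = text.replace(match, placeholder, 1)
    return text, mapping

def restore_sensitive(text: str, mapping: Dict[str, str]) -> str:
    """Put the original values back into the cloud model's answer."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text

def private_query(question: str, cloud_llm: Callable[[str], str]) -> str:
    """Sanitize the question, query the cloud LLM, then restore the answer."""
    masked_question, mapping = mask_sensitive(question)
    masked_answer = cloud_llm(masked_question)  # only masked text leaves the device
    return restore_sensitive(masked_answer, mapping)

if __name__ == "__main__":
    echo_llm = lambda prompt: f"Received: {prompt}"  # stand-in for a cloud LLM call
    print(private_query("My email is alice@example.com, can you draft a reply?", echo_llm))
```

A real system would replace the regex detector with a learned sensitive-information identifier and handle formatting-preserving restoration, but the mask → query → restore structure is the core of this style of privacy-preserving QA.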
Feb-19-2025