Privacy Meets Explainability: Managing Confidential Data and Transparency Policies in LLM-Empowered Science
Yashothara Shanmugarasa, Shidong Pan, Ming Ding, Dehai Zhao, Thierry Rakotoarivelo
arXiv.org Artificial Intelligence
As Large Language Models (LLMs) become integral to scientific workflows, concerns have emerged over the confidentiality and ethical handling of the data they process. This paper explores, from scientists' perspectives, the data exposure risks of LLM-powered scientific tools, which can inadvertently leak confidential information, including intellectual property and proprietary data. We propose "DataShield", a framework designed to detect confidential data leaks, summarize privacy policies, and visualize data flow, helping ensure alignment with organizational policies and procedures. Our approach aims to inform scientists about data handling practices, enabling them to make informed decisions and protect sensitive information. User studies with scientists are underway to evaluate the framework's usability, trustworthiness, and effectiveness in tackling real-world privacy challenges.
Apr-15-2025
- Country:
  - Europe (0.93)
  - North America > United States
    - California (0.28)
- Genre:
  - Research Report > Experimental Study (0.68)
- Industry:
  - Law (1.00)
  - Information Technology > Security & Privacy (1.00)
  - Health & Medicine > Pharmaceuticals & Biotechnology (0.94)
- Technology: