Hierarchical Dual-Strategy Unlearning for Biomedical and Healthcare Intelligence Using Imperfect and Privacy-Sensitive Medical Data

Yi Zhang, Tianxiang Xu, Zijian Li, Chao Zhang, Kunyu Zhang, Zhan Gao, Meinuo Li, Xiaohan Zhang, Qichao Qi, Bing Chen

arXiv.org Artificial Intelligence 

Abstract--Large language models (LLMs) exhibit exceptional performance but pose substantial privacy risks due to training data memorization, particularly within healthcare contexts involving imperfect or privacy-sensitive patient information. We present a hierarchical dual-strategy framework for selective knowledge unlearning that precisely removes specialized knowledge while preserving fundamental medical competencies. Our approach synergistically integrates geometric-constrained gradient updates to selectively modulate target parameters with concept-aware token-level interventions that distinguish between preservation-critical and unlearning-targeted tokens via a unified four-level medical concept hierarchy. Comprehensive evaluations on the MedMCQA (surgical) and MHQA (anxiety, depression, trauma) datasets demonstrate superior performance, achieving an 82.7% forgetting rate and 88.5% knowledge preservation. Notably, our framework maintains robust privacy guarantees while requiring modification of only 0.1% of parameters, addressing critical needs for regulatory compliance, auditability, and ethical standards in clinical research.

Large language models (LLMs) have transformed healthcare informatics, demonstrating remarkable capabilities in medical question-answering and clinical decision support. However, their deployment faces significant challenges when dealing with imperfect medical data, which is characteristically incomplete, insufficiently labelled, imbalanced, or contains annotation noise [4].
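The abstract's core mechanism, geometric-constrained gradient updates applied to a small (0.1%) parameter subset, can be illustrated with a minimal sketch. This is a hypothetical simplification, not the authors' exact algorithm: it performs gradient ascent on a forget-set objective, projects out the component aligned with the retain-set gradient (the geometric constraint), and restricts the update to the largest-magnitude gradient entries (the sparsity budget). The function name and the specific projection/masking choices are assumptions for illustration.

```python
import numpy as np

def selective_unlearning_step(params, forget_grad, retain_grad,
                              lr=0.1, sparsity=0.001):
    """Hypothetical sketch of one dual-strategy unlearning update:
    ascend on the forget-set loss, orthogonalized against the
    retain-set gradient, touching only a `sparsity` fraction of
    parameters (here ~0.1%, mirroring the figure in the abstract)."""
    # Geometric constraint: remove the component of the forget gradient
    # that points along the retain gradient, protecting retained knowledge.
    retain_dir = retain_grad / (np.linalg.norm(retain_grad) + 1e-12)
    projected = forget_grad - np.dot(forget_grad, retain_dir) * retain_dir
    # Sparsity mask: keep only the top-k entries by magnitude.
    k = max(1, int(sparsity * params.size))
    mask = np.zeros_like(params)
    mask[np.argsort(np.abs(projected))[-k:]] = 1.0
    # Gradient *ascent* on the forget objective, restricted to the mask.
    return params + lr * projected * mask, mask

# Toy usage: 10,000 parameters, so the mask selects 10 of them (0.1%).
rng = np.random.default_rng(0)
p = rng.normal(size=10_000)
new_p, m = selective_unlearning_step(p, rng.normal(size=10_000),
                                     rng.normal(size=10_000))
```

In this toy run only 10 of the 10,000 parameters change, and the projected update is orthogonal to the retain gradient by construction; the real method would operate on model weights and batched loss gradients rather than raw vectors.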
