Hierarchical Dual-Strategy Unlearning for Biomedical and Healthcare Intelligence Using Imperfect and Privacy-Sensitive Medical Data

Yi Zhang, Tianxiang Xu, Zijian Li, Chao Zhang, Kunyu Zhang, Zhan Gao, Meinuo Li, Xiaohan Zhang, Qichao Qi, Bing Chen

arXiv.org Artificial Intelligence 

Abstract--Large language models (LLMs) exhibit exceptional performance but pose substantial privacy risks due to training data memorization, particularly within healthcare contexts involving imperfect or privacy-sensitive patient information. We present a hierarchical dual-strategy framework for selective knowledge unlearning that precisely removes specialized knowledge while preserving fundamental medical competencies. Our approach synergistically integrates geometric-constrained gradient updates to selectively modulate target parameters with concept-aware token-level interventions that distinguish between preservation-critical and unlearning-targeted tokens via a unified four-level medical concept hierarchy. Comprehensive evaluations on the MedMCQA (surgical) and MHQA (anxiety, depression, trauma) datasets demonstrate superior performance, achieving an 82.7% forgetting rate and 88.5% knowledge preservation. Notably, our framework maintains robust privacy guarantees while requiring modification of only 0.1% of parameters, addressing critical needs for regulatory compliance, auditability, and ethical standards in clinical research.

Large language models (LLMs) have transformed healthcare informatics, demonstrating remarkable capabilities in medical question-answering and clinical decision support. However, their deployment faces significant challenges when dealing with imperfect medical data, which is characteristically incomplete, insufficiently labelled, imbalanced, or contains annotation noise [4].
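The abstract's core mechanism, geometric-constrained gradient updates applied to a small (0.1%) parameter subset, can be illustrated with a minimal sketch. This is a hypothetical simplification, not the authors' exact algorithm: it performs gradient ascent on a forget-set objective, projects out the component aligned with the retain-set gradient (the geometric constraint), and restricts the update to the largest-magnitude gradient entries (the sparsity budget). The function name and the specific projection/masking choices are assumptions for illustration.

```python
import numpy as np

def selective_unlearning_step(params, forget_grad, retain_grad,
                              lr=0.1, sparsity=0.001):
    """Hypothetical sketch of one dual-strategy unlearning update:
    ascend on the forget-set loss, orthogonalized against the
    retain-set gradient, touching only a `sparsity` fraction of
    parameters (here ~0.1%, mirroring the figure in the abstract)."""
    # Geometric constraint: remove the component of the forget gradient
    # that points along the retain gradient, protecting retained knowledge.
    retain_dir = retain_grad / (np.linalg.norm(retain_grad) + 1e-12)
    projected = forget_grad - np.dot(forget_grad, retain_dir) * retain_dir
    # Sparsity mask: keep only the top-k entries by magnitude.
    k = max(1, int(sparsity * params.size))
    mask = np.zeros_like(params)
    mask[np.argsort(np.abs(projected))[-k:]] = 1.0
    # Gradient *ascent* on the forget objective, restricted to the mask.
    return params + lr * projected * mask, mask

# Toy usage: 10,000 parameters, so the mask selects 10 of them (0.1%).
rng = np.random.default_rng(0)
p = rng.normal(size=10_000)
new_p, m = selective_unlearning_step(p, rng.normal(size=10_000),
                                     rng.normal(size=10_000))
```

In this toy run only 10 of the 10,000 parameters change, and the projected update is orthogonal to the retain gradient by construction; the real method would operate on model weights and batched loss gradients rather than raw vectors.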
