FusionDP: Foundation Model-Assisted Differentially Private Learning for Partially Sensitive Features
Zeng, Linghui, Liu, Ruixuan, Sarkar, Atiquer Rahman, Jiang, Xiaoqian, Ho, Joyce C., Xiong, Li
arXiv.org Artificial Intelligence
Ensuring the privacy of sensitive training data is crucial in privacy-preserving machine learning. However, in practical scenarios, privacy protection may be required for only a subset of features. For instance, in ICU data, demographic attributes like age and gender pose higher privacy risks due to their re-identification potential, whereas raw lab results are generally less sensitive. Traditional DP-SGD enforces privacy protection on all features of every sample, leading to excessive noise injection and significant utility degradation. We propose FusionDP, a two-step framework that enhances model utility under feature-level differential privacy. First, FusionDP leverages large foundation models to impute sensitive features from the non-sensitive features, treating these models as external priors that provide high-quality estimates of sensitive attributes without accessing the true values during model training. Second, we introduce a modified DP-SGD algorithm that trains models on both original and imputed features while formally preserving the privacy of the original sensitive features. We evaluate FusionDP on two modalities: a sepsis prediction task on tabular data from PhysioNet and a clinical note classification task from MIMIC-III. Compared with privacy-preserving baselines, FusionDP significantly improves model performance while maintaining rigorous feature-level privacy, demonstrating the potential of foundation model-driven imputation to improve the privacy-utility trade-off across modalities.
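For context on the noise-injection cost the abstract mentions, here is a minimal NumPy sketch of one step of standard DP-SGD: clip each sample's gradient, sum, add Gaussian noise, then average. This is the traditional baseline, not FusionDP's modified algorithm, whose details the abstract does not give. The function names `clip_per_sample` and `dp_sgd_step` and all parameter names are hypothetical, chosen for illustration only.

```python
import numpy as np


def clip_per_sample(grads, clip_norm):
    """Scale each row of `grads` so its L2 norm is at most `clip_norm`."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Rows already inside the ball get scale 1.0; the floor avoids division by zero.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale


def dp_sgd_step(weights, per_sample_grads, lr, clip_norm, noise_multiplier, rng):
    """One standard DP-SGD update (illustrative; not the FusionDP variant).

    per_sample_grads has shape (batch_size, n_params): one gradient per sample.
    """
    clipped = clip_per_sample(per_sample_grads, clip_norm)
    # Gaussian noise with standard deviation noise_multiplier * clip_norm
    # is added to the summed gradient, covering every parameter.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=weights.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / per_sample_grads.shape[0]
    return weights - lr * noisy_mean


rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.3, 0.4]])  # toy per-sample gradients
new_w = dp_sgd_step(np.zeros(2), grads, lr=1.0, clip_norm=1.0,
                    noise_multiplier=1.1, rng=rng)
```

Because the noise scale depends on the clipping norm and applies to the whole gradient, every feature pays the privacy cost in this baseline, even when only a few are sensitive. That is the overhead the abstract attributes to traditional DP-SGD.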
Nov-7-2025
- Country:
  - Asia > Middle East
    - Israel (0.04)
  - North America
    - Canada > Manitoba (0.04)
    - United States
      - Maryland > Montgomery County
        - Bethesda (0.04)
      - Texas (0.04)
- Genre:
  - Research Report
    - Experimental Study (0.93)
    - New Finding (1.00)