ProDiF: Protecting Domain-Invariant Features to Secure Pre-Trained Models Against Extraction
Zhou, Tong, Duan, Shijin, Liu, Gaowen, Fleming, Charles, Kompella, Ramana Rao, Ren, Shaolei, Xu, Xiaolin
–arXiv.org Artificial Intelligence
Pre-trained models are valuable intellectual property, capturing both domainspecific and domain-invariant features within their weight spaces. However, model extraction attacks threaten these assets by enabling unauthorized source-domain inference and facilitating cross-domain transfer via the exploitation of domaininvariant features. In this work, we introduce ProDiF, a novel framework that leverages targeted weight space manipulation to secure pre-trained models against extraction attacks. ProDiF quantifies the transferability of filters and perturbs the weights of critical filters in unsecured memory, while preserving actual critical weights in a Trusted Execution Environment (TEE) for authorized users. A bilevel optimization further ensures resilience against adaptive fine-tuning attacks. Experimental results show that ProDiF reduces source-domain accuracy to nearrandom levels and decreases cross-domain transferability by 74.65%, providing robust protection for pre-trained models. This work offers comprehensive protection for pre-trained DNN models and highlights the potential of weight space manipulation as a novel approach to model security. Pre-trained deep neural networks (DNNs) excel across diverse tasks and encapsulate rich information in their weight spaces, making them valuable intellectual property (IP) Xue et al. (2021). However, this richness also makes them vulnerable to model extraction attacks Sun et al. (2021); Dubey et al. (2022), enabling attackers to perform source-domain inference using the extracted models. To counter this, prior works use Trusted Execution Environments (TEEs) to protect critical model weights, ensuring attackers obtain incomplete parameters, degrading source-domain performance Chakraborty et al. (2020); Zhou et al. (2023).
arXiv.org Artificial Intelligence
Mar-17-2025
- Genre:
- Research Report > New Finding (0.48)
- Industry:
- Information Technology > Security & Privacy (0.68)
- Technology: