Revealing Secrets From Pre-trained Models
Rafi, Mujahid Al; Feng, Yuan; Jeon, Hyeran
arXiv.org Artificial Intelligence
As training deep learning models on large datasets becomes increasingly costly, transfer learning has been widely adopted in many emerging deep learning algorithms. Transformer models such as BERT dominate natural language processing and use transfer learning as the de facto standard training method. A few big-data companies release models pre-trained on a handful of popular datasets, which end users and researchers then fine-tune with their own datasets. Transfer learning significantly reduces the time and effort of training models, but it raises security concerns. In this paper, we report a new observation: pre-trained models and their fine-tuned counterparts have highly similar weight values. We also demonstrate that vendor-specific computing patterns exist even for the same models. Building on these findings, we propose a new model extraction attack that first uses vendor-specific computing patterns to reveal the model architecture and the pre-trained model used by a black-box victim model, and then estimates the entire set of model weights based on the weight-value similarity between the fine-tuned model and the pre-trained model. We also show that this weight similarity can be leveraged to make model extraction more feasible through a novel weight extraction pruning.
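To make the weight-similarity observation concrete, the sketch below compares a pre-trained model's weights with a fine-tuned copy, layer by layer. It is only an illustration under stated assumptions: the abstract does not say which similarity metric the authors use, so cosine similarity is an assumption here, and the toy weights, the layer names, and the helper functions (`cosine_similarity`, `layer_similarities`, `make_toy_models`) are hypothetical and not taken from the paper. Fine-tuning is simulated as a small random perturbation of the pre-trained weights.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length weight vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def layer_similarities(pretrained, finetuned):
    """Per-layer cosine similarity for layers that exist in both models with the same size."""
    return {
        name: cosine_similarity(pretrained[name], finetuned[name])
        for name in pretrained
        if name in finetuned and len(pretrained[name]) == len(finetuned[name])
    }


def make_toy_models(seed=0, size=256, drift=0.01):
    """Build hypothetical flattened weights: a base model, a slightly drifted copy
    (standing in for fine-tuning), and an independently initialized model."""
    rng = random.Random(seed)
    pretrained = {
        f"layer{i}.weight": [rng.gauss(0.0, 1.0) for _ in range(size)]
        for i in range(3)
    }
    finetuned = {
        name: [w + rng.gauss(0.0, drift) for w in ws]
        for name, ws in pretrained.items()
    }
    unrelated = {
        name: [rng.gauss(0.0, 1.0) for _ in ws]
        for name, ws in pretrained.items()
    }
    return pretrained, finetuned, unrelated


pretrained, finetuned, unrelated = make_toy_models()
sim_finetuned = layer_similarities(pretrained, finetuned)
sim_unrelated = layer_similarities(pretrained, unrelated)

# Print each layer's similarity to the fine-tuned copy and to the unrelated model.
for name in sim_finetuned:
    print(name, round(sim_finetuned[name], 4), round(sim_unrelated[name], 4))
```

In this toy setup the drifted copy stays close to the base weights while the independently initialized model does not. That gap is the kind of signal an attacker could, in principle, use to guess which pre-trained model a victim started from. Real fine-tuned weights would need to be loaded from actual checkpoints to test the claim.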
Jul-19-2022
- Country:
- North America > United States > California > Merced County > Merced (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology: