Privacy-Preserving Transformers: SwiftKey's Differential Privacy Implementation

Abdelrahman Abouelenin, Mohamed Abdelrehim, Raffy Fahim, Amr Hendy, Mohamed Afify

arXiv.org (Artificial Intelligence)

In this paper we train a transformer with differential privacy (DP) for language modeling in SwiftKey. We run multiple experiments to balance the trade-offs among model size, run-time speed, and accuracy. We show small, consistent gains in next-word prediction and accuracy, with a graceful increase in memory and run-time cost compared to the production GRU. This is achieved by scaling down a GPT-2 architecture to the required size and using a two-stage training process that builds a seed model on general data and then DP-fine-tunes it on typing data. The transformer is integrated using ONNX, offering both flexibility and efficiency.
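
A minimal sketch of the two-stage recipe is given below, covering only the second (DP fine-tuning) stage. The paper does not name its DP library or any hyperparameters; PyTorch and Opacus, the TinyCausalLM class, the vocabulary and layer sizes, and the noise/clipping values are all illustrative assumptions, not the authors' configuration.

    # Stage-two DP fine-tuning sketch, assuming PyTorch + Opacus (the paper
    # does not name its DP library); all sizes and DP knobs are placeholders.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset
    from opacus import PrivacyEngine

    class TinyCausalLM(nn.Module):
        # A scaled-down GPT-style block using only layers Opacus supports
        # out of the box (Embedding, Linear, LayerNorm); attention is
        # computed manually so no unsupported modules are involved.
        def __init__(self, vocab=16000, d=128, heads=4, max_len=64):
            super().__init__()
            self.tok = nn.Embedding(vocab, d)
            self.pos = nn.Embedding(max_len, d)
            self.qkv = nn.Linear(d, 3 * d)
            self.out = nn.Linear(d, d)
            self.ln1, self.ln2 = nn.LayerNorm(d), nn.LayerNorm(d)
            self.ff = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(),
                                    nn.Linear(4 * d, d))
            self.head = nn.Linear(d, vocab, bias=False)
            self.heads = heads

        def forward(self, ids):
            B, T = ids.shape
            pos = torch.arange(T, device=ids.device).expand(B, T)
            x = self.tok(ids) + self.pos(pos)
            h = self.ln1(x)
            q, k, v = (t.view(B, T, self.heads, -1).transpose(1, 2)
                       for t in self.qkv(h).chunk(3, dim=-1))
            att = (q @ k.transpose(-2, -1)) / (q.size(-1) ** 0.5)
            causal = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                           device=ids.device), 1)
            att = att.masked_fill(causal, float("-inf")).softmax(-1)
            x = x + self.out((att @ v).transpose(1, 2).reshape(B, T, -1))
            x = x + self.ff(self.ln2(x))
            return self.head(x)

    model = TinyCausalLM()  # stage 1 (seed training on general data) omitted
    ids = torch.randint(0, 16000, (256, 33))  # stand-in for private typing data
    loader = DataLoader(TensorDataset(ids), batch_size=32)
    optim = torch.optim.SGD(model.parameters(), lr=0.1)

    engine = PrivacyEngine()
    model, optim, loader = engine.make_private(
        module=model, optimizer=optim, data_loader=loader,
        noise_multiplier=1.0,  # placeholder privacy/utility knob
        max_grad_norm=1.0,     # per-example gradient clipping bound
    )

    loss_fn = nn.CrossEntropyLoss()
    for (batch,) in loader:
        optim.zero_grad()
        logits = model(batch[:, :-1])          # predict each next token
        loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                       batch[:, 1:].reshape(-1))
        loss.backward()   # Opacus clips each example's gradient ...
        optim.step()      # ... and adds calibrated Gaussian noise
    print(f"epsilon spent: {engine.get_epsilon(delta=1e-5):.2f}")

In this DP-SGD setup, noise_multiplier and max_grad_norm control the privacy/utility trade-off the abstract alludes to: larger noise or tighter clipping strengthens the privacy guarantee at the cost of next-word-prediction quality.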

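The ONNX integration mentioned in the abstract could look roughly like the following, continuing from the sketch above; torch.onnx.export and onnxruntime are standard tools, but the file name, opset version, and fixed 32-token context are assumptions, and the Opacus training wrapper is unwrapped before export.

    # Hedged ONNX export/inference sketch; paths and shapes are placeholders.
    import torch
    import onnxruntime as ort

    lm = model._module if hasattr(model, "_module") else model  # drop Opacus wrapper
    lm.eval()
    example = torch.randint(0, 16000, (1, 32))   # one 32-token context
    torch.onnx.export(lm, (example,), "swiftkey_lm.onnx",
                      input_names=["input_ids"], output_names=["logits"],
                      opset_version=17)  # fixed shape keeps the trace simple

    sess = ort.InferenceSession("swiftkey_lm.onnx")
    logits = sess.run(["logits"], {"input_ids": example.numpy()})[0]
    print("predicted next token id:", logits[0, -1].argmax())

Exporting a fixed-shape graph sidesteps dynamic-shape tracing issues and matches the short, bounded typing contexts the keyboard setting implies.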