Privacy-Preserving Transformers: SwiftKey's Differential Privacy Implementation
Abouelenin, Abdelrahman, Abdelrehim, Mohamed, Fahim, Raffy, Hendy, Amr, Afify, Mohamed
–arXiv.org Artificial Intelligence
In this paper, we train a transformer with differential privacy (DP) for language modeling in SwiftKey. We run multiple experiments to balance the trade-off between model size, run-time speed, and accuracy. We show small but consistent gains in next-word prediction and accuracy, with a graceful increase in memory and run-time cost compared to the production GRU. These gains come from scaling down a GPT2 architecture to fit the required size and from a two-stage training process that first builds a seed model on general data and then fine-tunes it with DP on typing data. The transformer is integrated using ONNX, which offers both flexibility and efficiency.
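To make the DP fine-tuning stage more concrete, the sketch below shows one update step of DP-SGD, a widely used way to train with differential privacy: each example's gradient is clipped to a fixed L2 norm, the clipped gradients are summed, Gaussian noise is added, and the result is averaged. The abstract does not state which DP algorithm the authors used, so this is an assumption for illustration only. The names `clip_by_l2_norm`, `dp_sgd_step`, `max_norm`, and `noise_multiplier` are hypothetical, and the sketch omits privacy accounting.

```python
import math
import random


def clip_by_l2_norm(grad, max_norm):
    """Scale grad down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One illustrative DP-SGD update (assumed algorithm, not taken from the paper)."""
    batch_size = len(per_example_grads)
    # Bound each example's influence by clipping its gradient.
    clipped = [clip_by_l2_norm(g, max_norm) for g in per_example_grads]
    # Sum the clipped gradients coordinate by coordinate.
    summed = [sum(column) for column in zip(*clipped)]
    # Gaussian noise with standard deviation proportional to the clipping norm.
    sigma = noise_multiplier * max_norm
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch_size for s in summed]
    # Plain gradient-descent update using the noisy averaged gradient.
    return [p - lr * g for p, g in zip(params, noisy_mean)]
```

In a two-stage setup like the one described, a step of this kind would apply only during fine-tuning on typing data; the seed model trained on general data would use ordinary, non-private updates.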
May-12-2025
- Country:
- North America > United States (0.28)
- Genre:
- Research Report (0.64)
- Industry:
- Information Technology > Security & Privacy (0.94)
- Technology: