Privacy-Preserving Transformers: SwiftKey's Differential Privacy Implementation

Abdelrahman Abouelenin, Mohamed Abdelrehim, Raffy Fahim, Amr Hendy, Mohamed Afify

arXiv.org (Artificial Intelligence)

In this paper we train a transformer with differential privacy (DP) for language modeling in SwiftKey. We run multiple experiments to balance the trade-offs among model size, run-time speed, and accuracy. We show small, consistent gains in next-word prediction and accuracy, with a graceful increase in memory and run-time cost compared to the production GRU. This is achieved by scaling down a GPT-2 architecture to the required size and using a two-stage training process that builds a seed model on general data and then DP-fine-tunes it on typing data. The transformer is integrated using ONNX, offering both flexibility and efficiency.
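
A minimal sketch of the two-stage recipe is given below, covering only the second (DP fine-tuning) stage. The paper does not name its DP library or any hyperparameters; PyTorch and Opacus, the TinyCausalLM class, the vocabulary and layer sizes, and the noise/clipping values are all illustrative assumptions, not the authors' configuration.

    # Stage-two DP fine-tuning sketch, assuming PyTorch + Opacus (the paper
    # does not name its DP library); all sizes and DP knobs are placeholders.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset
    from opacus import PrivacyEngine

    class TinyCausalLM(nn.Module):
        # A scaled-down GPT-style block using only layers Opacus supports
        # out of the box (Embedding, Linear, LayerNorm); attention is
        # computed manually so no unsupported modules are involved.
        def __init__(self, vocab=16000, d=128, heads=4, max_len=64):
            super().__init__()
            self.tok = nn.Embedding(vocab, d)
            self.pos = nn.Embedding(max_len, d)
            self.qkv = nn.Linear(d, 3 * d)
            self.out = nn.Linear(d, d)
            self.ln1, self.ln2 = nn.LayerNorm(d), nn.LayerNorm(d)
            self.ff = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(),
                                    nn.Linear(4 * d, d))
            self.head = nn.Linear(d, vocab, bias=False)
            self.heads = heads

        def forward(self, ids):
            B, T = ids.shape
            pos = torch.arange(T, device=ids.device).expand(B, T)
            x = self.tok(ids) + self.pos(pos)
            h = self.ln1(x)
            q, k, v = (t.view(B, T, self.heads, -1).transpose(1, 2)
                       for t in self.qkv(h).chunk(3, dim=-1))
            att = (q @ k.transpose(-2, -1)) / (q.size(-1) ** 0.5)
            causal = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                           device=ids.device), 1)
            att = att.masked_fill(causal, float("-inf")).softmax(-1)
            x = x + self.out((att @ v).transpose(1, 2).reshape(B, T, -1))
            x = x + self.ff(self.ln2(x))
            return self.head(x)

    model = TinyCausalLM()  # stage 1 (seed training on general data) omitted
    ids = torch.randint(0, 16000, (256, 33))  # stand-in for private typing data
    loader = DataLoader(TensorDataset(ids), batch_size=32)
    optim = torch.optim.SGD(model.parameters(), lr=0.1)

    engine = PrivacyEngine()
    model, optim, loader = engine.make_private(
        module=model, optimizer=optim, data_loader=loader,
        noise_multiplier=1.0,  # placeholder privacy/utility knob
        max_grad_norm=1.0,     # per-example gradient clipping bound
    )

    loss_fn = nn.CrossEntropyLoss()
    for (batch,) in loader:
        optim.zero_grad()
        logits = model(batch[:, :-1])          # predict each next token
        loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                       batch[:, 1:].reshape(-1))
        loss.backward()   # Opacus clips each example's gradient ...
        optim.step()      # ... and adds calibrated Gaussian noise
    print(f"epsilon spent: {engine.get_epsilon(delta=1e-5):.2f}")

In this DP-SGD setup, noise_multiplier and max_grad_norm control the privacy/utility trade-off the abstract alludes to: larger noise or tighter clipping strengthens the privacy guarantee at the cost of next-word-prediction quality.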

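The ONNX integration mentioned in the abstract could look roughly like the following, continuing from the sketch above; torch.onnx.export and onnxruntime are standard tools, but the file name, opset version, and fixed 32-token context are assumptions, and the Opacus training wrapper is unwrapped before export.

    # Hedged ONNX export/inference sketch; paths and shapes are placeholders.
    import torch
    import onnxruntime as ort

    lm = model._module if hasattr(model, "_module") else model  # drop Opacus wrapper
    lm.eval()
    example = torch.randint(0, 16000, (1, 32))   # one 32-token context
    torch.onnx.export(lm, (example,), "swiftkey_lm.onnx",
                      input_names=["input_ids"], output_names=["logits"],
                      opset_version=17)  # fixed shape keeps the trace simple

    sess = ort.InferenceSession("swiftkey_lm.onnx")
    logits = sess.run(["logits"], {"input_ids": example.numpy()})[0]
    print("predicted next token id:", logits[0, -1].argmax())

Exporting a fixed-shape graph sidesteps dynamic-shape tracing issues and matches the short, bounded typing contexts the keyboard setting implies.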