DP-NMT: Scalable Differentially-Private Machine Translation
Igamberdiev, Timour, Vu, Doan Nam Long, Künnecke, Felix, Yu, Zhuo, Holmer, Jannik, Habernal, Ivan
–arXiv.org Artificial Intelligence
Neural machine translation (NMT) is a widely popular text generation task, yet there is a considerable research gap in the development of privacy-preserving NMT models, despite significant data privacy concerns for NMT systems. Differentially private stochastic gradient descent (DP-SGD) is a popular method for training machine learning models with concrete privacy guarantees; however, the implementation specifics of training a model with DP-SGD are not always clarified in existing models, with differing software libraries used and code bases not always being public, leading to reproducibility issues. To tackle this, we introduce DP-NMT, an open-source framework for carrying out research on privacy-preserving NMT with DP-SGD, bringing together numerous models, datasets, and evaluation metrics in one systematic software package. Our goal is to provide a platform for researchers to advance the development of privacy-preserving NMT systems, keeping the specific details of the DP-SGD algorithm transparent and intuitive to implement. We run a set of experiments on datasets from both general and privacy-related domains to demonstrate our framework in use. We make our framework publicly available and welcome feedback from the community.
arXiv.org Artificial Intelligence
Nov-24-2023
- Country:
- Oceania > Australia
- North America
- Dominican Republic (0.05)
- United States > Washington
- King County > Seattle (0.04)
- Canada > Ontario
- Toronto (0.04)
- Europe
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Germany > Hesse
- Darmstadt Region > Darmstadt (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Ireland > Leinster
- Asia
- China > Hong Kong (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.14)
- Genre:
- Research Report (0.64)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology: