On the Impact of Noise in Differentially Private Text Rewriting
Meisenbacher, Stephen, Chevli, Maulik, Matthes, Florian
– arXiv.org Artificial Intelligence
The field of text privatization often leverages the notion of Differential Privacy (DP) to provide formal guarantees in the rewriting or obfuscation of sensitive textual data. A common and nearly ubiquitous form of DP application necessitates the addition of calibrated noise to vector representations of text, either at the data- or model-level, which is governed by the privacy parameter $\varepsilon$. However, noise addition almost inevitably leads to considerable utility loss, highlighting one major drawback of DP in NLP. In this work, we introduce a new sentence-infilling privatization technique, and we use this method to explore the effect of noise in DP text rewriting. We empirically demonstrate that non-DP privatization techniques excel in utility preservation and can find an acceptable empirical privacy-utility trade-off, yet cannot outperform DP methods in empirical privacy protection. Our results highlight the significant impact of noise in current DP rewriting mechanisms, leading to a discussion of the merits and challenges of DP in NLP, as well as the opportunities that non-DP methods present.
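The noise-addition mechanism described in the abstract can be illustrated with a minimal sketch. The snippet below is not the authors' method; it shows one common construction from the metric-DP text literature, where an embedding vector is perturbed with multivariate Laplace-style noise whose density is proportional to $\exp(-\varepsilon \lVert z \rVert)$, sampled as a uniform direction times a Gamma-distributed magnitude. Function and parameter names are illustrative.

```python
import numpy as np

def noisy_embedding(vec, epsilon, rng=None):
    """Perturb an embedding vector with d-dimensional Laplace-style noise.

    The noise density is proportional to exp(-epsilon * ||z||), a standard
    construction in metric-DP text privatization. Smaller epsilon means
    stronger privacy and larger expected noise magnitude (d / epsilon).
    """
    rng = np.random.default_rng() if rng is None else rng
    d = vec.shape[0]
    # Direction: uniform on the unit sphere (normalized Gaussian sample).
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    # Magnitude: Gamma(shape=d, scale=1/epsilon) yields the desired density.
    magnitude = rng.gamma(shape=d, scale=1.0 / epsilon)
    return vec + magnitude * direction
```

Lowering $\varepsilon$ inflates the expected noise norm ($d/\varepsilon$), which is precisely the utility-degrading effect the paper investigates: the perturbed vector may land near a very different word or sentence representation.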
Jan-31-2025