On the Impact of Noise in Differentially Private Text Rewriting
Stephen Meisenbacher, Maulik Chevli, Florian Matthes
arXiv.org Artificial Intelligence
The field of text privatization often leverages the notion of $\textit{Differential Privacy}$ (DP) to provide formal guarantees in the rewriting or obfuscation of sensitive textual data. A nearly ubiquitous form of DP application requires adding calibrated noise to vector representations of text, at either the data or the model level, with the amount of noise governed by the privacy parameter $\varepsilon$. However, noise addition almost certainly leads to considerable utility loss, which highlights a major drawback of DP in NLP. In this work, we introduce a new sentence infilling privatization technique and use it to explore the effect of noise in DP text rewriting. We empirically demonstrate that non-DP privatization techniques excel in utility preservation and can find an acceptable empirical privacy-utility trade-off, yet cannot outperform DP methods in empirical privacy protection. Our results highlight the significant impact of noise in current DP rewriting mechanisms, leading to a discussion of the merits and challenges of DP in NLP, as well as the opportunities that non-DP methods present.
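To make the role of $\varepsilon$ concrete, the following is a minimal sketch of calibrated noise addition to a text vector using the standard Laplace mechanism. It is a generic illustration and not the sentence infilling technique introduced in the paper. The function names, the per-coordinate clipping bound `bound`, and the toy four-dimensional vector are all assumptions made for this example.

```python
import random


def clip(vec, bound):
    """Clip each coordinate to [-bound, bound] so the sensitivity is finite."""
    return [max(-bound, min(bound, x)) for x in vec]


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_vector(vec, epsilon, bound=1.0, rng=None):
    """Hypothetical helper: add Laplace noise calibrated to epsilon.

    Assumption: any two clipped vectors count as neighbours, so the
    L1 sensitivity is 2 * bound * dimension.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    clipped = clip(vec, bound)
    sensitivity = 2.0 * bound * len(clipped)
    scale = sensitivity / epsilon  # smaller epsilon -> larger noise
    return [x + laplace_noise(scale, rng) for x in clipped]


def mean_abs_error(vec, epsilon, trials=2000, seed=0):
    """Average per-coordinate distortion over repeated privatizations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        noisy = privatize_vector(vec, epsilon, rng=rng)
        total += sum(abs(a - b) for a, b in zip(noisy, vec)) / len(vec)
    return total / trials


if __name__ == "__main__":
    toy_vector = [0.2, -0.5, 0.9, 0.0]  # stand-in for a text embedding
    for eps in (0.5, 1.0, 10.0):
        print(eps, round(mean_abs_error(toy_vector, eps), 3))
```

Under these assumptions the expected per-coordinate distortion equals the noise scale, `2 * bound * d / epsilon`. It therefore grows with the embedding dimension `d` and shrinks as $\varepsilon$ increases, which is one way to see why noise addition can cost so much utility at strict privacy levels.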
Jan-31-2025
- Country:
- Africa
- Kenya (0.04)
- South Africa (0.04)
- Zimbabwe (0.04)
- Asia
- China (0.04)
- India > Tamil Nadu
- Chennai (0.04)
- Japan > Honshū
- Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Middle East > Iran
- Tehran Province > Tehran (0.04)
- Russia (0.04)
- Singapore (0.04)
- Thailand > Bangkok
- Bangkok (0.04)
- Europe
- Portugal (0.04)
- Czechia > Prague (0.04)
- United Kingdom
- England
- Greater London > London (0.04)
- Greater Manchester > Manchester (0.04)
- Lancashire > Blackpool (0.04)
- Merseyside > Liverpool (0.04)
- North Sea > Central North Sea (0.04)
- Scotland (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Sweden (0.04)
- Belgium (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Russia (0.04)
- Italy (0.04)
- France (0.04)
- Norway (0.04)
- Denmark (0.04)
- Germany
- Bavaria > Upper Bavaria
- Munich (0.04)
- North Rhine-Westphalia > Cologne Region
- Bonn (0.04)
- Austria > Salzburg
- Salzburg (0.04)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- North America
- Canada
- Manitoba (0.04)
- Ontario
- Middlesex County > London (0.04)
- Toronto (0.04)
- Dominican Republic (0.04)
- Mexico (0.04)
- United States
- Pennsylvania > Allegheny County
- Pittsburgh (0.04)
- New York > New York County
- New York City (0.04)
- Washington > King County
- Seattle (0.14)
- Illinois (0.04)
- Virginia (0.04)
- Florida > Miami-Dade County
- Miami (0.04)
- New Jersey (0.04)
- Iowa (0.04)
- Tennessee (0.04)
- California > Los Angeles County
- Los Angeles > Hollywood > West Hollywood (0.04)
- Oklahoma > Payne County
- Cushing (0.04)
- Utah (0.04)
- Texas > Travis County
- Austin (0.04)
- Oceania
- Australia > Victoria
- Melbourne (0.04)
- New Zealand (0.04)
- Genre:
- Research Report > New Finding (0.87)
- Industry:
- Energy (0.94)
- Government (1.00)
- Information Technology > Security & Privacy (1.00)
- Leisure & Entertainment > Sports
- Soccer (1.00)
- Media > Music (1.00)
- Technology: