Empirical Calibration and Metric Differential Privacy in Language Models
Faustini, Pedro, Fernandes, Natasha, McIver, Annabelle, Dras, Mark
arXiv.org Artificial Intelligence
NLP models trained with differential privacy (DP) usually adopt the DP-SGD framework, and privacy guarantees are often reported in terms of the privacy budget $\epsilon$. However, $\epsilon$ has no intrinsic meaning, and its values are generally not comparable across variants of the framework. Work in image processing has therefore explored how to empirically calibrate noise across frameworks using Membership Inference Attacks (MIAs), but this kind of calibration has not been established for NLP. In this paper, we show that MIAs offer little help in calibrating privacy, whereas reconstruction attacks are more useful. As a use case, we define a novel kind of directional privacy based on the von Mises-Fisher (VMF) distribution, a metric DP mechanism that perturbs angular distance rather than adding (isotropic) Gaussian noise, and apply this to NLP architectures. We show that, even though formal guarantees are incomparable, empirical privacy calibration reveals that each mechanism has different areas of strength with respect to utility-privacy trade-offs.
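The contrast the abstract draws between isotropic Gaussian noise and a directional VMF perturbation can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`sample_vmf`, `gaussian_perturb`, `vmf_perturb`, `cosine`) and the parameters `clip_norm`, `sigma`, and `kappa` are hypothetical, the VMF sampler is the standard rejection scheme usually attributed to Wood (1994), and the sketch assumes that the directional mechanism releases only a unit-norm direction. How the paper scales or clips gradients is not stated in the abstract.

```python
import numpy as np


def sample_vmf(mu, kappa, rng):
    """Draw one unit vector from vMF(mu, kappa) by rejection sampling (Wood, 1994)."""
    mu = np.asarray(mu, dtype=float)
    mu = mu / np.linalg.norm(mu)
    d = mu.shape[0]  # ambient dimension, assumed >= 2
    # Envelope parameters for the cosine w between the sample and mu.
    b = (d - 1) / (2 * kappa + np.sqrt(4 * kappa**2 + (d - 1) ** 2))
    x0 = (1 - b) / (1 + b)
    c = kappa * x0 + (d - 1) * np.log(1 - x0**2)
    while True:
        z = rng.beta((d - 1) / 2, (d - 1) / 2)
        w = (1 - (1 + b) * z) / (1 - (1 - b) * z)
        u = rng.uniform()
        if kappa * w + (d - 1) * np.log(1 - x0 * w) - c >= np.log(u):
            break
    # Uniform random direction in the tangent space orthogonal to mu.
    v = rng.normal(size=d)
    v -= v.dot(mu) * mu
    v /= np.linalg.norm(v)
    return w * mu + np.sqrt(max(0.0, 1 - w**2)) * v


def gaussian_perturb(grad, clip_norm, sigma, rng):
    """DP-SGD-style step: clip the L2 norm, then add isotropic Gaussian noise."""
    grad = np.asarray(grad, dtype=float)
    scale = min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
    noise = rng.normal(0.0, sigma * clip_norm, size=grad.shape)
    return grad * scale + noise


def vmf_perturb(grad, kappa, rng):
    """Directional step (assumption): keep only the direction, resample it from VMF."""
    return sample_vmf(grad, kappa, rng)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.normal(size=50)  # stand-in for a per-example gradient
    print("gaussian cosine:", cosine(g, gaussian_perturb(g, 1.0, 1.0, rng)))
    print("vmf cosine:     ", cosine(g, vmf_perturb(g, 100.0, rng)))
```

In this sketch a larger `kappa` concentrates samples around the original direction, while a larger `sigma` spreads the Gaussian output in every direction equally. The two parameters live on different scales, which is one way to see why the abstract argues for empirical calibration rather than comparing formal budgets directly.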
Mar-17-2025
- Country:
- South America > Brazil
- Rio Grande do Sul (0.04)
- Oceania > Australia
- New South Wales > Sydney (0.04)
- North America
- Dominican Republic (0.04)
- United States
- Washington > King County
- Seattle (0.14)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- Oregon > Multnomah County
- Portland (0.04)
- New York > New York County
- New York City (0.05)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Indiana > Monroe County
- Bloomington (0.04)
- Florida > Miami-Dade County
- Miami (0.04)
- California
- Santa Barbara County > Santa Barbara (0.04)
- Orange County > Anaheim (0.04)
- Canada
- Ontario > Toronto (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Europe
- Czechia > Prague (0.04)
- France (0.04)
- United Kingdom > England
- Surrey > Guildford (0.04)
- Oxfordshire > Oxford (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Faroe Islands > Streymoy
- Tórshavn (0.04)
- Estonia > Tartu County
- Tartu (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- Nepal (0.04)
- Middle East
- Jordan (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Israel > Haifa District
- Haifa (0.04)
- China > Beijing
- Beijing (0.04)
- Africa > Middle East
- Egypt > Cairo Governorate > Cairo (0.04)
- Genre:
- Research Report > New Finding (0.67)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Government (0.67)
- Technology: