A Critical Review on the Use (and Misuse) of Differential Privacy in Machine Learning
Alberto Blanco-Justicia, David Sanchez, Josep Domingo-Ferrer, Krishnamurty Muralidhar
arXiv.org Artificial Intelligence
As long ago as the 1970s, official statisticians [Dalenius(1977)] began to worry about the potential disclosure of private information on people or companies linked to the publication of statistical outputs. This ushered in the statistical disclosure control (SDC) discipline [Hundepool et al.(2012)], whose goal is to provide methods for data anonymization. Also related to SDC is randomized response (RR, [Warner(1965)]), which was designed in the 1960s as a mechanism to eliminate evasive-answer bias in surveys and turned out to be very useful for anonymization. The usual approach to anonymization in official statistics is utility-first: anonymization parameters are iteratively tried until a parameter choice is found that preserves sufficient analytical utility while reducing the risk of disclosing confidential information on specific respondents below a certain threshold. Both utility and privacy are evaluated ex post, by measuring the information loss and the probability of re-identification of the anonymized outputs, respectively.
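The randomized response mechanism mentioned above can be illustrated with a minimal sketch. The version below is the forced-response variant (an assumption; Warner's original design uses a slightly different randomization), in which each respondent answers truthfully with probability `p` and otherwise gives a uniformly random answer, so individual answers are deniable while the population proportion remains estimable:

```python
import random

def randomized_response(true_answer: bool, p: float = 0.75) -> bool:
    """Forced-response variant of randomized response (illustrative sketch).

    With probability p, report the true answer; otherwise report a
    uniformly random answer. No individual response reveals the truth
    with certainty, which provides plausible deniability.
    """
    if random.random() < p:
        return true_answer
    return random.random() < 0.5

def estimate_true_proportion(responses, p: float = 0.75) -> float:
    """Debias the observed 'yes' rate.

    E[observed] = p * pi + (1 - p) / 2, where pi is the true
    proportion, so pi can be recovered by inverting this relation.
    """
    observed = sum(responses) / len(responses)
    return (observed - (1 - p) / 2) / p

if __name__ == "__main__":
    random.seed(0)
    p = 0.75
    # Hypothetical population: 30% hold the sensitive attribute.
    truth = [i < 3000 for i in range(10000)]
    responses = [randomized_response(t, p) for t in truth]
    print(estimate_true_proportion(responses, p))  # close to 0.30
```

Notably, this mechanism predates differential privacy by decades, yet it satisfies local differential privacy with a parameter determined by `p`, which is one reason RR remains relevant to the discussion in this review.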
Jul-5-2022