A Critical Review on the Use (and Misuse) of Differential Privacy in Machine Learning

Alberto Blanco-Justicia, David Sanchez, Josep Domingo-Ferrer, Krishnamurty Muralidhar

arXiv.org Artificial Intelligence 

As long ago as the 1970s, official statisticians [Dalenius(1977)] began to worry about the potential disclosure of private information on people or companies linked to the publication of statistical outputs. This ushered in the statistical disclosure control (SDC) discipline [Hundepool et al.(2012)], whose goal is to provide methods for data anonymization. Also related to SDC is randomized response (RR, [Warner(1965)]), which was designed in the 1960s as a mechanism to eliminate evasive answer bias in surveys and turned out to be very useful for anonymization. The usual approach to anonymization in official statistics is utility-first: anonymization parameters are iteratively tried until a parameter choice is found that preserves sufficient analytical utility while reducing the risk of disclosing confidential information on specific respondents below a certain threshold. Both utility and privacy are evaluated ex post by measuring, respectively, the information loss and the probability of re-identification in the anonymized outputs.
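To make the RR idea concrete, here is a minimal sketch of Warner's randomized response (not code from the paper): each respondent reports their true answer with probability p and its negation otherwise, and the analyst recovers an unbiased estimate of the true "yes" proportion from the perturbed reports. The parameter values and function names below are illustrative assumptions.

```python
import random

def randomized_response(true_answer: bool, p: float, rng: random.Random) -> bool:
    """Warner's RR: report the true answer with probability p,
    and its negation with probability 1 - p."""
    return true_answer if rng.random() < p else not true_answer

def estimate_proportion(reports, p: float) -> float:
    """Unbiased estimator of the true 'yes' proportion pi:
    pi_hat = (lambda_hat - (1 - p)) / (2p - 1),
    where lambda_hat is the observed 'yes' rate among reports."""
    lam = sum(reports) / len(reports)
    return (lam - (1 - p)) / (2 * p - 1)

rng = random.Random(42)
# Illustrative population with true prevalence ~0.3
truth = [rng.random() < 0.3 for _ in range(100_000)]
reports = [randomized_response(t, p=0.75, rng=rng) for t in truth]
print(estimate_proportion(reports, p=0.75))  # close to 0.3
```

Because each individual's report is random, any single "yes" is plausibly deniable, yet aggregate statistics remain estimable; this is the property that later made RR a building block for local differential privacy.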
