Defending Against Authorship Identification Attacks
arXiv.org Artificial Intelligence
Authorship identification has proven unsettlingly effective at inferring the identity of the author of an unsigned document, even when sensitive personal information has been carefully omitted. In the digital era, individuals leave a lasting digital footprint through their written content, whether it is posted on social media, stored on their employer's computers, or located elsewhere. When individuals need to communicate publicly yet wish to remain anonymous, little is available to protect them from unwanted authorship identification. This unprecedented threat to privacy is evident in scenarios such as whistle-blowing. Proposed defenses against authorship identification attacks primarily aim to obfuscate an author's writing style, making it unlinkable to their pre-existing writing while preserving the original meaning and grammatical integrity. This work offers a comprehensive review of advances in this research area over more than two decades. It emphasizes the methodological frameworks of modification-based and generation-based strategies devised to evade authorship identification attacks, and highlights joint efforts from the differential privacy community. Limitations of current research are discussed, with a spotlight on open challenges and potential research avenues.
Oct-2-2023
- Country:
- Oceania
- New Zealand (0.04)
- Australia > New South Wales
- Sydney (0.04)
- North America
- Dominican Republic (0.04)
- United States
- Washington > King County
- Seattle (0.14)
- Oregon > Multnomah County
- Portland (0.04)
- New York
- New York County > New York City (0.04)
- Tompkins County > Ithaca (0.04)
- New Mexico
- Santa Fe County > Santa Fe (0.04)
- Doña Ana County > Las Cruces (0.04)
- Indiana > Monroe County
- Bloomington (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- California
- San Diego County > San Diego (0.04)
- Los Angeles County > Pasadena (0.04)
- Canada > Ontario
- Toronto (0.04)
- Europe
- Austria > Vienna (0.14)
- Sweden > Stockholm
- Stockholm (0.04)
- United Kingdom > England
- Oxfordshire > Oxford (0.04)
- Cambridgeshire > Cambridge (0.04)
- Greece > Attica
- Athens (0.04)
- France
- Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
- Marseille (0.04)
- Hauts-de-France > Nord
- Lille (0.04)
- Italy
- Middle East
- Republic of Türkiye > Istanbul Province
- Istanbul (0.04)
- Cyprus > Limassol
- Limassol (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Portugal
- Asia
- Middle East
- Jordan (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Republic of Türkiye > Istanbul Province
- Istanbul (0.04)
- India
- Maharashtra > Mumbai (0.04)
- Bihar > Patna (0.04)
- China > Shanghai
- Shanghai (0.04)
- Genre:
- Overview (1.00)
- Research Report
- New Finding (1.00)
- Experimental Study (0.67)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Law > Civil Rights & Constitutional Law (0.92)
- Media (0.92)
- Government > Regional Government
- Technology:
- Information Technology
- Security & Privacy (1.00)
- Information Management (1.00)
- Data Science > Data Mining (1.00)
- Communications > Social Media (1.00)
- Artificial Intelligence
- Representation & Reasoning (1.00)
- Natural Language
- Text Processing (1.00)
- Machine Translation (1.00)
- Large Language Model (0.93)
- Machine Learning
- Neural Networks > Deep Learning (1.00)
- Statistical Learning > Support Vector Machines (0.67)