Stronger Re-identification Attacks through Reasoning and Aggregation

Charpentier, Lucas Georges Gabriel; Lison, Pierre

Oct-13-2025–arXiv.org Artificial Intelligence

Text de-identification techniques are often used to mask personally identifiable information (PII) in documents. Their ability to conceal the identity of the individuals mentioned in a text is, however, hard to measure. Recent work has shown that the robustness of de-identification methods can be assessed by attempting the reverse process of _re-identification_, in which an automated adversary uses its background knowledge to uncover the PII that has been masked. This paper presents two complementary strategies to build stronger re-identification attacks. We first show that (1) the _order_ in which the PII spans are re-identified matters, and that aggregating predictions across multiple orderings leads to improved results. We also find that (2) reasoning models can boost re-identification performance, especially when the adversary is assumed to have access to extensive background knowledge.
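The abstract does not say how predictions from different orderings are combined, so the following is only a minimal sketch of the idea under stated assumptions: spans are re-identified one at a time, each guess can see the spans already resolved, and the final answer per span is a simple majority vote over several random orderings. The names `reidentify_in_order`, `aggregate_over_orderings`, and the `predict` callback are hypothetical and stand in for whatever adversary model is used; they are not taken from the paper.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Sequence

# Hypothetical adversary interface (an assumption, not the paper's API):
# given the id of a masked span and the spans resolved so far, return a guess.
Predictor = Callable[[str, Dict[str, str]], str]


def reidentify_in_order(span_ids: Sequence[str], predict: Predictor) -> Dict[str, str]:
    """Re-identify spans sequentially; later guesses see earlier ones."""
    resolved: Dict[str, str] = {}
    for span_id in span_ids:
        # Pass a copy so the predictor cannot mutate the running state.
        resolved[span_id] = predict(span_id, dict(resolved))
    return resolved


def aggregate_over_orderings(
    span_ids: Sequence[str],
    predict: Predictor,
    n_orderings: int = 5,
    seed: int = 0,
) -> Dict[str, str]:
    """Majority vote per span over several random orderings (assumed scheme)."""
    rng = random.Random(seed)
    votes: Dict[str, Counter] = {s: Counter() for s in span_ids}
    for _ in range(n_orderings):
        order: List[str] = list(span_ids)
        rng.shuffle(order)
        for span_id, guess in reidentify_in_order(order, predict).items():
            votes[span_id][guess] += 1
    # Keep the most frequent guess for each span.
    return {s: counter.most_common(1)[0][0] for s, counter in votes.items()}
```

With a toy predictor that can only name a person once the organisation has been resolved, `reidentify_in_order` returns different guesses for the two possible orders, which is the effect the aggregation step is meant to smooth out. Other aggregation rules (for example, weighting guesses by model confidence) would fit the same structure.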

large language model, machine learning, natural language, (18 more...)

Country:
- Asia > Middle East
  - Israel > Central District
    - Ramla (0.04)
  - UAE > Abu Dhabi Emirate
    - Abu Dhabi (0.04)
- Europe
  - Austria > Vienna (0.14)
  - Norway > Eastern Norway
    - Oslo (0.04)
- North America
  - Montserrat (0.05)
  - United States (0.05)
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre:
- Research Report (0.51)

Industry:
- Information Technology > Security & Privacy (0.94)
- Law (0.94)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (0.46)
    - Natural Language > Large Language Model (0.95)
    - Representation & Reasoning (0.68)
  - Security & Privacy (0.94)
