Automated Detection of Pre-training Text in Black-box LLMs

Hu, Ruihan, Shang, Yu-Ming, Peng, Jiankun, Luo, Wei, Wang, Yazhe, Zhang, Xi

Jun-25-2025–arXiv.org Artificial Intelligence

Most existing methods rely on the LLM's hidden information (e.g., model parameters or token probabilities), making them ineffective in the black-box setting, where only input and output texts are accessible. Although some methods have been proposed for the black-box setting, they rely on massive manual efforts such as designing complicated questions or instructions. To address these issues, we propose V eilProbe, the first framework for automatically detecting LLMs' pre-training texts in a black-box setting without human intervention. V eilProbe utilizes a sequence-to-sequence mapping model to infer the latent mapping feature between the input text and the corresponding output suffix generated by the LLM. Then it performs the key token perturbations to obtain more distinguishable membership features. Additionally, considering real-world scenarios where the ground-truth training text samples are limited, a prototype-based membership classifier is introduced to alleviate the overfitting issue. Extensive evaluations on three widely used datasets demonstrate that our framework is effective and superior in the black-box setting.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

Jun-25-2025

arXiv.org PDF

Add feedback

Country:
- Asia (0.28)

Genre:
- Research Report (0.82)

Industry:
- Transportation > Air (1.00)
- Law (0.93)
- Information Technology > Security & Privacy (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Performance Analysis > Accuracy (0.94)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found