SeqXGPT: Sentence-Level AI-Generated Text Detection
Wang, Pengyu, Li, Linyang, Ren, Ke, Jiang, Botian, Zhang, Dong, Qiu, Xipeng
arXiv.org Artificial Intelligence
Widely applied large language models (LLMs) can generate human-like content, raising concerns about the abuse of LLMs. Therefore, it is important to build strong AI-generated text (AIGT) detectors. Current works only consider document-level AIGT detection. In this paper, we therefore first introduce a sentence-level detection challenge by synthesizing a dataset of documents polished with LLMs, that is, documents containing both sentences written by humans and sentences modified by LLMs. Then we propose Sequence X (Check) GPT (SeqXGPT), a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection. These features are composed like waves in speech processing and cannot be studied by LLMs. Therefore, we build SeqXGPT based on convolution and self-attention networks. We test it in both sentence-level and document-level detection challenges. Experimental results show that previous methods struggle to solve sentence-level AIGT detection, while our method not only significantly surpasses baseline methods in both sentence-level and document-level detection challenges but also exhibits strong generalization capabilities.
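As a rough illustration of the "log probability lists" the abstract refers to, the sketch below computes the log probability of each observed token from per-position vocabulary scores and then slices the resulting list by sentence. It is not the authors' implementation: the toy logits stand in for the output of a white-box LLM, the function names `token_log_probs` and `sentence_features` are hypothetical, and the convolution and self-attention classifier that would consume these features is omitted.

```python
import math


def token_log_probs(logits, token_ids):
    """Return log p(token_t | prefix) for every position t.

    logits[t] is a list of unnormalised scores over the vocabulary at
    position t; token_ids[t] is the token actually observed there.
    (Assumption: in practice these scores would come from a white-box LLM.)
    """
    log_probs = []
    for scores, tok in zip(logits, token_ids):
        # Numerically stable log-sum-exp over the vocabulary.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        log_probs.append(scores[tok] - log_z)
    return log_probs


def sentence_features(log_probs, sentence_spans):
    """Slice the document-level list into one list per sentence.

    sentence_spans holds (start, end) token indices, end exclusive.
    """
    return [log_probs[start:end] for start, end in sentence_spans]


# Toy document: 4 tokens, vocabulary of size 3, two sentences of 2 tokens.
logits = [
    [2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
    [1.0, 1.0, 1.0],
]
token_ids = [0, 2, 1, 0]

lp = token_log_probs(logits, token_ids)
feats = sentence_features(lp, [(0, 2), (2, 4)])
```

Each entry of `feats` is a variable-length sequence of non-positive numbers, one per token, which is the kind of wave-like signal a sequence model could label sentence by sentence.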
Dec-14-2023
- Country:
- Asia > China (0.14)
- Europe
- North America > United States
- Louisiana (0.14)
- Genre:
- Research Report (1.00)
- Technology: