DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains

Chen, Zhihui; He, Kai; Huang, Yucheng; Zhu, Yunxiao; Feng, Mengling

Jun-10-2025, arXiv.org Artificial Intelligence

Detecting LLM-generated text in specialized, high-stakes domains such as medicine and law is crucial for combating misinformation and ensuring authenticity. However, current zero-shot detectors, while effective on general text, often fail on specialized content because of domain shift. We provide a theoretical analysis showing that this failure is fundamentally linked to the KL divergence between the human, detector, and source text distributions. To address this, we propose DivScore, a zero-shot detection framework that uses normalized entropy-based scoring and domain knowledge distillation to robustly identify LLM-generated text in specialized domains. We also release a domain-specific benchmark for LLM-generated text detection in the medical and legal domains. Experiments on our benchmark show that DivScore consistently outperforms state-of-the-art detectors, with 14.4% higher AUROC and 64.0% higher recall at a 0.1% false positive rate threshold. In adversarial settings, DivScore is more robust than the baselines, achieving on average a 22.8% advantage in AUROC and a 29.5% advantage in recall. Code and data are publicly available.
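The two metrics reported in the abstract, AUROC and recall at a fixed false positive rate, can be illustrated with a minimal sketch. This is not the DivScore implementation or the authors' evaluation code. The function names `auroc` and `recall_at_fpr` are hypothetical, and the sketch assumes that a higher detector score means "more likely LLM-generated" and that a text is flagged when its score is strictly above the threshold.

```python
import math


def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation of AUROC).
    pos_scores: scores of LLM-generated texts (assumed: higher = more suspicious).
    neg_scores: scores of human-written texts.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def recall_at_fpr(pos_scores, neg_scores, max_fpr):
    """Recall on positives at the loosest threshold whose FPR is <= max_fpr."""
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    # Number of human-written texts we are allowed to flag by mistake.
    allowed_fp = math.floor(max_fpr * len(neg_scores))
    if allowed_fp >= len(neg_scores):
        return 1.0  # every text may be flagged, so recall is trivially 1
    # Flag only scores strictly above the (allowed_fp + 1)-th largest negative.
    threshold = sorted(neg_scores, reverse=True)[allowed_fp]
    flagged = sum(1 for p in pos_scores if p > threshold)
    return flagged / len(pos_scores)


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    human = [0.1, 0.2, 0.3, 0.4]
    generated = [0.35, 0.5, 0.6, 0.05]
    print(auroc(generated, human))                # 0.6875
    print(recall_at_fpr(generated, human, 0.25))  # 0.75
    print(recall_at_fpr(generated, human, 0.0))   # 0.5
```

At a strict operating point such as a 0.1% false positive rate, at least 1,000 human-written samples are needed before even one false positive is allowed, which is why recall at low FPR is a much harsher measure than AUROC.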

large language model, machine learning, natural language

Country:
- Oceania > Australia (0.68)
- North America > United States (0.67)
- Asia > Middle East
  - UAE (0.28)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Law (1.00)
- Government (1.00)
- Energy (0.84)
- Health & Medicine
  - Health Care Providers & Services (0.68)
  - Therapeutic Area
    - Endocrinology > Diabetes (1.00)
    - Infections and Infectious Diseases (0.67)
    - Immunology (0.67)
    - Obstetrics/Gynecology (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Performance Analysis > Accuracy (1.00)
    - Neural Networks > Deep Learning (1.00)
