LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts
Gameiro, Henrique Da Silva; Kucharavy, Andrei; Dolamic, Ljiljana
arXiv.org Artificial Intelligence
With the emergence of widely available, powerful large language models (LLMs), LLM-generated disinformation has become a major concern. LLM detectors have long been touted as a solution, but their effectiveness in the real world remains unproven. In this paper, we focus on an important setting in information operations: short news-like posts generated by moderately sophisticated attackers. We demonstrate that existing LLM detectors, whether zero-shot or purpose-trained, are not ready for real-world use in that setting. All tested zero-shot detectors perform inconsistently with prior benchmarks and are highly vulnerable to an increase in sampling temperature, a trivial attack absent from recent benchmarks. A purpose-trained detector that generalizes across LLMs and unseen attacks can be developed, but it fails to generalize to new human-written texts. We argue that the former indicates that domain-specific benchmarking is needed, while the latter suggests a trade-off between resilience to adversarial evasion and overfitting to the reference human text; both need to be evaluated in benchmarks and are currently absent from them. We believe this calls for a reconsideration of current LLM detector benchmarking approaches, and we provide a dynamically extensible benchmark to enable it (https://github.com/Reliable-Information-Lab-HEVS/dynamic_llm_detector_benchmark).
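To make the temperature attack mentioned in the abstract concrete, the following is a minimal sketch, not the authors' benchmark code. It assumes a toy five-token vocabulary with made-up logits (`toy_logits` is hypothetical) and a stand-in "scorer" that evaluates text at temperature 1.0, as many likelihood-based zero-shot detectors effectively do. It shows how raising the sampling temperature flattens the token distribution, so sampled text looks less likely to the scorer.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing by the sampling temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def expected_log_likelihood(sampling_probs, scoring_probs):
    """Average log-probability the scorer assigns to tokens drawn from sampling_probs."""
    return sum(q * math.log(p) for q, p in zip(sampling_probs, scoring_probs))


# Hypothetical next-token logits; real models have vocabularies of tens of thousands of tokens.
toy_logits = [4.0, 2.0, 1.0, 0.5, 0.0]

# Assumption: the detector scores text with the model at temperature 1.0.
scorer = softmax_with_temperature(toy_logits, 1.0)

for temperature in (0.7, 1.0, 1.3):
    sampler = softmax_with_temperature(toy_logits, temperature)
    print(
        f"T={temperature}: entropy={entropy(sampler):.3f}, "
        f"expected log-likelihood={expected_log_likelihood(sampler, scorer):.3f}"
    )
```

In this toy setting, a higher temperature gives higher entropy and a lower expected log-likelihood under the scorer. This is one plausible reason a detector tuned on default-temperature generations could be evaded by such a simple change. Real detectors and models are far more complex than this sketch.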
Sep-5-2024
- Country:
  - Asia
    - Middle East
      - Iraq (0.14)
      - Palestine > Gaza Strip
        - Gaza Governorate > Gaza (0.04)
        - Khan Yunis Governorate > Khan Yunis (0.04)
    - Singapore (0.04)
  - Europe
    - Middle East > Malta
      - Eastern Region > Northern Harbour District > St. Julian's (0.04)
    - Romania > Sud - Muntenia Development Region
      - Giurgiu County > Giurgiu (0.04)
    - Switzerland > Vaud
      - Lausanne (0.04)
    - Ukraine (0.04)
  - North America
- Genre:
  - Research Report (0.82)
- Industry:
  - Government
  - Information Technology (1.00)
  - Media > News (1.00)
- Technology: