Who Said That? Benchmarking Social Media AI Detection

Cui, Wanyun, Zhang, Linqiu, Wang, Qianle, Cai, Shuyang

Oct-12-2023, arXiv.org Artificial Intelligence

AI-generated text has proliferated across various online platforms, offering transformative prospects while posing significant risks related to misinformation and manipulation. The SAID benchmark introduced in this paper incorporates real AI-generated text from popular social media platforms such as Zhihu and Quora. Unlike existing benchmarks, SAID deals with content that reflects the sophisticated strategies real AI users on the Internet employ to evade detection or gain visibility, providing a more realistic and challenging evaluation landscape. A notable finding of our study, based on the Zhihu dataset, is that annotators can distinguish between AI-generated and human-generated texts with an average accuracy of 96.5%. Furthermore, we present a new user-oriented AI-text detection challenge that focuses on the practicality and effectiveness of identifying AI-generated text based on user information and multiple responses. The experimental results demonstrate that detection on actual social media platforms is more challenging than traditional simulated AI-text detection, resulting in lower accuracy. On the other hand, user-oriented AI-generated text detection significantly improves detection accuracy.

The advent of AI-generated text has had a profound impact on numerous sectors, including social media platforms. On the positive side, AI-generated responses enable automation, personalization, and scaling of content creation, thereby revolutionizing how information is disseminated and consumed. Addressing the abuse and malicious use of AI has led to significant research efforts in the field of AI-generated text detection. A range of approaches has been explored, including but not limited to machine learning algorithms (Guo et al., 2023; Solaiman et al., 2019), text-based feature analysis (Mitchell et al., 2023; Tulchinskii et al., 2023), and positive-unlabeled techniques (Tian et al., 2023).

large language model, machine learning, natural language

Genre:
- Research Report > New Finding (0.66)

Industry:
- Information Technology > Security & Privacy (0.67)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks (1.00)
    - Natural Language > Chatbot (1.00)
  - Communications > Social Media (1.00)
