GPT-4 is judged more human than humans in displaced and inverted Turing tests

Rathi, Ishika; Taylor, Sydney; Bergen, Benjamin K.; Jones, Cameron R.

Jul-11-2024–arXiv.org Artificial Intelligence

Everyday AI detection requires differentiating between people and AI in informal, online conversations. In many cases, people will not interact directly with AI systems but instead read conversations between AI systems and other people. We measured how well people and large language models can discriminate using two modified versions of the Turing test: inverted and displaced. GPT-3.5, GPT-4, and displaced human adjudicators judged whether an agent was human or AI on the basis of a Turing test transcript. We found that both AI and displaced human judges were less accurate than interactive interrogators, with below ...

Figure 1: A summary of our experimental design. Transcripts were sampled from an interactive Turing test, where a human judge interrogates a witness to determine if they are human or AI. In an inverted Turing test, we present transcripts to AI models, who judge whether the same witnesses are human or AI. In a displaced Turing test, a separate group of human participants read the same transcripts and make this judgement.

accuracy, adjudicator, Turing test

Country:
- North America > United States
  - California > San Diego County > San Diego (0.04)
- Europe > Italy
  - Tuscany > Florence (0.04)

Genre:
- Questionnaire & Opinion Survey (1.00)
- Research Report
  - New Finding (1.00)
  - Experimental Study (0.94)

Industry:
- Health & Medicine (0.68)
- Education > Educational Setting (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Issues > Turing's Test (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
