Does GPT-3 Demonstrate Psychopathy? Evaluating Large Language Models from a Psychological Perspective

Xingxuan Li, Yutong Li, Shafiq Joty, Linlin Liu, Fei Huang, Lin Qiu, Lidong Bing

May 8, 2023, arXiv.org Artificial Intelligence

In this work, we determined whether large language models (LLMs) are psychologically safe. We designed unbiased prompts to systematically evaluate LLMs from a psychological perspective. First, we tested three different LLMs by using two personality tests: Short Dark Triad (SD-3) and Big Five Inventory (BFI). All models scored higher than the human average on SD-3, suggesting a relatively darker personality pattern. Despite being instruction fine-tuned with safety metrics to reduce toxicity, InstructGPT and FLAN-T5 still showed implicit dark personality patterns; both models scored higher than self-supervised GPT-3 on the Machiavellianism and narcissism traits on SD-3. Then, we evaluated the LLMs in the GPT-3 series by using well-being tests to study the impact of fine-tuning with more training data. We observed a continuous increase in the well-being scores of GPT-3 and InstructGPT. Following these observations, we showed that instruction fine-tuning FLAN-T5 with positive answers from BFI could effectively improve the model from a psychological perspective. On the basis of the findings, we recommended the application of more systematic and comprehensive psychological metrics to further evaluate and improve the safety of LLMs.
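As a rough illustration of how questionnaire-based tests such as SD-3 or BFI can be turned into per-trait scores, the sketch below averages Likert-scale answers by trait and flips reverse-keyed items. It is a minimal sketch under stated assumptions, not the authors' evaluation code: the 1-to-5 scale, the function name `trait_scores`, the item-to-trait mapping, the reverse-keyed item, and the answers are all hypothetical placeholders.

```python
from statistics import mean

# Assumed 5-point Likert scale (1 = disagree strongly, 5 = agree strongly).
SCALE_MIN, SCALE_MAX = 1, 5


def trait_scores(answers, item_traits, reversed_items=frozenset()):
    """Average Likert answers per trait, flipping reverse-keyed items.

    answers:        {item_id: integer answer on the Likert scale}
    item_traits:    {item_id: trait name}
    reversed_items: item ids whose scale must be inverted before averaging
    """
    by_trait = {}
    for item_id, raw in answers.items():
        if not SCALE_MIN <= raw <= SCALE_MAX:
            raise ValueError(f"answer {raw!r} for item {item_id} is off the scale")
        # On a 1..5 scale, reverse-keying maps 1 -> 5, 2 -> 4, and so on.
        value = SCALE_MIN + SCALE_MAX - raw if item_id in reversed_items else raw
        by_trait.setdefault(item_traits[item_id], []).append(value)
    return {trait: mean(values) for trait, values in by_trait.items()}


# Made-up example data: four items, two traits, item 4 reverse-keyed.
item_traits = {
    1: "machiavellianism",
    2: "machiavellianism",
    3: "narcissism",
    4: "narcissism",
}
answers = {1: 4, 2: 5, 3: 2, 4: 1}

scores = trait_scores(answers, item_traits, reversed_items={4})
print(scores)
```

In practice, the answers would be parsed from a model's responses to each test statement, and the resulting per-trait means compared against published human averages for the same instrument.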

large language model, machine learning, natural language

Country:
- North America
  - Dominican Republic (0.04)
  - United States > New York
    - New York County > New York City (0.04)

Genre:
- Research Report > New Finding (0.93)

Industry:
- Education (0.93)
- Health & Medicine > Therapeutic Area
  - Psychiatry/Psychology > Personality Disorder > Antisocial Personality Disorder (0.52)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
