Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play
arXiv.org Artificial Intelligence
As Large Language Models (LLMs) become more prevalent, concerns about their safety, ethics, and potential biases have grown. Systematically evaluating LLMs' risk decision-making tendencies and attitudes, particularly in the ethical domain, has become crucial. This study applies the Domain-Specific Risk-Taking (DOSPERT) scale from cognitive science to LLMs and proposes a novel Ethical Decision-Making Risk Attitude Scale (EDRAS) to assess LLMs' ethical risk attitudes in depth. We further propose an approach that integrates risk scales with role-playing to quantitatively evaluate systematic biases in LLMs. Through systematic evaluation and analysis of multiple mainstream LLMs, we assessed the "risk personalities" of LLMs across multiple domains, with a particular focus on the ethical domain, and revealed and quantified LLMs' systematic biases towards different groups. This research helps in understanding LLMs' risk decision-making and ensuring their safe and reliable application. Our approach provides a tool for identifying and mitigating biases, contributing to fairer and more trustworthy AI systems. The code and data are available.
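The abstract's core method pairs a Likert-style risk questionnaire with role-play prompts. A minimal sketch of that idea, assuming illustrative DOSPERT-style item texts, example personas, and a stubbed `query_model` function standing in for an actual LLM call (none of these names come from the paper):

```python
# Sketch (not the authors' released code): administer Likert-scale risk
# items to an LLM under different role-play personas, then aggregate the
# ratings into a per-domain "risk attitude" score. Items, personas, and
# the query_model stub are illustrative assumptions.
from statistics import mean

# A few DOSPERT-style items, each tagged with a risk domain (illustrative).
ITEMS = [
    ("ethical", "Using insider information to help a friend."),
    ("ethical", "Taking credit for a colleague's work."),
    ("financial", "Investing 10% of your income in a speculative stock."),
]

PERSONAS = ["a 30-year-old male engineer", "a 60-year-old female teacher"]

def query_model(persona: str, item: str) -> int:
    """Placeholder for an LLM call. A real implementation would prompt the
    model to role-play `persona`, rate the likelihood of engaging in `item`
    on a 1-7 Likert scale, and parse the numeric answer. Here we return a
    fixed midpoint so the sketch runs without an API."""
    return 4

def risk_profile(persona: str) -> dict:
    """Mean Likert rating per risk domain for one role-played persona."""
    by_domain: dict = {}
    for domain, item in ITEMS:
        by_domain.setdefault(domain, []).append(query_model(persona, item))
    return {domain: mean(vals) for domain, vals in by_domain.items()}

# Comparing profiles across personas quantifies systematic bias: large
# per-domain gaps mean the model attributes different risk attitudes to
# different groups.
profiles = {p: risk_profile(p) for p in PERSONAS}
for persona, profile in profiles.items():
    print(persona, profile)
```

With a real model behind `query_model`, the per-persona, per-domain gaps in these scores would be the quantity the paper's bias analysis compares.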
Oct-26-2024
- Country:
  - Asia > China > Guangdong Province > Guangzhou (0.04)
  - Asia > Middle East > Jordan (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Health & Medicine (0.94)
- Law (0.68)