The Greatest Good Benchmark: Measuring LLMs' Alignment with Utilitarian Moral Dilemmas

Marraffini, Giovanni Franco Gabriel, Cotton, Andrés, Hsueh, Noe Fabian, Fridman, Axel, Wisznia, Juan, Del Corro, Luciano

Mar-25-2025–arXiv.org Artificial Intelligence

The question of how to make decisions that maximise the well-being of all persons is highly relevant to designing language models that are beneficial to humanity and free from harm. We introduce the Greatest Good Benchmark to evaluate the moral judgments of LLMs using utilitarian dilemmas. Our analysis across 15 diverse LLMs reveals consistently encoded moral preferences that diverge from both established moral theories and the moral standards of the lay population. Most LLMs show a marked preference for impartial beneficence and a rejection of instrumental harm. These findings showcase the 'artificial moral compass' of LLMs, offering insights into their moral alignment.

large language model, machine learning, natural language

Country:
- Africa > Middle East (0.04)
- South America > Argentina
  - Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
- North America > Canada
  - Ontario > Toronto (0.04)
- Europe
  - Middle East > Malta
    - Eastern Region > Northern Harbour District > St. Julian's (0.04)
  - Latvia > Lubāna Municipality
    - Lubāna (0.04)
- Asia
  - Singapore (0.04)
  - Middle East (0.04)

Genre:
- Questionnaire & Opinion Survey (1.00)
- Research Report
  - New Finding (1.00)
  - Experimental Study (0.70)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.70)
