From Stability to Inconsistency: A Study of Moral Preferences in LLMs

Jotautaite, Monika, Phuong, Mary, Mangat, Chatrik Singh, Martinez, Maria Angelica

Apr-10-2025–arXiv.org Artificial Intelligence

As large language models (LLMs) are increasingly integrated into our daily lives, it becomes crucial to understand their implicit biases and moral tendencies. To address this, we introduce a Moral Foundations LLM dataset (MFD-LLM) grounded in Moral Foundations Theory, which conceptualizes human morality through six core foundations. We propose a novel evaluation method that captures the full spectrum of LLMs' revealed moral preferences by having the models answer a range of real-world moral dilemmas. Our findings reveal that state-of-the-art models have remarkably homogeneous value preferences, yet demonstrate a lack of consistency.

large language model, machine learning, natural language

Genre:
- Research Report > New Finding (0.88)

Industry:
- Information Technology > Security & Privacy (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.71)
