From Stability to Inconsistency: A Study of Moral Preferences in LLMs
Monika Jotautaite, Mary Phuong, Chatrik Singh Mangat, Maria Angelica Martinez
arXiv.org Artificial Intelligence
As large language models (LLMs) become increasingly integrated into our daily lives, it is crucial to understand their implicit biases and moral tendencies. To address this, we introduce the Moral Foundations LLM dataset (MFD-LLM), grounded in Moral Foundations Theory, which conceptualizes human morality through six core foundations. We propose a novel evaluation method that captures the full spectrum of LLMs' revealed moral preferences by having them answer a range of real-world moral dilemmas. Our findings reveal that state-of-the-art models hold remarkably homogeneous value preferences, yet demonstrate a lack of consistency.
Apr-10-2025
- Genre:
- Research Report > New Finding (0.88)
- Industry:
- Information Technology > Security & Privacy (0.93)