When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

Jan-18-2025, 16:07:57 GMT–Neural Information Processing Systems

AI systems are becoming increasingly intertwined with human life. To collaborate effectively with humans and ensure safety, AI systems need to be able to understand, interpret, and predict human moral judgments and decisions. Human moral judgments are often guided by rules, but not always. A central challenge for AI safety is capturing the flexibility of the human moral mind: the ability to determine when a rule should be broken, especially in novel or unusual situations. In this paper, we present a novel challenge set, moral exception question answering (MoralExceptQA), consisting of cases that involve potentially permissible moral exceptions and inspired by recent moral psychology studies.

exploring language model, human moral judgment, make exception

Technology:
- Information Technology > Artificial Intelligence > Natural Language (1.00)