ethical concept
EALM: Introducing Multidimensional Ethical Alignment in Conversational Information Retrieval
Yu, Yiyao; Wang, Junjie; Zhang, Yuxiang; Zhang, Lin; Yang, Yujiu; Sakai, Tetsuya
Artificial intelligence (AI) technologies should adhere to human norms to better serve our society and avoid disseminating harmful or misleading information, particularly in Conversational Information Retrieval (CIR). Previous work, including approaches and datasets, has not always been successful or sufficiently robust in taking human norms into consideration. To this end, we introduce a workflow that integrates ethical alignment, with an initial ethical judgment stage for efficient data screening. To address the need for ethical judgment in CIR, we present the QA-ETHICS dataset, adapted from the ETHICS benchmark, which serves as an evaluation tool by unifying scenarios and label meanings. However, each scenario in QA-ETHICS considers only one ethical concept. Therefore, we introduce the MP-ETHICS dataset to evaluate a scenario under multiple ethical concepts, such as justice and deontology. In addition, we suggest a new approach that achieves top performance in both binary and multi-label ethical judgment tasks. Our research provides a practical method for introducing ethical alignment into the CIR workflow. The data and code are available at https://github.com/wanng-ide/ealm.
What Does It Mean to Align AI With Human Values?
Many years ago, I learned to program on an old Symbolics Lisp Machine. The operating system had a built-in command spelled "DWIM," short for "Do What I Mean." If I typed a command and got an error, I could type "DWIM," and the machine would try to figure out what I meant to do. A surprising fraction of the time, it actually worked. The DWIM command was a microcosm of the more modern problem of "AI alignment": We humans are prone to giving machines ambiguous or mistaken instructions, and we want them to do what we mean, not necessarily what we say.