An Evaluation of GPT-4 on the ETHICS Dataset

Rodionov, Sergey; Goertzel, Zarathustra Amadeus; Goertzel, Ben

Sep-19-2023–arXiv.org Artificial Intelligence

The ETHICS dataset consists of five sub-datasets covering different fields of ethics: Justice, Deontology, Virtue Ethics, Utilitarianism, and Commonsense Ethics. The moral judgments were collected via Amazon Mechanical Turk. Please see Hendrycks et al.'s article for more details and examples. GPT-4's performance is much better than that of previous models, which suggests that learning to work with common human values is not the hard problem for AI ethics. We found that two changes each significantly improved performance: simple prompt refinements that define the context of the moral judgments, and using an embedding to select similar examples from the training set. This approach is similar to the "SimPrompting" experiments with GPT-3 [Albrecht et al., 2022].
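The idea of selecting similar training examples by embedding and placing them in the prompt can be illustrated with a minimal sketch. This is not the authors' code: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the `TRAIN` examples and their labels are invented for illustration rather than taken from ETHICS, and the prompt wording in `build_prompt` is a hypothetical example of a context-setting instruction.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_similar(query, train, k=2):
    """Return the k (text, label) training pairs most similar to the query."""
    q = embed(query)
    ranked = sorted(train, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(query, train, k=2):
    """Assemble a few-shot prompt: context line, similar examples, then the query."""
    # Hypothetical context-setting instruction, not the wording used in the paper.
    lines = [
        "Judge each scenario as acceptable or unacceptable "
        "according to common human moral standards.",
        "",
    ]
    for text, label in select_similar(query, train, k):
        lines.append(f"Scenario: {text}")
        lines.append(f"Judgment: {label}")
        lines.append("")
    lines.append(f"Scenario: {query}")
    lines.append("Judgment:")
    return "\n".join(lines)


# Invented examples for illustration only; not drawn from the ETHICS dataset.
TRAIN = [
    ("I returned the wallet I found to its owner.", "acceptable"),
    ("I kept the wallet I found on the street.", "unacceptable"),
    ("I helped my neighbor carry her groceries.", "acceptable"),
    ("I lied to my teacher about my homework.", "unacceptable"),
]

if __name__ == "__main__":
    print(build_prompt("I found a wallet and kept the money.", TRAIN, k=2))
```

In practice the toy `embed` would be replaced by a learned text-embedding model, and the resulting prompt would be sent to the language model being evaluated. The ranking and prompt-assembly steps stay the same.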


