What Would Jiminy Cricket Do? Towards Agents That Behave Morally
Dan Hendrycks, Mantas Mazeika, Andy Zou, Sahil Patel, Christine Zhu, Jesus Navarro, Dawn Song, Bo Li, Jacob Steinhardt
arXiv.org Artificial Intelligence
When making everyday decisions, people are guided by their conscience, an internal sense of right and wrong. By contrast, artificial agents are not currently endowed with a moral sense. As a consequence, they may unknowingly act immorally, especially when trained on environments that disregard moral concerns, such as violent video games. With the advent of generally capable agents that pretrain on many environments, it will become necessary to mitigate inherited biases from such environments that teach immoral behavior. To facilitate the development of agents that avoid causing wanton harm, we introduce Jiminy Cricket, an environment suite of 25 text-based adventure games with thousands of diverse, morally salient scenarios. By annotating every possible game state, the Jiminy Cricket environments robustly evaluate whether agents can act morally while maximizing reward. Using models with commonsense moral knowledge, we create an elementary artificial conscience that assesses and guides agents. In extensive experiments, we find that the artificial conscience approach can steer agents towards moral behavior without sacrificing performance.
Oct-25-2021
- Country:
  - Europe
    - Germany > Berlin (0.04)
    - United Kingdom > England
      - Oxfordshire > Oxford (0.04)
  - North America > United States
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - New Mexico (0.04)
- Genre:
  - Research Report (0.63)
- Industry:
  - Law (1.00)
  - Leisure & Entertainment > Games
    - Computer Games (0.86)