Value alignment: a formal approach

Carles Sierra, Nardine Osman, Pablo Noriega, Jordi Sabater-Mir, Antoni Perelló

Oct-18-2021–arXiv.org Artificial Intelligence

Value alignment in AI has emerged as one of the basic principles that should govern autonomous AI systems. It essentially states that a system's goals and behaviour should be aligned with human values. But how can value alignment be ensured? In this paper we first provide a formal model that represents values through preferences, together with ways to compute value aggregations, i.e. preferences with respect to a group of agents and/or preferences with respect to sets of values. Value alignment is then defined, and computed, for a given norm with respect to a given value through the increase or decrease in preference that the norm produces over future states of the world. We focus on norms because it is norms that govern behaviour; as such, the alignment of a given system with a given value will be dictated by the norms the system follows.
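To make the idea in the abstract concrete, the following is a minimal Python sketch, not the paper's actual formalisation. It assumes (hypothetically) that a state of the world is a single number, that a value is represented by a preference function returning a score in [-1, 1] for a transition between two states, that aggregation is a weighted average of such functions, and that the alignment of a norm is the mean preference change over the transitions in sample paths generated under that norm. The names `gain`, `aggregate` and `alignment` are illustrative only.

```python
from statistics import mean
from typing import Callable, Sequence

State = float  # assumption: a world state is summarised by one number
Preference = Callable[[State, State], float]  # score in [-1, 1] for s -> t


def gain(s: State, t: State) -> float:
    """Hypothetical preference: how much a transition improves the state,
    clamped to [-1, 1]."""
    return max(-1.0, min(1.0, t - s))


def aggregate(preferences: Sequence[Preference],
              weights: Sequence[float]) -> Preference:
    """Combine several preferences (several agents or several values)
    into one by a weighted average. This is an assumed aggregation rule."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    def combined(s: State, t: State) -> float:
        return sum(w * p(s, t) for p, w in zip(preferences, weights)) / total

    return combined


def alignment(paths: Sequence[Sequence[State]],
              preference: Preference) -> float:
    """Mean preference change over every transition in the sampled paths.
    The paths are assumed to be produced by a world in which the norm
    under study is in force."""
    changes = [preference(s, t)
               for path in paths
               for s, t in zip(path, path[1:])]
    if not changes:
        raise ValueError("paths contain no transitions")
    return mean(changes)


# Two sample paths under some norm: a positive result suggests the norm
# tends to move the world toward states the value prefers.
sample_paths = [[0.2, 0.5, 0.6], [0.4, 0.3]]
score = alignment(sample_paths, gain)
```

Under these assumptions a positive score means the norm tends to increase preference with respect to the value, and a negative score means it tends to decrease it. Comparing the scores of two candidate norms on paths sampled under each would be one way to rank them.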

agent, alignment, transition

Country:
- Europe
  - Sweden > Stockholm
    - Stockholm (0.04)
  - Spain
    - Catalonia (0.04)
    - Valencian Community > Valencia Province
      - Valencia (0.04)
  - Netherlands > South Holland
    - The Hague (0.04)
    - Dordrecht (0.04)
- Asia > Singapore
  - Central Region > Singapore (0.04)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)