Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning

El Mahdi El Mhamdi, Rachid Guerraoui, Hadrien Hendrikx, Alexandre Maurer

Feb-14-2020, 04:56:58 GMT, Neural Information Processing Systems

In reinforcement learning, agents learn by performing actions and observing their outcomes. Sometimes it is desirable for a human operator to interrupt an agent in order to prevent dangerous situations from happening. Yet, as part of their learning process, agents may link these interruptions, which affect their reward, to specific states and deliberately avoid those states. The situation is particularly challenging in a multi-agent context because agents might learn not only from their own past interruptions, but also from those of other agents. Orseau and Armstrong defined safe interruptibility for one learner, but their work does not naturally extend to multi-agent systems.

artificial intelligence, machine learning, reinforcement learning, (6 more...)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning > Agents (1.00)