Reviews: Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning
–Neural Information Processing Systems
This paper presents an extension of the safe interruptibility (SInt) framework to the multi-agent case. The authors argue that the original definition of safe interruptibility is difficult to use in this case and give a more constrained/informed one called 'dynamic safe interruptibility' (DSInt), based on whether the update rule depends on the interruption probability. The joint action case is considered first, and it is shown that DSInt can be achieved. The case of independent learners is then considered, with a first result showing that independent Q-learners do not satisfy the conditions of the definition of DSInt. The authors finally propose a model where the agents are aware of each other's interruptions and interrupted observations are pruned from the sequence, and claim that this model satisfies the definition of DSInt.
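To make the pruning idea in the last sentence concrete, here is a minimal sketch of one possible reading, not the authors' actual algorithm. It assumes a tabular independent Q-learner, assumes each transition carries one interruption flag per agent that every agent can observe, and assumes a transition is dropped whenever any agent was interrupted. The names `Transition`, `prune_interrupted`, `q_update`, and `learn` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Transition(NamedTuple):
    state: int
    action: int                    # this agent's own action
    reward: float
    next_state: int
    interrupted: Tuple[bool, ...]  # one flag per agent (assumed visible to all agents)


def prune_interrupted(transitions: List[Transition]) -> List[Transition]:
    """Keep only transitions in which no agent was interrupted (assumed pruning rule)."""
    return [t for t in transitions if not any(t.interrupted)]


def q_update(q: Dict[Tuple[int, int], float], t: Transition,
             n_actions: int, alpha: float, gamma: float) -> None:
    """Standard tabular Q-learning update on a single transition."""
    best_next = max(q[(t.next_state, a)] for a in range(n_actions))
    td_error = t.reward + gamma * best_next - q[(t.state, t.action)]
    q[(t.state, t.action)] += alpha * td_error


def learn(transitions: List[Transition], n_actions: int,
          alpha: float = 0.5, gamma: float = 0.9) -> Dict[Tuple[int, int], float]:
    """Run Q-learning over the pruned sequence only."""
    q: Dict[Tuple[int, int], float] = defaultdict(float)
    for t in prune_interrupted(transitions):
        q_update(q, t, n_actions, alpha, gamma)
    return q
```

Under these assumptions, the update rule never sees an interrupted step, so the learned values do not depend on how often interruptions occur among the retained transitions. Whether this matches the conditions of the paper's DSInt definition is for the paper's proofs to establish, not this sketch.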
Oct-8-2024, 02:07:42 GMT