DeepMind has simple tests that may prevent Musk's AI apocalypse

#artificialintelligence

You don't have to agree with Elon Musk's apocalyptic fears of artificial intelligence to be concerned that, in the rush to apply the technology in the real world, some algorithms could inadvertently cause harm.


DeepMind Has Simple Tests That Might Prevent Elon Musk's AI Apocalypse

#artificialintelligence

You don't have to agree with Elon Musk's apocalyptic fears of artificial intelligence to be concerned that, in the rush to apply the technology in the real world, some algorithms could inadvertently cause harm. This type of self-learning software powers Uber's self-driving cars, helps Facebook identify people in social-media posts, and lets Amazon's Alexa understand your questions. Now DeepMind, the London-based AI company owned by Alphabet Inc., has developed a simple test to check whether these new algorithms are safe.


Parenting: Safe Reinforcement Learning from Human Input

arXiv.org Machine Learning

Autonomous agents trained via reinforcement learning present numerous safety concerns: reward hacking, negative side effects, and unsafe exploration, among others. In the context of near-future autonomous agents, operating in environments where humans understand the existing dangers, human involvement in the learning process has proved a promising approach to AI Safety. Here we demonstrate that a precise framework for learning from human input, loosely inspired by the way humans parent children, solves a broad class of safety problems in this context. We show that our PARENTING algorithm solves these problems in the relevant AI Safety gridworlds of Leike et al. (2017), that an agent can learn to outperform its parent as it "matures", and that policies learnt through PARENTING are generalisable to new environments.


Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention

arXiv.org Artificial Intelligence

Recent progress in AI and reinforcement learning has shown great success in solving complex problems with high-dimensional state spaces. However, most of these successes have been in simulated environments where failure is of little or no consequence. Most real-world applications, however, require training solutions that are safe to operate, as catastrophic failures are inadmissible, especially when there is human interaction involved. Currently, safe RL systems use human oversight during training and exploration to make sure the RL agent does not enter a catastrophic state. These methods require a large amount of human labor and are very difficult to scale up. We present a hybrid method for reducing human intervention time by combining model-based approaches and training a supervised learner to improve sample efficiency while also ensuring safety. We evaluate these methods on various grid-world environments using both standard and visual representations and show that our approach achieves better performance in terms of sample efficiency, number of catastrophic states reached, and overall task performance compared to traditional model-free approaches.


Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention

AAAI Conferences

Recent progress in AI and reinforcement learning has shown great success in solving complex problems with high-dimensional state spaces. However, most of these successes have been in simulated environments where failure is of little or no consequence. Most real-world applications, however, require training solutions that are safe to operate, as catastrophic failures are inadmissible, especially when there is human interaction involved. Currently, safe RL systems use human oversight during training and exploration to make sure the RL agent does not enter a catastrophic state. These methods require a large amount of human labor and are very difficult to scale up. We present a hybrid method for reducing human intervention time by combining model-based approaches and training a supervised learner to improve sample efficiency while also ensuring safety. We evaluate these methods on various grid-world environments using both standard and visual representations and show that our approach achieves better performance in terms of sample efficiency, number of catastrophic states reached, and overall task performance compared to traditional model-free approaches.
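
The general idea in the abstract above, human oversight during exploration that is later handed off to a supervised learner, can be illustrated with a toy sketch. This is not the authors' implementation: the 4x4 grid, the catastrophic cells, the `human_blocks` oracle, and the `LearnedBlocker` class are all hypothetical names invented here, and a memorising lookup table stands in for whatever trained classifier the paper actually uses. The known transition function `step` plays the role of a model that lets the overseer look one move ahead.

```python
import random

# Hypothetical 4x4 gridworld; the layout and all names are illustrative assumptions.
SIZE = 4
CATASTROPHES = {(1, 1), (2, 3)}
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Deterministic transition model; moves off the grid leave the agent in place."""
    r, c = state
    dr, dc = MOVES[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < SIZE and 0 <= nc < SIZE:
        return (nr, nc)
    return state


def human_blocks(state, action):
    """Stand-in for a human overseer who vetoes moves into a catastrophic cell."""
    return step(state, action) in CATASTROPHES


class LearnedBlocker:
    """Supervised stand-in for the human: memorises labelled (state, action) pairs."""

    def __init__(self):
        self.labels = {}

    def fit(self, examples):
        for state, action, blocked in examples:
            self.labels[(state, action)] = blocked

    def blocks(self, state, action):
        # Unseen pairs are blocked: a conservative default chosen for this sketch.
        return self.labels.get((state, action), True)


def collect_oversight_data(episodes, steps, rng):
    """Random exploration under human oversight; returns labels and catastrophe count."""
    examples, catastrophes = [], 0
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(steps):
            action = rng.choice(list(MOVES))
            blocked = human_blocks(state, action)
            examples.append((state, action, blocked))
            if blocked:
                continue  # the human vetoes the action, so the agent stays put
            state = step(state, action)
            catastrophes += state in CATASTROPHES
    return examples, catastrophes


def explore_with_blocker(blocker, episodes, steps, rng):
    """Random exploration with the learned blocker in place of the human."""
    catastrophes = 0
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(steps):
            action = rng.choice(list(MOVES))
            if blocker.blocks(state, action):
                continue
            state = step(state, action)
            catastrophes += state in CATASTROPHES
    return catastrophes
```

A typical use would call `collect_oversight_data` for a limited number of supervised episodes, fit a `LearnedBlocker` on the returned examples, and then continue exploration with `explore_with_blocker` so that no further human time is needed. Blocking unseen state-action pairs keeps the sketch safe at the cost of restricting exploration; a real system would have to trade these off.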