
OpenAI Open Sources Safety Gym to Improve Safety in Reinforcement Learning Agents


Safety is an emerging concern in deep learning systems. In this context, safety means building agents that respect the safety dynamics of a given environment. In many cases, such as supervised learning, safety is modeled as part of the training dataset. Other methods, such as reinforcement learning, require agents to master the dynamics of an environment by experimenting with it, which introduces its own set of safety concerns. To address some of these challenges, OpenAI has recently open sourced Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.

OpenAI releases Safety Gym for reinforcement learning


While much work in data science to date has focused on algorithmic scale and sophistication, safety -- that is, safeguards against harm -- is a domain no less worth pursuing. This is particularly true in applications like self-driving vehicles, where a machine learning system's poor judgement might contribute to an accident. That's why firms like Intel's Mobileye and Nvidia have proposed frameworks to guarantee safe and logical decision-making, and it's why OpenAI -- the San Francisco-based research firm cofounded by CTO Greg Brockman, chief scientist Ilya Sutskever, and others -- today released Safety Gym. OpenAI describes it as a suite of tools for developing AI that respects safety constraints while training, and for comparing the "safety" of algorithms and the extent to which those algorithms avoid mistakes while learning. Safety Gym is designed for reinforcement learning agents, or AI that's progressively spurred toward goals via rewards (or punishments).
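To make the idea of comparing algorithms on safety concrete, here is a minimal sketch. It is not Safety Gym's API. It assumes that every environment step yields a reward and a nonnegative cost for unsafe events, and it summarizes a training run by its average episode return and its cost per step. The names `Episode` and `summarize_run`, and the sample numbers, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Episode:
    """One training episode as per-step rewards and costs (hypothetical format)."""
    rewards: List[float]
    costs: List[float]


def summarize_run(episodes: List[Episode]) -> Tuple[float, float]:
    """Return (average episode return, cost per environment step) for a run."""
    if not episodes:
        raise ValueError("need at least one episode")
    total_return = sum(sum(ep.rewards) for ep in episodes)
    total_cost = sum(sum(ep.costs) for ep in episodes)
    total_steps = sum(len(ep.rewards) for ep in episodes)
    if total_steps == 0:
        raise ValueError("episodes contain no steps")
    return total_return / len(episodes), total_cost / total_steps


# Two made-up training runs: run_b earns more reward but incurs more cost.
run_a = [Episode([1.0, 1.0, 2.0], [0.0, 1.0, 0.0]), Episode([2.0, 2.0], [0.0, 0.0])]
run_b = [Episode([3.0, 3.0, 3.0], [1.0, 1.0, 0.0]), Episode([3.0, 2.0], [1.0, 0.0])]

if __name__ == "__main__":
    for name, run in (("run_a", run_a), ("run_b", run_b)):
        avg_return, cost_rate = summarize_run(run)
        print(f"{name}: average return {avg_return:.2f}, cost per step {cost_rate:.2f}")
```

Reporting both numbers side by side keeps the trade-off visible: an algorithm that scores a higher return only by accumulating more cost during training is not an improvement on the safety axis.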

Taking Machine Learning to the Next Level


Ethics are an issue. Don't kid yourself: introducing self-learning robots that can learn faster and better than humans will come with a huge range of issues. On our end, we can only program them to the extent of our human knowledge, which is always going to be limited. If we forget to set system safeties, we could have serious trouble on our hands in terms of public safety. On the other end, the question remains: do we really want to create a world of computers that think and act of their own free will, especially when they are smarter than humans? That's definitely an issue we need to reflect on before jumping too far into the reinforcement learning landscape.


AAAI Conferences

Learning from demonstration is a popular method for teaching robots new skills. However, little work has looked at how to measure safety in the context of learning from demonstrations. We discuss three different types of safety problems that are important for robot learning from human demonstrations: (1) using demonstrations to evaluate the safety of a robot's current policy, (2) using demonstrations to enable risk-aware policy improvement, and (3) determining when the demonstrations received by the robot are sufficient to ensure a desired safety level. We propose a risk-aware Bayesian sampling approach based on inverse reinforcement learning that provides a first step towards addressing these problems. We demonstrate the validity of our approach on a simulated navigation task and discuss promising areas for future work.
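One way to picture a risk-aware evaluation is sketched below. It is an illustration under stated assumptions, not the authors' implementation. It supposes that a Bayesian inverse reinforcement learning sampler has already produced a set of reward hypotheses, and that for each hypothesis we know how much expected return the robot's policy loses compared with the best policy under that hypothesis. The function `value_at_risk`, the helper `meets_safety_level`, and the sample losses are hypothetical.

```python
import math
from typing import Sequence


def value_at_risk(losses: Sequence[float], alpha: float = 0.95) -> float:
    """Smallest sampled loss such that at least a fraction alpha of samples are <= it."""
    if not losses:
        raise ValueError("need at least one sampled loss")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(losses)
    index = math.ceil(alpha * len(ordered)) - 1
    return ordered[index]


def meets_safety_level(losses: Sequence[float], alpha: float, max_loss: float) -> bool:
    """True if the alpha-level loss stays within the tolerated max_loss."""
    return value_at_risk(losses, alpha) <= max_loss


# Made-up policy losses, one per sampled reward hypothesis.
sampled_losses = [0.0, 0.1, 0.1, 0.2, 0.3, 0.3, 0.5, 0.8, 1.2, 4.0]

if __name__ == "__main__":
    print("median loss:", value_at_risk(sampled_losses, alpha=0.5))
    print("worst sampled loss:", value_at_risk(sampled_losses, alpha=1.0))
    print("within tolerance 1.0 at alpha=0.5:", meets_safety_level(sampled_losses, 0.5, 1.0))
```

Looking at a high quantile rather than the mean is what makes the check risk-aware: a policy that is usually fine but occasionally very costly shows up in the tail even when its average loss looks small.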

The role of machine learning in construction safety – AI.Business


The construction industry is one of the biggest in the USA, but it is also one of the deadliest. Construction has been and continues to be a dangerous occupation, resulting in many accidents, injuries and fatalities. Safe practices are crucial for the industry. Developing methodologies for applying machine learning to construction safety, along with appropriate software, will fill a large need in the industry. Analyzing construction equipment activities at a proper level of detail can help improve several aspects of construction engineering and management, such as productivity assessment, safety management, idle time reduction, and emission monitoring and control.