"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
Two years ago, OpenAI released Safety Gym, a suite of environments and tools for measuring progress toward reinforcement learning agents that respect safety constraints while training. Safety Gym has use cases across the reinforcement learning ecosystem, and the open-source release is available on GitHub, where researchers and developers can get started with just a few lines of code. In this article, we explore some alternative environments, tools and libraries researchers can use to train machine learning models. One such suite is AI Safety Gridworlds, a set of reinforcement learning environments illustrating various safety properties of intelligent agents.
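Safety Gym environments follow the standard Gym-style reset/step interface, with constraint violations reported through a cost signal alongside the reward. The sketch below imitates that interface with a tiny invented stand-in environment so it runs standalone; the class name, horizon, and cost rule are all illustrative assumptions, not Safety Gym's actual dynamics.

```python
# A minimal sketch of the Gym-style loop that Safety Gym environments follow.
# ToyConstrainedEnv is an invented stand-in so the snippet runs standalone;
# with safety-gym installed you would create a real environment with
# gym.make(...) instead of this class.
class ToyConstrainedEnv:
    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0                      # observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action > 0 else 0.0
        # Safety Gym reports constraint violations via a cost signal in info.
        info = {"cost": 1.0 if action > 0.9 else 0.0}
        done = self.t >= self.horizon
        return 0.0, reward, done, info

env = ToyConstrainedEnv()
obs = env.reset()
total_reward, total_cost, done = 0.0, 0.0, False
while not done:
    action = 0.5                        # placeholder for a learned policy
    obs, reward, done, info = env.step(action)
    total_reward += reward
    total_cost += info["cost"]
```

A constrained-RL algorithm would then optimize total_reward subject to keeping total_cost below a threshold, rather than folding safety into the reward.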
"Is there scientific value in conducting empirical research in reinforcement learning when restricting oneself to small- to mid-scale environments?" Can research done on a smaller computational budget provide valuable scientific insights? Given today's enormous training times and budgets, it is natural to wonder whether anything worthwhile in AI comes at a small price. So far, researchers have focused on the training costs of language models, which have grown very large. But what about deep reinforcement learning (RL) algorithms, the brains behind autonomous cars, warehouse robots, and even the AI that beat chess grandmasters?
The thirty-eighth International Conference on Machine Learning (ICML) is now underway and will run for the entirety of this week (18–24 July) in a virtual-only format. There will be five invited talks to enjoy, as well as workshops, tutorials, affinity events and socials. The workshops include:

- Challenges in Deploying and Monitoring Machine Learning Systems
- INNF: Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models
- ICML Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI
- Tackling Climate Change with Machine Learning
- Theory and Foundation of Continual Learning
- ICML 2021 Workshop on Unsupervised Reinforcement Learning
- Human-AI Collaboration in Sequential Decision-Making
- ICML Workshop on Representation Learning for Finance and E-Commerce Applications
- Reinforcement Learning for Real Life
- Uncertainty and Robustness in Deep Learning
- Interpretable Machine Learning in Healthcare
- 8th ICML Workshop on Automated Machine Learning (AutoML 2021)
- Theory and Practice of Differential Privacy
- The Neglected Assumptions in Causal Inference
- Machine Learning for Data: Automated Creation, Privacy, Bias
- ICML Workshop on Human in the Loop Learning (HILL)
- ICML Workshop on Algorithmic Recourse
- A Blessing in Disguise: The Prospects and Perils of Adversarial Machine Learning
- International Workshop on Federated Learning for User Privacy and Data Confidentiality in Conjunction with ICML 2021 (FL-ICML'21)
- Workshop on Socially Responsible Machine Learning
- ICML 2021 Workshop on Computational Biology
- Subset Selection in Machine Learning: From Theory to Applications
- Workshop on Computational Approaches to Mental Health @ ICML 2021
- Workshop on Distribution-Free Uncertainty Quantification
- Information-Theoretic Methods for Rigorous, Responsible, and Reliable Machine Learning (ITR3)
- Beyond First-Order Methods in Machine Learning Systems
- Self-Supervised Learning for Reasoning and Perception
- Time Series Workshop
- Workshop on Reinforcement Learning Theory
- Over-parameterization: Pitfalls and Opportunities
OpenAI has disbanded its robotics team after years of research into machines that can learn to perform tasks like solving a Rubik's Cube. Company cofounder Wojciech Zaremba quietly revealed on a podcast hosted by startup Weights & Biases that OpenAI has shifted its focus to other domains, where data is more readily available. "So it turns out that we can make a gigantic progress whenever we have access to data, and all our machine learning, unsupervised, and reinforcement learning -- they work extremely well, and there [are] actually plenty of domains that are very, very rich with data. And ultimately that was holding us back in terms of robotics," Zaremba said.
In the move toward climate security, electric power systems are undergoing a major paradigm shift with the wide integration of distributed energy resources such as solar PV, wind power, energy storage and electric vehicles. However, today's grid cannot handle the voltage rise and fast voltage fluctuations caused by high penetration of renewables, and the lack of adequate control mechanisms to regulate voltage is widely recognized as a key hindrance. The goal of this project is to use AI and deep reinforcement learning to advance current control designs by making them more data-driven and communication-efficient. Depending on the candidate's qualifications and scientific interests, the project can be directed towards smart grid optimization, AI algorithm development or hardware implementation.
What's the best way to arrange wells in an oil or gas field? It's a simple enough question, but the answer can be very complex. Now a Caltech/JPL spinoff is developing a new approach that blends traditional HPC simulation with deep reinforcement learning running on GPUs to optimize energy extraction. The well-placement game is a familiar one to oil and gas companies: for years, they have been using simulators running atop HPC systems to model underground reservoirs.
Deep reinforcement learning, a subfield of machine learning that combines reinforcement learning and deep learning, takes what's known as a reward function and learns to maximize the expected total reward. This works remarkably well, enabling systems to figure out how to solve Rubik's Cubes, beat world champions at chess, and more. But existing algorithms have a problem: They implicitly assume access to a perfect specification. In reality, tasks don't come prepackaged with rewards -- those rewards come from imperfect human reward designers. And it can be difficult to translate conceptual preferences into reward functions environments can calculate. To solve this problem, researchers at DeepMind and the University of California, Berkeley, have launched a competition called BASALT, where the goal of an AI system must be communicated through demonstrations, preferences, or some other form of human feedback.
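As a concrete, if toy, illustration of "learning to maximize the expected total reward," here is a tabular Q-learning sketch on a five-state chain. The state space, reward rule, and hyperparameters are all invented for illustration; deep RL replaces the Q-table with a neural network, but the objective is the same.

```python
import random

# Tabular Q-learning on a 5-state chain: the agent starts at state 0 and
# receives reward 1.0 only when it reaches the terminal goal state 4.
# All names and hyperparameters here are illustrative choices.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                 # move left / move right
alpha, gamma, eps = 0.5, 0.9, 0.1  # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for _ in range(500):               # training episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        # update Q toward immediate reward plus discounted future value
        future = 0.0 if s2 == GOAL else gamma * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + future - Q[(s, a)])
        s = s2

# The learned greedy policy should move right in every non-terminal state.
policy = {s: max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(GOAL)}
```

The reward-design problem BASALT targets shows up in exactly the line `r = 1.0 if s2 == GOAL else 0.0`: here the goal is trivially machine-checkable, but for tasks like "build a nice house in Minecraft" no such line can be written down, which is why the competition substitutes demonstrations and human feedback for a hand-coded reward.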