Reinforcement Learning

Designing societally beneficial Reinforcement Learning (RL) systems


Deep reinforcement learning (DRL) is transitioning from a research field focused on game playing to a technology with real-world applications. Notable examples include DeepMind's work on controlling a nuclear reactor and on improving YouTube video compression, and Tesla's attempt to use a method inspired by MuZero for autonomous vehicle behavior planning. But the exciting potential for real-world applications of RL should also come with a healthy dose of caution: for example, RL policies are well known to be vulnerable to exploitation, and methods for safe and robust policy development are an active area of research. As powerful RL systems emerge in the real world, the public and researchers are expressing an increased appetite for fair, aligned, and safe machine learning systems. To date, these research efforts have focused on accounting for shortcomings of datasets or supervised learning practices that can harm individuals.

Artificial Intelligence Free Structure Metasurface Optimization


A metasurface is a nano-optical device that achieves unprecedented properties of light using a structure much smaller than the wavelength of light. Nano-optical devices control the characteristics of light at the micro level, and can be used for LiDAR beam steering in autonomous driving, ultra-high-resolution imaging, control of the optical properties of light-emitting devices used in displays, and hologram generation. Recently, as the performance expected of nano-optical devices has increased, so has interest in optimizing devices with a free structure in order to far exceed the performance of earlier device structures. This is the first case of applying reinforcement learning to solve a problem with a design space as large as that of a free structure.

Deep reinforcement learning for self-tuning laser source of dissipative solitons - Scientific Reports


The increasing complexity of modern laser systems, which mostly originates from the nonlinear dynamics of radiation, makes control of their operation more and more challenging, calling for the development of new approaches in laser engineering. Machine learning methods, which provide proven tools for identification, control, and data analytics of various complex systems, have recently been applied to mode-locked fiber lasers with a special focus on three key areas: self-starting, system optimization and characterization. However, developing machine learning algorithms for a particular laser system, while an interesting research problem, is a demanding task that requires arduous effort and the tuning of a large number of hyper-parameters in laboratory arrangements. It is not obvious that this learning can be smoothly transferred to systems that differ, by design or through varying environmental parameters, from the specific laser used for the algorithm development. Here we demonstrate that a deep reinforcement learning (DRL) approach, based on trial and error and sequential decisions, can be successfully used to control the generation of dissipative solitons in a mode-locked fiber laser system. We have shown the capability of a deep Q-learning algorithm to generalize knowledge about the laser system in order to find conditions for stable pulse generation. The region of stable generation was transformed by changing the pumping power of the laser cavity, while a tunable spectral filter was used as a control tool. The deep Q-learning algorithm is suited to learning the trajectory of spectral filter adjustments that leads to a stable pulsed regime, relying on the state of the output radiation. Our results confirm the potential of deep reinforcement learning algorithms to control a nonlinear laser system with feedback. We also demonstrate that fiber mode-locked laser systems generating data at high speed make fruitful photonic test-beds for various machine learning concepts based on large datasets.
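
To make the Q-learning idea in the abstract above concrete, here is a minimal sketch in Python. It is not the paper's method: the paper uses a deep Q-network on a real laser, whereas this toy uses a lookup table. The ten discrete filter settings, the "stable" settings 6 and 7, the reward of 1 and all hyper-parameters are assumptions made up for illustration.

```python
import random

# Hypothetical toy problem: none of these numbers come from the paper.
N_SETTINGS = 10      # discrete spectral-filter settings 0..9
STABLE = {6, 7}      # settings assumed to give a stable pulsed regime
ACTIONS = (-1, +1)   # step the filter setting down or up


def step(state, action):
    """Move the filter one step; reward 1.0 when a stable setting is reached."""
    nxt = min(max(state + action, 0), N_SETTINGS - 1)
    done = nxt in STABLE
    return nxt, (1.0 if done else 0.0), done


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_SETTINGS)]
    starts = [s for s in range(N_SETTINGS) if s not in STABLE]
    for _ in range(episodes):
        state = rng.choice(starts)
        for _ in range(50):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore
            else:
                a = max((0, 1), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_action(q, state):
    """Return the filter step (-1 or +1) with the highest learned value."""
    return ACTIONS[max((0, 1), key=lambda i: q[state][i])]
```

After training, `greedy_action(train(), 0)` should point up towards the stable settings and `greedy_action(train(), 9)` should point down. A deep Q-learning agent replaces the table `q` with a neural network that maps the observed state of the output radiation to action values.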

Learning Locomotion Skills Safely in the Real World


Posted by Jimmy (Tsung-Yen) Yang, Student Researcher, Robotics at Google The promise of deep reinforcement learning (RL) in solving comp...

Machine learning program for games inspires development of groundbreaking scientific tool


We learn new skills by repetition and reinforcement learning. Through trial and error, we repeat actions leading to good outcomes, try to avoid bad outcomes and seek to improve those in between. Researchers are now designing algorithms based on a form of artificial intelligence that uses reinforcement learning. They are applying them to automate chemical synthesis and drug discovery, and even to play games like chess and Go. Scientists at the U.S. Department of Energy's (DOE) Argonne National Laboratory have developed a reinforcement learning algorithm for yet another application.

Offline RL made easier: no TD learning, advantage reweighting, or transformers


A demonstration of the RvS policy we learn with just supervised learning and a depth-two MLP. It uses no TD learning, advantage reweighting, or Transformers! Offline reinforcement learning (RL) is conventionally approached using value-based methods based on temporal difference (TD) learning. Algorithms that instead treat RL as supervised learning learn conditional policies by conditioning on goal states (Lynch et al., 2019; Ghosh et al., 2021), reward-to-go (Kumar et al., 2019; Chen et al., 2021), or language descriptions of the task (Lynch and Sermanet, 2021). We find the simplicity of these methods quite appealing.
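
As a rough illustration of what conditioning on reward-to-go can look like, the sketch below turns one logged trajectory into supervised training pairs. The function names, the `gamma` parameter and the data layout are assumptions for illustration, not the authors' code.

```python
def reward_to_go(rewards, gamma=1.0):
    """Return the (optionally discounted) sum of future rewards at each step."""
    out = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out


def make_supervised_pairs(states, actions, rewards):
    """Build ((state, reward_to_go), action) examples from one trajectory."""
    rtg = reward_to_go(rewards)
    return [((s, g), a) for s, g, a in zip(states, rtg, actions)]
```

A policy network can then be fitted to these pairs with an ordinary supervised loss. At test time it would be conditioned on a desired reward-to-go instead of an observed one.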

IBM's AutoAI Has The Smarts To Make Data Scientists A Lot More Productive – But What's Scary Is That It's Getting A Whole Lot Smarter


I recently had the opportunity to discuss current IBM artificial intelligence developments with Dr. Lisa Amini, an IBM Distinguished Engineer and the Director of IBM Research Cambridge, home to the MIT-IBM Watson AI Lab. Dr. Amini was previously Director of Knowledge & Reasoning Research in the Cognitive Computing group at IBM's TJ Watson Research Center in New York. Dr. Amini earned her Ph.D. degree in Computer Science from Columbia University. Dr. Amini and her team are part of IBM Research tasked with creating the next generation of Automated AI and data science. I was interested in automation's impact on the lifecycles of artificial intelligence and machine learning and centered our discussion around next-generation capabilities for AutoAI. AutoAI automates the highly complex process of finding and optimizing the best ML model, features, and model hyperparameters for your data.

Deep Reinforcement Learning for Solving Rubik's Cube


The Rubik's Cube is a famous 3-D puzzle toy. A regular Rubik's Cube has six faces, each of which has nine coloured stickers, and the puzzle is solved when each face shows a single colour. If we count one quarter (90°) turn as one move and two quarter turns (a half turn) as two moves, the best human-invented algorithms can solve any instance of the cube in 26 moves. My target is to let the computer learn how to solve the Rubik's Cube without feeding it any human knowledge, such as the symmetry of the cube. The most challenging part is that the Rubik's Cube has 43,252,003,274,489,856,000 possible permutations.
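
The permutation count quoted above can be checked with the standard counting argument, which is not taken from the post itself. The 8 corners can be arranged in 8! ways with 3^7 independent orientations, and the 12 edges in 12! ways with 2^11 independent orientations. Only half of the combined corner and edge arrangements are reachable, which gives the final division by 2. The function name below is made up for illustration.

```python
from math import factorial


def cube_state_count():
    """Count the reachable states of a standard 3x3x3 Rubik's Cube."""
    corner_positions = factorial(8)    # arrangements of the 8 corners
    corner_orientations = 3 ** 7       # the last corner's twist is forced
    edge_positions = factorial(12)     # arrangements of the 12 edges
    edge_orientations = 2 ** 11        # the last edge's flip is forced
    total = (corner_positions * corner_orientations
             * edge_positions * edge_orientations)
    return total // 2                  # parity constraint halves the count


print(f"{cube_state_count():,}")  # prints 43,252,003,274,489,856,000
```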

A newcomer's guide to #ICRA2022: Tutorials


I believe that one of the best ways to get the training you need for the robotics job market is to attend tutorials at conferences like ICRA. Unlike workshops, where you might listen to work-in-progress talks, workshop paper presentations and panel discussions, tutorials are exactly what they sound like. They aim to give you hands-on learning sessions on technical tools and skills with specific learning objectives. As such, most tutorials expect you to come prepared to actively participate and follow along. For instance, the "Tools for Robotic Reinforcement Learning" tutorial expects you to know how to code in Python and to have basic knowledge of reinforcement learning, because you will use those skills in the hands-on sessions. There are seven tutorials this year.

Hands-on reinforcement learning course -- part 1


Let's walk this beautiful path from the fundamentals to cutting-edge reinforcement learning (RL), step-by-step, with coding examples and tutorials in Python, together! This first part covers the bare minimum of concepts and theory you need to embark on this journey. Then, in each following chapter, we will solve a different problem, with increasing difficulty. Ultimately, the most complex RL problems involve a mixture of reinforcement learning algorithms, optimization, and deep learning (DL). You do not need to know deep learning to follow along with this course.