Can Reinforcement Learning help Robots become Intelligent?

#artificialintelligence

We know that robots today can accomplish a multitude of tasks like assembling parts, picking farm produce, doing a quick scan of surroundings, and greeting people at malls. But can they learn by themselves like primates? Scientists argue that since robotics is slowly approaching its peak stage, it will be hugely beneficial and exciting if the robots could learn on their own, from interactions with the physical and social environment. While AI and machine learning are doing their part in augmenting robotics, implementation is not simple, as most robots have a limited learning capacity. Though reinforcement learning (RL) is purported to be the simplest way to train robots, much work needs to be done.


This robot taught itself to walk entirely on its own

#artificialintelligence

Within 10 minutes of its birth, a baby fawn is able to stand. Within seven hours, it is able to walk. Between those two milestones, it engages in a highly adorable, highly frenetic flailing of limbs to figure it all out.


Py, Robot: Python And Reinforcement Learning

#artificialintelligence

Reinforcement learning is all about making robots adapt and learn about their environment on their own given only a simple reward function. It allows computers to learn how to excel at Atari and Pacman games and how to walk like we humans do. This article provides a well-written Python implementation of reinforcement learning through Q-learning, one of the most popular reinforcement learning methods.
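To make the idea concrete, here is a minimal sketch of tabular Q-learning in Python. It is not the implementation from the article above: the corridor environment, the function name `q_learning`, and all parameter values are assumptions chosen only for illustration. The agent receives a simple reward (1 on reaching the rightmost cell, 0 otherwise) and learns from that alone.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1; actions are 0 (left) and 1 (right).
    Reaching the rightmost state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # Moving left from state 0 leaves the agent where it is.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning update: bootstrap from the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy action for every non-terminal state (1 means "move right").
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
```

In this toy setting the greedy policy read off the learned table moves right in every non-terminal state, which is the shortest path to the reward.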


Dimensionality Reduced Reinforcement Learning for Assistive Robots

AAAI Conferences

State-of-the-art personal robots need to perform complex manipulation tasks to be viable in assistive scenarios. However, many of these robots, like the PR2, use manipulators with high degrees-of-freedom, and the problem is made worse in bimanual manipulation tasks. The complexity of these robots leads to high-dimensional state spaces, which are difficult to learn in. We reduce the state space by using demonstrations to discover a representative low-dimensional hyperplane in which to learn. This allows the agent to converge quickly to a good policy. We call this Dimensionality Reduced Reinforcement Learning (DRRL). However, when performing dimensionality reduction, not all dimensions can be fully represented. We extend this work by first learning in a single dimension, and then transferring that knowledge to a higher-dimensional hyperplane. By using our Iterative DRRL (IDRRL) framework with an existing learning algorithm, the agent converges quickly to a better policy by iterating to increasingly higher dimensions. IDRRL is robust to demonstration quality and can learn efficiently using few demonstrations. We show that adding IDRRL to the Q-Learning algorithm leads to faster learning on a set of mountain car tasks and the robot swimmers problem.
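The step of using demonstrations to find a low-dimensional subspace can be sketched as follows. The abstract does not say which dimensionality reduction method is used, so this example assumes principal component analysis, computed here with plain power iteration. The function names and the demonstration data are hypothetical. It covers only the single-dimension starting point, projecting full states onto the first principal direction of the demonstrated states.

```python
def principal_direction(points, iterations=200):
    """Return (mean, unit vector) for the first principal component of `points`.

    Uses power iteration on the covariance matrix. Assumes the points are not
    all identical, so the covariance matrix is non-zero.
    """
    n, dim = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    centered = [[p[i] - mean[i] for i in range(dim)] for p in points]
    # Covariance matrix (dim x dim) of the demonstrated states.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
           for i in range(dim)]
    vec = [1.0] * dim
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalize to unit length.
        vec = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    return mean, vec


def project(state, mean, direction):
    """Map a full state to its 1-D coordinate along the learned direction."""
    return sum((s - m) * d for s, m, d in zip(state, mean, direction))


# Hypothetical demonstration states that vary only along the (1, 1, 0) direction.
demos = [[t, t, 0.0] for t in (0.0, 1.0, 2.0, 3.0, 4.0)]
mean, direction = principal_direction(demos)
low_dim = project([4.0, 4.0, 0.0], mean, direction)
```

A learner such as Q-learning could then operate on the one-dimensional coordinate `low_dim` instead of the full state. Moving to higher-dimensional hyperplanes, as in IDRRL, would need further principal directions, which this sketch does not compute.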


Hybrid Reinforcement Learning and Its Application to Biped Robot Control

Neural Information Processing Systems

Advanced Technology R&D Center, Mitsubishi Electric Corporation, Amagasaki, Hyogo 661-0001, Japan

Abstract: A learning system composed of linear control modules, reinforcement learning modules and selection modules (a hybrid reinforcement learning system) is proposed for the fast learning of real-world control problems. The selection modules choose one appropriate control module dependent on the state. It learned the control on a sloped floor more quickly than the usual reinforcement learning because it did not need to learn the control on a flat floor, where the linear control module can control the robot. When it was trained by a 2-step learning (during the first learning step, the selection module was trained by a training procedure controlled only by the linear controller), it learned the control more quickly. The average number of trials (about 50) is so small that the learning system is applicable to real robot control.

1 Introduction

Reinforcement learning has the ability to solve general control problems because it learns behavior through trial-and-error interactions with a dynamic environment.