AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
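
The trial-and-error idea in the quotation can be illustrated with a multi-armed bandit. The following is a minimal sketch, not code from the book: the function name `run_bandit`, the Gaussian reward model, the arm means, and the epsilon-greedy rule are all assumptions chosen for illustration.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy action selection on a Gaussian multi-armed bandit."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running estimate of each arm's mean reward
    counts = [0] * n_arms       # number of times each arm has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random action
        else:
            # exploit: pick the action with the highest current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)  # noisy reward signal
        counts[arm] += 1
        # incremental sample-average update of the estimate
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Hypothetical arm means; the learner is never told which arm is best.
estimates, counts = run_bandit([0.0, 0.5, 2.0])
```

The learner only observes rewards for the actions it actually tries, so the occasional random choice is what lets it discover the better arm.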

Online 3D Bin Packing with Constrained Deep Reinforcement Learning

Zhao, Hang, She, Qijin, Zhu, Chenyang, Yang, Yin, Xu, Kai

arXiv.org Machine Learning, Jun-26-2020

We solve a challenging yet practically useful variant of the 3D Bin Packing Problem (3D-BPP). In our problem, the agent has limited information about the items to be packed into the bin, and an item must be packed immediately after its arrival without buffering or readjusting. The item's placement is also subject to the constraints of collision avoidance and physical stability. We formulate this online 3D-BPP as a constrained Markov decision process. To solve the problem, we propose an effective and easy-to-implement constrained deep reinforcement learning (DRL) method under the actor-critic framework. In particular, we introduce a feasibility predictor to predict the feasibility mask for the placement actions and use it to modulate the action probabilities output by the actor during training. Such supervision and transformation of DRL help the agent learn feasible policies efficiently. Our method can also be generalized, e.g., to handle lookahead or items with different orientations. We have conducted an extensive evaluation showing that the learned policy significantly outperforms the state-of-the-art methods. A user study suggests that our method attains human-level performance.
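
One common way to let a feasibility mask modulate an actor's action probabilities is to zero out infeasible actions and renormalize. The sketch below illustrates that general technique only; it is not the paper's implementation. The function name `masked_action_probs` is hypothetical, and the mask is supplied by hand here, whereas the abstract describes a learned predictor.

```python
import math

def masked_action_probs(logits, feasible):
    """Softmax over actor logits, restricted to feasible actions."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    # zero the weight of every action the mask marks as infeasible
    masked = [e if ok else 0.0 for e, ok in zip(exps, feasible)]
    total = sum(masked)
    if total == 0.0:
        raise ValueError("no feasible action")
    return [p / total for p in masked]  # renormalize to sum to 1

# Four hypothetical placement actions; the second and fourth are infeasible.
probs = masked_action_probs([2.0, 1.0, 0.5, 0.1], [True, False, True, False])
```

Infeasible placements receive probability zero, while the relative preferences among the feasible ones are preserved.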

artificial intelligence, machine learning, reinforcement learning

arXiv:2006.14978

Country: North America > United States > New York > Suffolk County > Stony Brook (0.04)

Genre:

Research Report > Promising Solution (0.34)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Q-Learning with Differential Entropy of Q-Tables

Nguyen, Tung D., Kasmarik, Kathryn E., Abbass, Hussein A.

arXiv.org Machine Learning, Jun-26-2020

It is well known that information loss can occur in the classic and simple Q-learning algorithm. Entropy-based policy search methods were introduced to replace Q-learning and to design algorithms that are more robust against information loss. We conjecture that the reduction in performance during prolonged training sessions of Q-learning is caused by a loss of information, which is non-transparent when only examining the cumulative reward without changing the Q-learning algorithm itself. We introduce Differential Entropy of Q-tables (DE-QT) as an external information loss detector to the Q-learning algorithm. The behaviour of DE-QT over training episodes is analyzed to find an appropriate stopping criterion during training. The results reveal that DE-QT can detect the most appropriate stopping point, where a balance between a high success rate and a high efficiency is met for the classic Q-learning algorithm.
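
As a rough illustration of the ingredients named in the abstract, the sketch below pairs the standard tabular Q-learning update with an entropy measurement over the Q-table. The entropy estimator is an assumption: it fits a single Gaussian to all table entries and uses the closed form 0.5 * ln(2 * pi * e * variance). The paper's DE-QT estimator may differ, and the function names are hypothetical.

```python
import math
import statistics

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: Q(s,a) += alpha * (TD target - Q(s,a))."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])

def gaussian_differential_entropy(q):
    """Differential entropy (nats) of a Gaussian fitted to all Q-table entries."""
    values = [v for row in q for v in row]
    var = statistics.pvariance(values)
    if var <= 0.0:
        return float("-inf")  # degenerate case: all entries are equal
    return 0.5 * math.log(2 * math.pi * math.e * var)

# Toy table with 2 states and 2 actions, updated once.
q = [[0.0, 0.0], [0.0, 0.0]]
q_update(q, state=0, action=1, reward=1.0, next_state=1)
h = gaussian_differential_entropy(q)
```

Tracking a quantity like `h` after each episode gives a signal that is external to the update rule, which is the role the abstract assigns to DE-QT.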

artificial intelligence, machine learning, reinforcement learning

arXiv:2006.14795

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Oceania > Australia > New South Wales (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Robots Learning to Move like Animals

#artificialintelligence, Jun-25-2020, 20:12:16 GMT

Whether it's a dog chasing after a ball, or a monkey swinging through the trees, animals can effortlessly perform an incredibly rich repertoire of agile locomotion skills. But designing controllers that enable legged robots to replicate these agile behaviors can be a very challenging task. The superior agility seen in animals, as compared to robots, might lead one to wonder: can we create more agile robotic controllers with less effort by directly imitating animals? In this work, we present a framework for learning robotic locomotion skills by imitating animals. Given a reference motion clip recorded from an animal (e.g. a dog), our framework uses reinforcement learning to train a control policy that enables a robot to imitate the motion in the real world.

artificial intelligence, machine learning, reinforcement learning

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.38)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.37)

Paraphrase Generation Using Deep Reinforcement Learning – Thought Leaders

#artificialintelligence, Jun-25-2020, 18:15:05 GMT

When writing or talking we've all wondered whether there is a better way of communicating an idea to others. What words should I use? How should I structure the thought? How are they likely to respond? At Phrasee, we spend a lot of time thinking about language – what works and what doesn't.

artificial intelligence, machine learning, reinforcement learning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

7 Best Courses to Learn Artificial Intelligence in 2020

#artificialintelligence, Jun-25-2020, 14:46:16 GMT

This is another awesome course by Kirill Eremenko and his SuperDataScience Team on how to solve real-world business problems with AI. If you are a business person or just curious about how AI can help you, then you should join this course. The complex topic of Artificial Intelligence and Machine Learning is presented as well as it can be without getting too technical. I highly recommend it to business professionals trying to improve their skill set and help their business use AI. As for social proof, this course is trusted by more than 14,000 students and has an average rating of 4.3, which is strong evidence that this is a great course.

artificial intelligence, machine learning, reinforcement learning

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (1.00)
Education > Educational Setting > Online (0.83)
Education > Educational Technology > Educational Software > Computer Based Training (0.33)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.33)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Using Selective Attention in Reinforcement Learning Agents

#artificialintelligence, Jun-25-2020, 07:47:54 GMT

Posted by Yujin Tang, Research Software Engineer, and David Ha, Staff Research Scientist, Google Research, Tokyo. Inattentional blindness ...

artificial intelligence, machine learning, reinforcement learning agent

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.24)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Noise, overestimation and exploration in Deep Reinforcement Learning

Stekolshchik, Rafael

arXiv.org Machine Learning, Jun-25-2020

We discuss some phenomena related to statistical noise that have been investigated by different authors in the framework of Deep Reinforcement Learning algorithms. The following algorithms are covered: DQN, Double DQN, DDPG, TD3, and Hill-Climbing. First, we consider overestimation, which is the harmful property resulting from noise. Then we deal with noise used for exploration, which is the useful noise. We discuss setting the noise parameter in TD3 for typical PyBullet environments associated with articulated bodies, such as HopperBulletEnv and Walker2DBulletEnv. In the appendix, in relation to the Hill-Climbing algorithm, we look at one more example of noise: adaptive noise.
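
The overestimation effect mentioned in the abstract can be reproduced in a few lines: taking the maximum over noisy value estimates is biased upward even when every true value is zero. The sketch below is an illustrative simulation, not an experiment from the paper. The function name, the Gaussian noise model, and the parameter values are assumptions. The second quantity uses the double-estimator idea behind Double Q-learning and Double DQN: select the action with one set of estimates and evaluate it with an independent set.

```python
import random

def overestimation_demo(n_actions=10, trials=20000, noise_std=1.0, seed=0):
    """Average bias of max-based and double-estimator targets (true values are 0)."""
    rng = random.Random(seed)
    single_total, double_total = 0.0, 0.0
    for _ in range(trials):
        # every true action value is 0, so the estimates are pure noise
        est_a = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
        est_b = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
        # single estimator: maximum over the noisy estimates
        single_total += max(est_a)
        # double estimator: select with A, evaluate with independent B
        best = max(range(n_actions), key=lambda i: est_a[i])
        double_total += est_b[best]
    return single_total / trials, double_total / trials

single_bias, double_bias = overestimation_demo()
```

With these settings the max-based target comes out clearly positive, while the double-estimator target stays close to the true value of zero.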

artificial intelligence, machine learning, reinforcement learning

arXiv:2006.14167

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

SOAC: The Soft Option Actor-Critic Architecture

Li, Chenghao, Ma, Xiaoteng, Zhang, Chongjie, Yang, Jun, Xia, Li, Zhao, Qianchuan

arXiv.org Artificial Intelligence, Jun-25-2020

The option framework has shown great promise by automatically extracting temporally-extended sub-tasks from a long-horizon task. Methods have been proposed for concurrently learning low-level intra-option policies and a high-level option selection policy. However, existing methods typically suffer from two major challenges: ineffective exploration and unstable updates. In this paper, we present a novel and stable off-policy approach that builds on the maximum entropy model to address these challenges. Our approach introduces an information-theoretical intrinsic reward for encouraging the identification of diverse and effective options. Meanwhile, we utilize a probability inference model to simplify the optimization problem as fitting optimal trajectories. Experimental results demonstrate that our approach significantly outperforms prior on-policy and off-policy methods on a range of MuJoCo benchmark tasks while still providing benefits for transfer learning. In these tasks, our approach learns a diverse set of options, each of whose state-action space has strong coherence.

artificial intelligence, machine learning, reinforcement learning

arXiv:2006.14363

Genre: Research Report (0.70)

Industry:

Education (0.68)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)

Machine Learning - Redcrix Technologies (P) Ltd.

#artificialintelligence, Jun-24-2020, 14:48:20 GMT

Supervised machine learning algorithms can apply what has been learned in the past to new data using labeled examples to predict future events. Starting from the analysis of a known training dataset, the learning algorithm produces an inferred function to make predictions about the output values. The system is able to provide targets for any new input after sufficient training. The learning algorithm can also compare its output with the correct, intended output and find errors in order to modify the model accordingly. In contrast, unsupervised machine learning algorithms are used when the information used to train is neither classified nor labeled.

artificial intelligence, machine learning, reinforcement learning

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.37)

Learning to explore using active neural SLAM

AIHub, Jun-24-2020, 13:27:34 GMT

Advances in machine learning, computer vision and robotics have opened up avenues for building intelligent robots which can navigate in the physical world and perform complex tasks in our homes and offices. Exploration is a key challenge in building intelligent navigation agents. When an autonomous agent is dropped in an unseen environment, it needs to explore as much of the environment as possible, as quickly as possible. How do we go about training autonomous exploration agents? One popular approach is using end-to-end deep Reinforcement Learning (RL).

artificial intelligence, machine learning, reinforcement learning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.57)
