Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes

Tuyen, Le Pham, Vien, Ngo Anh, Layek, Abu, Chung, TaeChoong

May-11-2018–arXiv.org Artificial Intelligence

Abstract--In recent years, reinforcement learning has achieved many remarkable successes due to the growing adoption of deep learning techniques and the rapid growth in computing power . Nevertheless, it is well-known that flat reinforcement learning algorithms are often not able to learn well and data-efficient in tasks having hierarchical structures, e.g. Hierarchical reinforcement learning is a principled approach that is able to tackle these challenging tasks. On the other hand, many real-world tasks usually have only partial observability in which state measurements are often imperfect and partially observable. The problems of RL in such settings can be formulated as a partially observable Markov decision process (POMDP). In this paper, we study hierarchical RL in POMDP in which the tasks have only partial observability and possess hierarchical properties. We propose a hierarchical deep reinforcement learning approach for learning in hierarchical POMDP . The deep hierarchical RL algorithm is proposed to apply to both MDP and POMDP learning. We evaluate the proposed algorithm on various challenging hierarchical POMDP . Reinforcement Learning (RL) [1] is a subfield of machine learning focused on learning a policy in order to maximize total cumulative reward in an unknown environment. RL is divided into two approaches: value-based approach and policy-based approach [15]. A typical value-based approach tries to obtain an optimal policy by finding optimal value functions. The value functions are updated using the immediate reward and the discounted value of the next state. Some methods based on this approach are Q-learning, SARSA, and TD-learning [1]. In contrast, the policy-based approach directly learns a parameterized policy that maximizes the cumulative discounted reward.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

May-11-2018

arXiv.org PDF

Add feedback

Country:
- North America > United States > California (0.28)

Genre:
- Research Report (0.50)

Industry:
- Leisure & Entertainment > Games (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found