PolicyBoost: Functional Policy Gradient with Ranking-based Reward Objective

Yu, Yang (Nanjing University) | Da, Qing (Nanjing University)

Jul-22-2014–AAAI Conferences

Learning policies in nonlinear representations is an important step toward real-world applications of reinforcement learning in robotics. While functional representations have been widely applied in state-of-the-art supervised learning techniques (also known as boosting approaches) to adaptively learn nonlinear functions, boosting-style approaches have been little investigated in reinforcement learning. Only a few works have explored this direction, and these may suffer from the occurring-probability-pursuing problem. In this paper, to alleviate the problem, we propose employing a ranking-based objective function to guide the policy search in a function space, resulting in the PolicyBoost approach. Experimental results verify the effectiveness as well as the robustness of PolicyBoost.

artificial intelligence, machine learning, reinforcement learning

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Inductive Learning (0.53)
  - Reinforcement Learning (0.44)
