Sequential Classification-Based Optimization for Direct Policy Search

Hu, Yi-Qi (Nanjing University) | Qian, Hong (Nanjing University) | Yu, Yang (Nanjing University)

Feb-14-2017–AAAI Conferences

Classification-based optimization is a recently developed framework for derivative-free optimization that has been shown to be effective for non-convex optimization problems with many local optima. This framework requires sampling a batch of solutions for every update of the search model. However, in reinforcement learning, direct policy search often allows only sequential policy evaluation. Thus, classification-based optimization is not efficient for direct policy search, where solutions have to be sampled sequentially. In this paper, we adapt classification-based optimization to sequentially sampled solutions by forming the batch from reused historical solutions. Experiments on a helicopter hovering control task and reinforcement learning benchmark tasks in OpenAI Gym show that the new algorithm is superior to state-of-the-art derivative-free optimization approaches.
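The idea of updating the search model after every single evaluation, with the batch formed from previously evaluated solutions, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names (`sequential_cbo`, `learn_box`), the axis-aligned box used as the classifier, the replace-the-worst rule, and all parameter defaults are assumptions chosen for illustration. In direct policy search, `f` would be something like the negated return of one policy rollout, with `x` as the policy parameters.

```python
import random


def contains(box, x):
    """True if point x lies inside the axis-aligned box."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(x, box))


def learn_box(anchor, negatives, bounds, rng, max_steps=1000):
    """Shrink the search space to a box that keeps `anchor` and excludes negatives."""
    box = [list(b) for b in bounds]
    inside = [n for n in negatives if contains(box, n)]
    steps = 0
    while inside and steps < max_steps:  # guard against inseparable points
        steps += 1
        k = rng.randrange(len(box))
        neg = rng.choice(inside)
        if neg[k] == anchor[k]:
            continue  # cannot separate on this dimension
        cut = rng.uniform(min(neg[k], anchor[k]), max(neg[k], anchor[k]))
        if neg[k] < anchor[k]:
            box[k][0] = max(box[k][0], cut)  # raise lower bound past the negative
        else:
            box[k][1] = min(box[k][1], cut)  # drop upper bound below the negative
        inside = [n for n in inside if contains(box, n)]
    return box


def sequential_cbo(f, bounds, budget=200, pool_size=20, n_pos=2,
                   explore=0.05, seed=0):
    """Minimize f using one evaluation per model update (hypothetical sketch)."""
    rng = random.Random(seed)

    def uniform():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    # Historical pool of (value, solution) pairs, best first.
    pool = sorted(((f(x), x) for x in (uniform() for _ in range(pool_size))),
                  key=lambda p: p[0])
    for _ in range(budget - pool_size):
        positives, negatives = pool[:n_pos], pool[n_pos:]
        if rng.random() < explore:
            x = uniform()  # occasional global sample
        else:
            _, anchor = rng.choice(positives)
            box = learn_box(anchor, [s for _, s in negatives], bounds, rng)
            x = [rng.uniform(lo, hi) for lo, hi in box]
        # Reuse history: insert the new solution and drop the worst one
        # (the replacement rule is an assumption of this sketch).
        pool.append((f(x), x))
        pool.sort(key=lambda p: p[0])
        pool.pop()
    return pool[0]
```

Calling `sequential_cbo(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 3)` returns the best `(value, solution)` pair found within the evaluation budget. Each loop iteration costs exactly one call to `f`, which is the property that matters when policies can only be evaluated one at a time.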

air transportation, algorithm, optimization problem


Country:
- Asia (0.28)
- North America > United States
  - Massachusetts > Middlesex County (0.14)

Industry:
- Transportation > Air (0.69)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (1.00)
  - Representation & Reasoning > Optimization (1.00)
