Multi-objective Model-based Policy Search for Data-efficient Learning with Sparse Rewards
Rituraj Kaushik, Konstantinos Chatzilygeroudis, Jean-Baptiste Mouret
arXiv.org Artificial Intelligence
The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. However, current algorithms lack an effective exploration strategy to deal with sparse or misleading reward scenarios: if they do not experience any state with a positive reward during the initial random exploration, they are very unlikely to solve the problem. Here, we propose a novel model-based policy search algorithm, Multi-DEX, that leverages a learned dynamical model to efficiently explore the task space and solve tasks with sparse rewards in a few episodes. To achieve this, we frame the policy search problem as a multi-objective, model-based policy optimization problem with three objectives: (1) generate maximally novel state trajectories, (2) maximize the expected return, and (3) keep the system in state-space regions for which the model is as accurate as possible. We then optimize these objectives using a Pareto-based multi-objective optimization algorithm. The experiments show that Multi-DEX is able to solve sparse-reward scenarios (with a simulated robotic arm) with much less interaction time than VIME, TRPO, GEP-PG, CMA-ES and Black-DROPS.
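To make the optimization loop concrete, here is a minimal Python sketch of the core idea: score a population of candidate policies on the three objectives under a learned model and keep the Pareto-optimal (non-dominated) set. Everything below is an illustrative assumption rather than the authors' implementation: the toy point-mass dynamics standing in for the learned model, the linear `policy`, the `GOAL`, the novelty and model-confidence measures, and the naive non-dominated sort are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, ACTION_DIM, HORIZON = 2, 2, 30
GOAL = np.array([1.0, 1.0])  # hypothetical sparse-reward target

def learned_model(s, u):
    # Stand-in for the learned dynamical model: a toy point-mass integrator.
    return s + 0.1 * u

def policy(theta, s):
    # Hypothetical linear policy with squashed outputs: u = tanh(W s + b).
    W = theta[: STATE_DIM * ACTION_DIM].reshape(ACTION_DIM, STATE_DIM)
    b = theta[STATE_DIM * ACTION_DIM :]
    return np.tanh(W @ s + b)

def rollout(theta):
    # Roll the policy out under the (stand-in) model, not the real robot.
    s = np.zeros(STATE_DIM)
    traj = [s]
    for _ in range(HORIZON):
        s = learned_model(s, policy(theta, s))
        traj.append(s)
    return np.asarray(traj)

def objectives(traj, archive):
    d = np.linalg.norm(traj[:, None, :] - archive[None, :, :], axis=-1)
    # (1) Novelty: mean nearest-neighbour distance of the trajectory's
    #     states to an archive of previously observed states.
    novelty = d.min(axis=1).mean()
    # (2) Sparse return: +1 for every step within 0.2 of the goal.
    ret = float(np.sum(np.linalg.norm(traj - GOAL, axis=1) < 0.2))
    # (3) Model-confidence stand-in: stay close to states the model has
    #     seen (the archive), where its predictions are assumed reliable.
    confidence = -d.min(axis=1).max()
    return np.array([novelty, ret, confidence])

def pareto_front(scores):
    # Indices of non-dominated candidates, all objectives maximized.
    front = []
    for i, si in enumerate(scores):
        dominated = any(
            np.all(sj >= si) and np.any(sj > si)
            for j, sj in enumerate(scores) if j != i
        )
        if not dominated:
            front.append(i)
    return front

archive = rng.normal(0.0, 0.3, size=(50, STATE_DIM))  # past observed states
population = rng.normal(0.0, 1.0, size=(64, STATE_DIM * ACTION_DIM + ACTION_DIM))
scores = np.array([objectives(rollout(th), archive) for th in population])
front = pareto_front(scores)
print(f"{len(front)} Pareto-optimal policies out of {len(population)}")
```

In a full loop, a policy picked from the front would then be executed on the real system, the observed transitions appended to the model's training data and the state archive, and the multi-objective optimization repeated, which is the alternation the abstract describes.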
Jun-25-2018
- Country:
  - North America > United States (0.46)
- Genre:
  - Research Report > Promising Solution (0.34)
- Industry:
  - Energy > Oil & Gas > Upstream (0.66)
  - Leisure & Entertainment (0.67)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning > Evolutionary Systems (1.00)
    - Representation & Reasoning
      - Optimization (1.00)
      - Search (0.89)
    - Robots (1.00)