Reset-Free Lifelong Learning with Skill-Space Planning

Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

Dec-7-2020–arXiv.org Artificial Intelligence

The objective of lifelong reinforcement learning (RL) is to optimize agents which can continuously adapt and interact in changing environments. However, current RL approaches fail drastically when environments are non-stationary and interactions are non-episodic. We propose Lifelong Skill Planning (LiSP), an algorithmic framework for non-episodic lifelong RL based on planning in an abstract space of higher-order skills. We learn the skills in an unsupervised manner using intrinsic rewards and plan over the learned skills using a learned dynamics model. Moreover, our framework permits skill discovery even from offline data, thereby reducing the need for excessive real-world interactions. We demonstrate empirically that LiSP successfully enables long-horizon planning and learns agents that can avoid catastrophic failures even in challenging non-stationary and non-episodic environments derived from gridworld and MuJoCo benchmarks.
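
The idea of planning over skills with a dynamics model can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a simple random-shooting planner over a small discrete skill set, and it uses a hand-written 1-D transition function in place of a learned model. The names `plan_skills`, `toy_model`, and `toy_reward`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import random


def toy_model(state, skill):
    # Hypothetical stand-in for a learned skill-conditioned dynamics model:
    # executing skill z moves a 1-D state by z.
    return state + skill


def toy_reward(state):
    # Hypothetical task reward: stay close to the target position 3.0.
    return -abs(state - 3.0)


def rollout_return(state, plan, model, reward_fn):
    # Imagine executing a sequence of skills inside the model and sum rewards.
    total = 0.0
    for skill in plan:
        state = model(state, skill)
        total += reward_fn(state)
    return total


def plan_skills(state, model, reward_fn, horizon=5, num_candidates=200,
                skill_choices=(-1.0, 0.0, 1.0), rng=None):
    # Random shooting (an assumed, simplified planner): sample candidate
    # skill sequences, score each in the model, keep the best one.
    rng = rng or random.Random(0)
    best_plan, best_return = None, float("-inf")
    for _ in range(num_candidates):
        plan = [rng.choice(skill_choices) for _ in range(horizon)]
        ret = rollout_return(state, plan, model, reward_fn)
        if ret > best_return:
            best_plan, best_return = plan, ret
    return best_plan, best_return


if __name__ == "__main__":
    plan, ret = plan_skills(0.0, toy_model, toy_reward)
    print("first skill to execute:", plan[0], "predicted return:", ret)
```

In a non-episodic setting, a planner of this kind would typically be used in a receding-horizon loop: execute only the first skill of the best plan, observe the new state, and replan, so that the agent never relies on an environment reset.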

Country:
- North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (0.46)

Industry:
- Education > Educational Setting > Continuing Education (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (0.70)
  - Representation & Reasoning > Optimization (0.46)
