The Online Coupon-Collector Problem and Its Application to Lifelong Reinforcement Learning

Sep-21-2015–arXiv.org Artificial Intelligence

Transferring knowledge across a sequence of related tasks is an important challenge in reinforcement learning (RL). Despite much encouraging empirical evidence, there has been little theoretical analysis. In this paper, we study a class of lifelong RL problems: the agent solves a sequence of tasks modeled as finite Markov decision processes (MDPs), each of which is from a finite set of MDPs with the same state/action sets and different transition/reward functions. Motivated by the need for cross-task exploration in lifelong learning, we formulate a novel online coupon-collector problem and give an optimal algorithm. This allows us to develop a new lifelong RL algorithm, whose overall sample complexity in a sequence of tasks is much smaller than single-task learning, even if the sequence of tasks is generated by an adversary. Benefits of the algorithm are demonstrated in simulated problems, including a recently introduced human-robot interaction problem.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

Sep-21-2015

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.67)

Genre:
- Research Report (0.82)
- Workflow (0.74)

Industry:
- Education (0.34)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.34)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found