Diverse Exploration via InfoMax Options

Oct-6-2020–arXiv.org Artificial Intelligence

In this paper, we study the problem of autonomously discovering temporally abstracted actions, or options, for exploration in reinforcement learning. To learn diverse options suitable for exploration, we introduce the infomax termination objective, defined as the mutual information between options and their corresponding state transitions. We derive a scalable optimization scheme for maximizing this objective via the termination condition of options, yielding the InfoMax Option Critic (IMOC) algorithm. Through illustrative experiments, we empirically show that IMOC learns diverse options and uses them for exploration. Moreover, we show that IMOC scales well to continuous control tasks.
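As a rough illustration of the quantity the abstract names, the sketch below computes a plug-in estimate of the mutual information between a discrete option label and the state transition observed under it. It is not the IMOC algorithm or its termination-based optimization scheme, which the abstract does not spell out. The function name `mutual_information`, the sample layout `(option, (start_state, end_state))`, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in nats from a list of (x, y) samples."""
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)                 # counts of (x, y)
    left = Counter(x for x, _ in pairs)    # counts of x
    right = Counter(y for _, y in pairs)   # counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (left[x] * right[y]))
    return mi


# Hypothetical samples of (option, (start_state, end_state)).
# "diverse": each option always leads to its own distinct transition.
diverse = [(0, ("s0", "s1")), (1, ("s0", "s2"))] * 50
# "overlapping": both options produce the same mix of transitions.
overlapping = [
    (0, ("s0", "s1")), (0, ("s0", "s2")),
    (1, ("s0", "s1")), (1, ("s0", "s2")),
] * 25

mi_diverse = mutual_information(diverse)
mi_overlapping = mutual_information(overlapping)
```

In this toy setting, options that lead to distinct transitions reach the maximum value for two equally likely options (log 2 nats), while options whose transitions are indistinguishable score zero. This is the intuition behind using such an objective to encourage diverse options.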
