Unlocking the Power of Representations in Long-term Novelty-based Exploration

Saade, Alaa, Kapturowski, Steven, Calandriello, Daniele, Blundell, Charles, Sprechmann, Pablo, Sarra, Leopoldo, Groth, Oliver, Valko, Michal, Piot, Bilal

May-2-2023–arXiv.org Artificial Intelligence

We introduce Robust Exploration via Clusteringbased Online Density Estimation (RECODE), a nonparametric method for novelty-based exploration that estimates visitation counts for clusters of states based on their similarity in a chosen embedding space. By adapting classical clustering to the nonstationary setting of Deep RL, RECODE can efficiently track state visitation counts over thousands of episodes. We further propose a novel generalization of the inverse dynamics loss, which leverages masked transformer architectures for multi-step prediction; which in conjunction with RECODE achieves a new state-of-the-art in Figure 1: A key result of RECODE is that it allows us to a suite of challenging 3D-exploration tasks in leverage more powerful state representations for long-term DM-HARD-8. RECODE also sets new state-of-theart novelty estimation; enabling to achieve a new state-of-theart in hard exploration Atari games, and is the first in the challenging 3D task suite DM-HARD-8.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

May-2-2023

arXiv.org PDF

Add feedback

Country:
- Europe > Germany
  - Bavaria > Middle Franconia > Nuremberg (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Leisure & Entertainment > Games > Computer Games (0.54)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Reinforcement Learning (0.95)
    - Neural Networks > Deep Learning (0.48)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found