Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion

Jun-4-2020–arXiv.org Machine Learning

We introduce Wasserstein Adversarial Proximal Policy Optimization (WAPPO), a novel algorithm for visual transfer in Reinforcement Learning that explicitly learns to align the distributions of extracted features between a source and target task. WAPPO approximates and minimizes the Wasserstein-1 distance between the distributions of features from source and target domains via a novel Wasserstein Confusion objective. WAPPO outperforms the prior state-of-the-art in visual transfer and successfully transfers policies across Visual Cartpole and two instantiations of 16 OpenAI Procgen environments.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Machine Learning

Jun-4-2020

arXiv.org PDF

Add feedback

Country:
- North America > United States > Rhode Island > Providence County > Providence (0.04)

Genre:
- Research Report (0.50)

Industry:
- Leisure & Entertainment > Games > Computer Games (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Neural Networks > Deep Learning (0.49)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.46)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found