Goal-Conditioned Generators of Deep Policies

Faccio, Francesco; Herrmann, Vincent; Ramesh, Aditya; Kirsch, Louis; Schmidhuber, Jürgen

Jul-4-2022, arXiv.org Machine Learning

Goal-conditioned Reinforcement Learning (RL) aims at learning optimal policies, given goals encoded in special command inputs. Here we study goal-conditioned neural nets (NNs) that learn to generate deep NN policies in the form of context-specific weight matrices, similar to Fast Weight Programmers and other methods from the 1990s. Using context commands of the form "generate a policy that achieves a desired expected return," our NN generators combine powerful exploration of parameter space with generalization across commands to iteratively find better and better policies. A form of weight-sharing HyperNetworks and policy embeddings scales our method to generate deep NNs. Experiments show how a single learned policy generator can produce policies that achieve any return seen during training. Finally, we evaluate our algorithm on a set of continuous control tasks where it exhibits competitive performance.
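
The core idea in the abstract, a network that takes a desired-return command and outputs the weight matrices of a policy, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the untrained random generator, and the names `init_generator`, `generate_policy`, and `act` are all assumptions chosen for illustration, and the weight-sharing HyperNetwork, policy embeddings, and training loop described in the paper are omitted.

```python
import numpy as np

# Hypothetical sizes for a tiny policy network (assumptions, not from the paper).
OBS_DIM, HIDDEN, ACT_DIM = 3, 4, 2
N_PARAMS = OBS_DIM * HIDDEN + HIDDEN * ACT_DIM  # total policy weights to generate


def init_generator(rng, gen_hidden=16):
    """Randomly initialise a one-hidden-layer generator (untrained)."""
    return {
        "W1": rng.normal(0.0, 0.5, (1, gen_hidden)),
        "b1": np.zeros(gen_hidden),
        "W2": rng.normal(0.0, 0.5, (gen_hidden, N_PARAMS)),
        "b2": np.zeros(N_PARAMS),
    }


def generate_policy(gen, desired_return):
    """Map a scalar return command to the weight matrices of a policy."""
    command = np.array([[desired_return]], dtype=float)
    h = np.tanh(command @ gen["W1"] + gen["b1"])
    flat = (h @ gen["W2"] + gen["b2"]).ravel()
    # Split the flat output into the two weight matrices of the policy.
    w_in = flat[: OBS_DIM * HIDDEN].reshape(OBS_DIM, HIDDEN)
    w_out = flat[OBS_DIM * HIDDEN:].reshape(HIDDEN, ACT_DIM)
    return w_in, w_out


def act(policy, obs):
    """Run the generated policy: observation -> action in [-1, 1]."""
    w_in, w_out = policy
    return np.tanh(np.tanh(obs @ w_in) @ w_out)


rng = np.random.default_rng(0)
gen = init_generator(rng)
policy_low = generate_policy(gen, 0.1)   # command: low desired return
policy_high = generate_policy(gen, 1.0)  # command: high desired return
obs = np.ones(OBS_DIM)
action = act(policy_high, obs)
```

In this sketch, different commands yield different policy weights from the same generator. Training the generator so that the generated policy actually achieves the commanded return is the part the paper addresses and is not shown here.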

generator, machine learning, reinforcement learning

Country:
- Europe > Switzerland (0.04)
- North America > United States
  - New York > New York County
    - New York City (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
  - California > San Diego County
    - San Diego (0.04)
- Asia > Middle East
  - Saudi Arabia > Mecca Province > Thuwal (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks (1.00)
