AITopics | evolved policy gradient

Evolved Policy Gradients

Neural Information Processing Systems | Nov-20-2025, 22:27:40 GMT

We propose a metalearning approach for learning gradient-based reinforcement learning (RL) algorithms. The idea is to evolve a differentiable loss function, such that an agent, which optimizes its policy to minimize this loss, will achieve high rewards. The loss is parametrized via temporal convolutions over the agent's experience. Because this loss is highly flexible in its ability to take into account the agent's history, it enables fast task learning. Empirical results show that our evolved policy gradient algorithm (EPG) achieves faster learning on several randomized environments compared to an off-the-shelf policy gradient method. We also demonstrate that EPG's learned loss can generalize to out-of-distribution test time tasks, and exhibits qualitatively different behavior from other popular metalearning algorithms.
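The two-level structure described in this abstract (an outer loop that evolves the loss, an inner loop in which an agent follows the gradient of that loss) can be illustrated with a small NumPy sketch. Everything concrete here is an assumption made for illustration: the one-dimensional toy task, the three-parameter loss family `phi` (a hypothetical linear stand-in for the temporal-convolutional loss, chosen so that `phi = [-1, 0, 0]` reduces to a plain REINFORCE surrogate), the clipping of `theta`, and all hyperparameters. It is not the authors' implementation.

```python
import numpy as np

SIGMA = 0.5  # fixed exploration noise of the toy Gaussian policy (assumption)


def loss_grad(phi, theta, actions, rewards):
    """Gradient w.r.t. theta of the toy loss
    L_phi = mean((phi[0] * r + phi[1]) * log_prob(a; theta)) + phi[2] * theta**2.
    """
    dlogp = (actions - theta) / SIGMA**2  # d/dtheta of the Gaussian log-density
    return np.mean((phi[0] * rewards + phi[1]) * dlogp) + 2.0 * phi[2] * theta


def inner_loop(phi, goal, rng, steps=20, batch=16, lr=0.05):
    """Train a fresh policy by gradient descent on the loss defined by phi.
    Returns the reward obtained by the final policy mean."""
    theta = 0.0
    for _ in range(steps):
        actions = theta + SIGMA * rng.standard_normal(batch)
        rewards = -(actions - goal) ** 2
        theta = theta - lr * loss_grad(phi, theta, actions, rewards)
        theta = float(np.clip(theta, -5.0, 5.0))  # keeps the toy numerically bounded
    return -(theta - goal) ** 2


def evolve_loss(generations=30, population=16, es_sigma=0.1, es_lr=0.05, seed=0):
    """Outer loop: evolution strategies over the loss parameters phi."""
    rng = np.random.default_rng(seed)
    phi = np.zeros(3)
    for _ in range(generations):
        noise = rng.standard_normal((population, 3))
        returns = np.empty(population)
        for i in range(population):
            goal = rng.uniform(-1.0, 1.0)  # randomized task for each worker
            returns[i] = inner_loop(phi + es_sigma * noise[i], goal, rng)
        # Normalize returns, then take an ES step along the estimated gradient.
        adv = (returns - returns.mean()) / (returns.std() + 1e-8)
        phi = phi + es_lr / (population * es_sigma) * (noise.T @ adv)
    return phi


if __name__ == "__main__":
    print("evolved loss parameters:", evolve_loss())
```

Only the outer loop ever sees the true task return; the inner loop sees nothing but the gradient of the evolved loss, which is the division of labor the abstract describes.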

algorithm, evolved policy gradient, name change, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.62)


Reviews: Evolved Policy Gradients

Neural Information Processing Systems | Oct-7-2024, 14:26:44 GMT

The authors present an approach for learning loss functions for reinforcement learning via a combination of evolutionary strategies as an outer loop and a simple policy gradient algorithm in the inner loop. Overall I found this to be a very interesting paper. My one criticism is that I would have liked to see a bit more of a study of which parts of the algorithm and the loss architecture are important. The algorithm itself is relatively simple. Although I appreciate the detail of Algorithm 1, to some degree I feel that this obscures the algorithm. In essence, this approach corresponds to "use policy gradient in the inner loop, and ES in the outer loop". More interesting is the structure of the loss architecture.

algorithm, architecture, evolved policy gradient, (3 more...)

Neural Information Processing Systems

Genre: Summary/Review (0.39)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


OpenAI Brings Introspection To Reinforcement Learning Agents - AI Summary

#artificialintelligence | Apr-7-2022, 01:46:35 GMT

Recently, researchers from OpenAI published a paper that proposes a method to address this challenge: creating RL models that know what it means to make progress on a new task because they have experienced making progress on similar tasks in the past. Titled Evolved Policy Gradients (EPG), the OpenAI research paper introduces a new meta-learning technique based on the concept of a loss function that qualifies the learning progress. When used in RL models, the EPG method does not encode knowledge explicitly through memorized behaviors; instead, it encodes it implicitly through a learned loss function. The end goal of EPG is RL agents that can use this loss function to learn a novel task. In initial tests, EPG seems to improve on standard RL algorithms by allowing the loss function to adapt to the environment and the agent's history, leading to faster learning and the potential for learning without external rewards.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.93)


Evolved Policy Gradients

Houthooft, Rein; Chen, Yuhua; Isola, Phillip; Stadie, Bradly; Wolski, Filip; Ho, Jonathan; Abbeel, Pieter

Neural Information Processing Systems | Feb-14-2020, 16:44:51 GMT

agent, algorithm, evolved policy gradient

Neural Information Processing Systems

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)


Evolved Policy Gradients

@machinelearnbot | Apr-20-2018, 12:30:22 GMT

We're releasing an experimental metalearning approach called Evolved Policy Gradients, a method that evolves the loss function of learning agents, which can enable fast training on novel tasks. Agents trained with EPG can succeed at basic tasks at test time that were outside their training regime, like learning to navigate to an object on a different side of the room from where it was placed during training. EPG trains agents to have a prior notion of what constitutes making progress on a novel task. Rather than encoding prior knowledge through a learned policy network, EPG encodes it as a learned loss function[1]. Agents are then able to use this loss function, defined as a temporal-convolutional neural network, to learn quickly on a novel task. We've shown that EPG can generalize to out-of-distribution test-time tasks, exhibiting behavior qualitatively different from other popular metalearning algorithms.
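To make "a loss function defined as a temporal-convolutional neural network" concrete, here is a minimal NumPy sketch of a one-layer temporal convolution that maps a window of the agent's experience to a scalar loss. The layout of the experience array (one row per timestep, one column per feature such as observation, action, reward, or policy output), the kernel shapes, the `tanh` nonlinearity, and the parameter names `kernel`, `bias`, and `readout` are all hypothetical choices for illustration, not the architecture used in EPG.

```python
import numpy as np


def temporal_conv(x, kernel, bias):
    """Valid 1-D convolution over the time axis.

    x:      (T, C) experience, one row per timestep
    kernel: (K, C, H) weights spanning K consecutive timesteps
    bias:   (H,)
    returns (T - K + 1, H)
    """
    T, K = x.shape[0], kernel.shape[0]
    # Stack every length-K window of consecutive timesteps: (T-K+1, K, C).
    windows = np.stack([x[t:t + K] for t in range(T - K + 1)])
    return np.einsum("tkc,kch->th", windows, kernel) + bias


def learned_loss(experience, params):
    """Scalar loss computed from a trajectory of experience."""
    hidden = np.tanh(temporal_conv(experience, params["kernel"], params["bias"]))
    # Linear readout per window, averaged over time to give one number.
    return float(np.mean(hidden @ params["readout"]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    experience = rng.standard_normal((10, 4))  # 10 timesteps, 4 features (assumed)
    params = {
        "kernel": 0.1 * rng.standard_normal((3, 4, 8)),
        "bias": np.zeros(8),
        "readout": 0.1 * rng.standard_normal(8),
    }
    print("loss:", learned_loss(experience, params))
```

Because each output depends on several consecutive timesteps, a loss of this shape can react to how the agent's experience unfolds over time rather than to single transitions in isolation.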

artificial intelligence, loss function, machine learning, (15 more...)

@machinelearnbot

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)
