Meta-Gradient Reinforcement Learning with an Objective Discovered Online

May-27-2025, 09:07:47 GMT–Neural Information Processing Systems

Deep reinforcement learning includes a broad family of algorithms that parameterise an internal representation, such as a value function or policy, by a deep neural network. Each algorithm optimises its parameters with respect to an objective, such as Q-learning or policy gradient, that defines its semantics. In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience with its environment. Over time, this allows the agent to learn how to learn increasingly effectively. Furthermore, because the objective is discovered online, it can adapt to changes over time.

artificial intelligence, machine learning, meta-gradient reinforcement learning, (3 more...)

Neural Information Processing Systems

May-27-2025, 09:07:47 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Neural Networks (1.00)