Optimistic Active Exploration of Dynamical Systems

Jan-19-2025, 09:01:40 GMT–Neural Information Processing Systems

Reinforcement learning algorithms commonly seek to optimize policies for solving one particular task. How should we explore an unknown dynamical system such that the estimated model allows us to solve multiple downstream tasks in a zero-shot manner? In this paper, we address this challenge by developing OPAX, an algorithm for active exploration. OPAX uses well-calibrated probabilistic models to quantify the epistemic uncertainty about the unknown dynamics. We show how the resulting optimization problem can be reduced to an optimal control problem that can be solved at each episode using standard approaches.
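To make the idea of epistemic uncertainty about unknown dynamics concrete, here is a minimal sketch. It is not OPAX: it assumes a toy one-dimensional linear system, uses a bootstrap ensemble as a stand-in for a calibrated probabilistic model, and picks actions greedily one step ahead instead of solving an optimal control problem. All function names and constants are hypothetical.

```python
import random
import statistics


def fit_ridge(data, lam=1e-3):
    """Fit x_next ~ a*x + b*u by ridge regression (closed-form 2x2 solve)."""
    sxx = sum(x * x for x, u, y in data) + lam
    suu = sum(u * u for x, u, y in data) + lam
    sxu = sum(x * u for x, u, y in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu  # strictly positive because lam > 0
    a = (sxy * suu - sxu * suy) / det
    b = (sxx * suy - sxu * sxy) / det
    return a, b


def fit_ensemble(data, n_models=20, seed=0):
    """Fit one model per bootstrap resample of the transition data."""
    rng = random.Random(seed)
    return [fit_ridge([rng.choice(data) for _ in data]) for _ in range(n_models)]


def epistemic_std(ensemble, x, u):
    """Disagreement of the ensemble's next-state predictions at (x, u)."""
    return statistics.pstdev([a * x + b * u for a, b in ensemble])


def most_informative_action(ensemble, x, candidate_actions):
    """Greedy choice: the action whose outcome the ensemble is least sure about."""
    return max(candidate_actions, key=lambda u: epistemic_std(ensemble, x, u))


def simulate(n=30, seed=1):
    """Assumed toy system x' = 0.9*x + 0.5*u + noise, driven by small actions only."""
    rng = random.Random(seed)
    data, x = [], 0.5
    for _ in range(n):
        u = rng.uniform(-0.1, 0.1)
        x_next = 0.9 * x + 0.5 * u + rng.gauss(0.0, 0.01)
        data.append((x, u, x_next))
        x = x_next
    return data


data = simulate()
ensemble = fit_ensemble(data)
best_action = most_informative_action(ensemble, 0.0, [-0.5, 0.0, 2.0])
```

Because the data were collected with small actions only, the ensemble members disagree most about the effect of large actions, so the greedy rule is drawn toward the poorly explored part of the action space.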

machine learning, optimistic active exploration, reinforcement learning

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.63)