Adaptive Exploration for Data-Efficient General Value Function Evaluations

Mar-21-2026, 05:39:20 GMT–Neural Information Processing Systems

General Value Functions (GVFs) (Sutton et al., 2011) represent predictive knowledge in reinforcement learning. Each GVF computes the expected return for a given policy, based on a unique reward. Existing methods that rely on fixed behavior policies or pre-collected data often suffer from poor data efficiency when learning multiple GVFs in parallel with off-policy methods. To address this, we introduce a method that adaptively learns a single behavior policy to collect data efficiently for evaluating multiple GVFs in parallel.

artificial intelligence, machine learning, reinforcement learning

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.41)