Mapping State Space using Landmarks for Universal Goal Reaching

Oct-9-2024, 20:59:38 GMT–Neural Information Processing Systems

An agent that has well understood the environment should be able to apply its skills for any given goals, leading to the fundamental problem of learning the Universal Value Function Approximator (UVFA). A UVFA learns to predict the cumulative rewards between all state-goal pairs. However, empirically, the value function for long-range goals is always hard to estimate and may consequently result in failed policy. This has presented challenges to the learning process and the capability of neural networks. We propose a method to address this issue in large MDPs with sparse rewards, in which exploration and routing across remote states are both extremely challenging.

landmark, mapping state space, universal goal, (3 more...)

Neural Information Processing Systems

Oct-9-2024, 20:59:38 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)