What Fundamental Structure in Reward Functions Enables Efficient Sparse-Reward Learning?
