Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation

Neural Information Processing Systems 

In the exploration phase, the agent interacts with the environment and collects samples without observing any reward.
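As a rough illustration of what reward-free data collection can look like, the sketch below runs a uniform random policy in a small chain environment and records only transitions. Everything in it is an assumption made for illustration: the `ToyChain` environment, the `explore` helper, the slip probability, and the random policy are hypothetical and are not the exploration algorithm of the paper.

```python
import random


class ToyChain:
    """Hypothetical 5-state chain environment (illustrative, not from the paper)."""

    def __init__(self, n_states=5, horizon=4, seed=0):
        self.n_states = n_states
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, action 0 moves left; with probability 0.1
        # (an assumed value) the move is reversed.
        move = 1 if action == 1 else -1
        if self.rng.random() < 0.1:
            move = -move
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        # Only the next state is returned: no reward is exposed to the agent.
        return self.state


def explore(env, n_episodes, seed=0):
    """Collect (step, state, action, next_state) tuples with a uniform random policy."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_episodes):
        s = env.reset()
        for h in range(env.horizon):
            a = rng.randrange(2)  # assumed policy: pick an action uniformly at random
            s_next = env.step(a)
            dataset.append((h, s, a, s_next))
            s = s_next
    return dataset


dataset = explore(ToyChain(), n_episodes=3)
```

The collected tuples carry no reward field, so any reward function supplied later would have to be combined with a model estimated from these transitions alone.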
