On Reward-Free Reinforcement Learning with Linear Function Approximation

Neural Information Processing Systems 

During the exploration phase, an agent collects samples without using a pre-specified reward function.