Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes