Robust and Adaptive Temporal-Difference Learning Using An Ensemble of Gaussian Processes