Unifying Ensemble Methods for Q-learning via Social Choice Theory