Minimax-Bayes Reinforcement Learning