Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes