The Sample Complexity of Online Reinforcement Learning: A Multi-model Perspective