On the Model-Misspecification in Reinforcement Learning