A Sharp Analysis of Model-based Reinforcement Learning with Self-Play