Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time