Logarithmic Switching Cost in Reinforcement Learning beyond Linear MDPs