The complexity of non-stationary reinforcement learning