Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs
