Refined Regret for Adversarial MDPs with Linear Function Approximation
