Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization
