An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap
