Convergence of Policy Mirror Descent Beyond Compatible Function Approximation
