Some remarks on gradient dominance and LQR policy optimization
