Provably Convergent Off-Policy Actor-Critic with Function Approximation
