Neural Information Processing Systems 

In the classical decision-making literature, this is achieved by two interweaving processes, policy evaluation and policy improvement (Sutton and Barto, 2018).
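To make the interplay of the two processes concrete, the following is a minimal sketch of tabular policy iteration in the textbook sense, not the method developed in this paper. The two-state, two-action MDP `P`, the discount factor `gamma = 0.9`, and the function names `policy_evaluation`, `policy_improvement`, and `policy_iteration` are all assumptions introduced for illustration only.

```python
# Hypothetical toy MDP: P[s][a] is a list of (probability, next_state, reward).
P = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 1.0)]],  # state 0: stay (r=0) or move to state 1 (r=1)
    [[(1.0, 0, 0.0)], [(1.0, 1, 2.0)]],  # state 1: move to state 0 (r=0) or stay (r=2)
]


def q_value(P, V, s, a, gamma):
    """Expected one-step return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def policy_evaluation(P, policy, gamma, tol=1e-10):
    """Iteratively estimate the value function of a fixed deterministic policy."""
    V = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            v_new = q_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new  # in-place (Gauss-Seidel style) update
        if delta < tol:
            return V


def policy_improvement(P, V, gamma):
    """Return the policy that is greedy with respect to V."""
    policy = []
    for s in range(len(P)):
        q = [q_value(P, V, s, a, gamma) for a in range(len(P[s]))]
        policy.append(max(range(len(q)), key=q.__getitem__))
    return policy


def policy_iteration(P, gamma=0.9):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * len(P)
    while True:
        V = policy_evaluation(P, policy, gamma)
        new_policy = policy_improvement(P, V, gamma)
        if new_policy == policy:
            return policy, V
        policy = new_policy


policy, V = policy_iteration(P)
```

On this toy problem the loop stops after two improvement steps with `policy == [1, 1]` and values of approximately 19 and 20 for states 0 and 1: remaining in state 1 yields 2 / (1 - 0.9) = 20, and moving there from state 0 yields 1 + 0.9 * 20 = 19.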