Off-Policy Average Reward Actor-Critic with Deterministic Policy Search
