Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs
