Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods
