On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
