Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Neural Information Processing Systems
In reinforcement learning problems, the aim is to select a controller that will maximize the average reward in some environment.
Apr-6-2023, 16:42:39 GMT