The True Impact of Baselines in Policy Gradient Methods – Marlos C. Machado
I have been working with policy gradient (PG) methods for quite some time now, and I thought I should continue sharing our findings here. In June, we put out a paper on how PG methods can be seen from an operator perspective. I even wrote a blog post on this. Today I'm going to talk about a paper we put out last month about the role of baselines in PG methods. As in my previous post, let me first talk about the standard view of PG methods.
Oct-22-2020, 19:08:00 GMT