A Function Approximation Approach to Estimation of Policy Gradient for POMDP with Structured Policies
