What About Taking Policy as Input of Value Function: Policy-extended Value Function Approximator
