Optimizing over a Restricted Policy Class in Markov Decision Processes
