On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes
