A Sliding-Window Algorithm for Markov Decision Processes with Arbitrarily Changing Rewards and Transitions

Open in new window