bb57db42f77807a9c5823bd8c2d9aaef-Paper.pdf

Neural Information Processing Systems 

We study the reinforcement learning problem for discounted Markov Decision Processes(MDPs)underthetabularsetting.