Faster saddle-point optimization for solving large-scale Markov decision processes
Bas-Serrano, Joan, Neu, Gergely
We consider the problem of computing optimal policies in average-r eward Markov decision processes. This classical problem can be formulated as a linear program dire ctly amenable to saddle-point optimization methods, albeit with a number of variables that is linear in the n umber of states. T o address this issue, recent work has considered a linearly relaxed version of the res ulting saddle-point problem. Our work aims at achieving a better understanding of this relaxed optimization pro blem by characterizing the conditions necessary for convergence to the optimal policy, and designing a n optimization algorithm enjoying fast convergence rates that are independent of the size of the state s pace.
Sep-22-2019
- Country:
- North America > United States
- Europe > Spain
- Catalonia > Barcelona Province > Barcelona (0.04)
- Asia > Middle East
- Jordan (0.04)
- Genre:
- Research Report (0.50)