A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees
Farias, Daniela D., Roy, Benjamin V.
–Neural Information Processing Systems
We introduce a new algorithm based on linear programming that approximates the differential value function of an average-cost Markov decision process via a linear combination of pre-selected basis functions. The algorithm carries out a form of cost shaping and minimizes a version of Bellman error. We establish an error bound that scales gracefully with the number of states without imposing the (strong) Lyapunov condition required by its counterpart in [6]. We propose a path-following method that automates selection of important algorithm parameters which represent counterparts to the "state-relevance weights" studied in [6].
Neural Information Processing Systems
Dec-31-2005
- Country:
- North America > United States
- Massachusetts (0.04)
- California > Santa Clara County
- Palo Alto (0.04)
- North America > United States
- Technology: