Robust Value Function Approximation Using Bilinear Programming

Petrik, Marek, Zilberstein, Shlomo

Neural Information Processing Systems 

Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose approximate bilinear programming, a new formulation of value function approximation thatprovides strong a priori guarantees. In particular, this approach provably finds an approximate value function that minimizes the Bellman residual. Solving a bilinear program optimally is NPhard, but this is unavoidable because the Bellman-residual minimization itself is NPhard. We therefore employ and analyze a common approximate algorithm for bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration.Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on a simple benchmark problem.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found