Robust Value Function Approximation Using Bilinear Programming
–Neural Information Processing Systems
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose approximate bilinear programming, a new formulation of value function approximation that provides strong a priori guarantees. In particular, it provably finds an approximate value function that minimizes the Bellman residual. Solving a bilinear program optimally is NP hard, but this is unavoidable because the Bellman-residual minimization itself is NP hard. We, therefore, employ and analyze a common approximate algorithm for bilinear programs.
Neural Information Processing Systems
Apr-6-2023, 14:07:15 GMT
- Technology: