Reviews: rho-POMDPs have Lipschitz-Continuous epsilon-Optimal Value Functions
The paper addresses the problem of non-convex reward functions in rho-POMDPs, proving that under certain conditions these rewards, and the resulting value functions, are Lipschitz-continuous (LC) for finite horizons. The paper also proposes and uses a more general vector form of LC. This result allows approximations of the optimal value function V* to be used, along with upper and lower bounds (U and L) on the value as in HSVI, and opens the way for a wide array of new algorithms. It is analogous to the PWLC result for standard POMDPs, but LC is more general; it allows for similar contraction operators with Banach's fixed-point theorem, as in (PO)MDPs, and for finite-horizon approximations of the infinite-horizon objective criteria. Once the paper establishes the main result, it discusses approximations of U and L using min or max, respectively, over sets of cones.
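To make the cone-based bounds concrete, here is a minimal sketch of how such approximations could look. It is not the paper's algorithm: it assumes a single scalar Lipschitz constant `lam` with respect to the L1 norm (rather than the vector form of LC), and the function names, the toy target `f`, and the sampled beliefs are all hypothetical. Each stored pair (belief, value) defines a cone; the upper bound takes the min over upward-opening cones and the lower bound takes the max over downward-opening cones.

```python
import numpy as np

def cone_upper_bound(b, points, values, lam):
    """Upper bound U(b) = min_i [ u_i + lam * ||b - b_i||_1 ] (min over upward cones)."""
    dists = np.abs(points - b).sum(axis=1)  # L1 distance from b to each stored belief
    return float(np.min(values + lam * dists))

def cone_lower_bound(b, points, values, lam):
    """Lower bound L(b) = max_i [ l_i - lam * ||b - b_i||_1 ] (max over downward cones)."""
    dists = np.abs(points - b).sum(axis=1)
    return float(np.max(values - lam * dists))

# Toy stand-in for a non-convex, Lipschitz-continuous function on a 2-state belief simplex.
# |sin(3x) - sin(3y)| <= 3|x - y| <= 3 * ||b - b'||_1, so lam = 3 is a valid (loose) constant.
def f(b):
    return float(np.sin(3.0 * b[0]))

lam = 3.0
points = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]])  # hypothetical sampled beliefs
values = np.array([f(p) for p in points])                # exact values at those beliefs

b = np.array([0.3, 0.7])
print(cone_lower_bound(b, points, values, lam), f(b), cone_upper_bound(b, points, values, lam))
```

Under these assumptions the two bounds sandwich `f` everywhere on the simplex and are tight at the sampled beliefs, which is the property an HSVI-style algorithm relies on when it refines U and L by adding new points.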