Reviews: Iterative Value-Aware Model Learning
Neural Information Processing Systems
The paper proposes a modification of Value-Aware Model Learning (VAML), a reinforcement learning (RL) framework, that makes the associated optimization problem more tractable. VAML is a model-based approach that takes the value function into account while learning the model. In its original formulation, VAML poses model learning as a min-max optimization: one seeks the model that minimizes the worst-case error over the space of representable value functions. This paper proposes to replace that problem with a sequence of optimizations whose objective functions use the actual value-function approximations that arise during value iteration (that is, the "max" is replaced by a sequence of concrete approximations). The paper presents a theoretical analysis of the proposed method in three steps: finite-sample guarantees for the model-learning step, a general error-propagation analysis, and a result that combines the two.
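The idea of replacing the "max" with the current value estimate can be illustrated with a minimal tabular sketch. This is not the paper's algorithm or code. It assumes a small finite MDP, a finite set of candidate transition models standing in for the model class, and access to the true transition kernel in the loss (the paper works with samples instead). The names `value_aware_loss`, `iter_vaml`, and `candidates` are hypothetical.

```python
import numpy as np


def value_aware_loss(P_true, P_model, V):
    """Squared gap between expected next-state values under two kernels.

    P_true, P_model: arrays of shape (n_states, n_actions, n_states).
    V: array of shape (n_states,), the current value estimate.
    Assumption: the true kernel is available; a sample-based version
    would replace P_true @ V with V evaluated at observed next states.
    """
    return float(np.sum(((P_true - P_model) @ V) ** 2))


def iter_vaml(P_true, candidates, R, gamma=0.9, n_iters=50):
    """Alternate value-aware model selection with one Bellman backup.

    candidates: list of kernels (a hypothetical, finite model class).
    R: reward array of shape (n_states, n_actions).
    """
    Q = np.zeros_like(R, dtype=float)
    for _ in range(n_iters):
        V = Q.max(axis=1)  # greedy value of the current estimate
        # Pick the model that best predicts expected values of this V only,
        # rather than the worst case over all representable value functions.
        losses = [value_aware_loss(P_true, P, V) for P in candidates]
        P_hat = candidates[int(np.argmin(losses))]
        # Value-iteration step using the selected model.
        Q = R + gamma * (P_hat @ V)
    return Q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_states, n_actions = 4, 2
    P_true = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    P_other = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.random((n_states, n_actions))
    print(iter_vaml(P_true, [P_other, P_true], R))
```

In this sketch, any candidate whose loss is zero for the current value estimate yields exactly the same backup as the true kernel, even if it differs from the true kernel elsewhere, which is the sense in which the learned model only needs to be accurate for the value functions actually encountered.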
Oct-7-2024, 14:40:40 GMT