Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning
Neural Information Processing Systems
In this work, we consider the scenario where agent limitations may entirely preclude identifying an exactly value-equivalent model, immediately giving rise to a trade-off between identifying a model that is simple enough to learn while only incurring bounded sub-optimality.
Apr-25-2026, 14:06:13 GMT