Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off

Jan-19-2025, 21:35:47 GMT – Neural Information Processing Systems

A default assumption in reinforcement learning (RL) and optimal control is that observations arrive at discrete time points on a fixed clock cycle. Yet many applications involve continuous-time systems in which the time discretization can, in principle, be managed. The impact of time discretization on RL methods has not been fully characterized in existing theory, but a more detailed analysis of its effect could reveal opportunities for improving data efficiency. We address this gap by analyzing Monte Carlo policy evaluation for linear quadratic regulator (LQR) systems and uncover a fundamental trade-off between approximation error and statistical error in value estimation. Importantly, these two errors respond differently to time discretization, which leads to an optimal choice of temporal resolution for a given data budget.
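The trade-off can be illustrated with a small simulation. What follows is a minimal sketch under assumptions that are not taken from the paper: a scalar linear system dX = aX dt + σ dW with quadratic cost X², a finite horizon T, a left Riemann sum as the discretized return, and a data budget counted in sampled transitions. The function names (`true_value`, `mc_estimate`, `mse`) and all parameter values are hypothetical choices made for illustration only. With a fixed budget, a coarse grid yields many trajectories but a biased cost integral, while a fine grid yields an accurate integral but few trajectories.

```python
import math
import random


def true_value(a, sigma, x0, T):
    """Closed-form E[integral_0^T X_t^2 dt] for dX = a X dt + sigma dW, a != 0."""
    g = (math.exp(2 * a * T) - 1) / (2 * a)  # integral of e^{2at} over [0, T]
    return x0 ** 2 * g + sigma ** 2 / (2 * a) * (g - T)


def mc_estimate(n_steps, budget, a, sigma, x0, T, rng):
    """Monte Carlo value estimate using n_steps per trajectory.

    The budget is the total number of sampled transitions, so a finer
    grid (larger n_steps) leaves fewer trajectories to average over.
    """
    h = T / n_steps
    n_traj = max(1, budget // n_steps)
    # Exact transition of the linear SDE over one step of length h.
    decay = math.exp(a * h)
    noise_std = sigma * math.sqrt((math.exp(2 * a * h) - 1) / (2 * a))
    total = 0.0
    for _ in range(n_traj):
        x, cost = x0, 0.0
        for _ in range(n_steps):
            cost += h * x * x  # left Riemann sum of the running cost
            x = decay * x + noise_std * rng.gauss(0.0, 1.0)
        total += cost
    return total / n_traj


def mse(n_steps, budget, reps=200, a=-1.0, sigma=1.0, x0=0.0, T=2.0, seed=0):
    """Empirical mean-squared error of the estimator over `reps` repetitions."""
    rng = random.Random(seed)
    v = true_value(a, sigma, x0, T)
    errs = (mc_estimate(n_steps, budget, a, sigma, x0, T, rng) - v for _ in range(reps))
    return sum(e * e for e in errs) / reps


if __name__ == "__main__":
    budget = 1000
    for n_steps in (2, 20, 200):
        print(f"h = {2.0 / n_steps:.2f}  MSE = {mse(n_steps, budget):.4f}")
```

In this toy setting the coarsest grid is dominated by approximation error and the finest by statistical error, so an intermediate step size gives the lowest mean-squared error. The sketch only illustrates the qualitative effect; it does not reproduce the paper's analysis or its rates.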

continuous value estimation, fundamental trade-off, time discretization, (3 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.44)