On The Statistical Complexity of Offline Decision-Making
Thanh Nguyen-Tang, Raman Arora
We study the statistical complexity of offline decision-making with function approximation, establishing (near) minimax-optimal rates for stochastic contextual bandits and Markov decision processes. The performance limits are captured by the pseudo-dimension of the (value) function class and a new characterization of the behavior policy that strictly subsumes all the previous notions of

Nevertheless, learning good policies from offline data presents a unique challenge not present in online decision-making: distributional shift. In essence, the policy that interacts with the environment and collects data differs from the target policy we aim to learn. This challenge becomes more pronounced in real-world problems with large state spaces, where it necessitates function approximation to generalize from observed states to unseen ones.
Jan-10-2025