On The Statistical Complexity of Offline Decision-Making

Nguyen-Tang, Thanh; Arora, Raman

arXiv.org Machine Learning 

We study the statistical complexity of offline decision-making with function approximation, establishing (near) minimax-optimal rates for stochastic contextual bandits and Markov decision processes. The performance limits are captured by the pseudo-dimension of the (value) function class and a new characterization of the behavior policy that strictly subsumes all the previous notions of ...

Nevertheless, learning good policies from offline data presents a unique challenge not present in online decision-making: distributional shift. In essence, the policy that interacts with the environment and collects data differs from the target policy we aim to learn. This challenge becomes more pronounced in real-world problems with large state spaces, where it necessitates function approximation to generalize from observed states to unseen ones.
