Neural Information Processing Systems
Tabular RL: There is a long line of research on the sample complexity and regret of RL in tabular settings. In model-based settings, researchers have tackled continuous spaces via kernel methods, based either on a fixed discretization of the space [21] or, more recently, without resorting to discretization [11]. While the latter does learn a data-driven representation of the space via kernels, it requires solving a complex optimization problem at each step, and hence is efficient mainly for finite action sets (more discussion on this is in Section 4). These were tested heuristically with various splitting rules (e.g.

We use this result by chaining the Wasserstein distances of various measures together. Unfortunately, the scaling does not hold for the case when $d_S \geq 2$. In this situation we use the fact that $T$

The result from [46] has corresponding lower bounds, showing that in the worst case, scaling with respect to $d_S$ is inevitable.
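The "chaining" of Wasserstein distances referred to above is presumably repeated application of the triangle inequality, which the Wasserstein metric satisfies. A sketch of the standard argument (the intermediate measures $\mu_1, \dots, \mu_n$ are illustrative placeholders, not notation from this paper):

```latex
% Wasserstein distance W_p is a metric on probability measures with
% finite p-th moments, so it obeys the triangle inequality:
%   W_p(\mu, \nu) \le W_p(\mu, \kappa) + W_p(\kappa, \nu).
% Applying this repeatedly along a chain of measures gives
\begin{equation*}
  W_p(\mu_1, \mu_n) \;\le\; \sum_{i=1}^{n-1} W_p(\mu_i, \mu_{i+1}),
\end{equation*}
% so a bound on each consecutive pair yields a bound on the endpoints.
```

This lets one bound the distance between two measures of interest by inserting intermediate measures whose pairwise distances are easier to control.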