9b8b50fb590c590ffbf1295ce92258dc-AuthorFeedback.pdf
–Neural Information Processing Systems
For example, when solving RL problems such as Atari games, we may test different representation methods. For the average reward setting, it is still an open question whether S-bounds are achievable. Our approach can be adapted to the episodic case when the regret bounds would benefit from the improved bounds available in this setting. The A-dependence is optimal as for UCRL2, while the optimal dependence on S is still an open question (also for the MDP case). The optimal dependence on |Φ| in our setting is also open.
Feb-13-2026, 03:37:00 GMT