emphasize the technical novelty of our upper bound and lower bound, as Reviewer #1, Reviewer #3 and Reviewer #4 commented on the technical novelty of our theoretical results
–Neural Information Processing Systems
We thank all the reviewers for their valuable feedback and for appreciating our contributions. Technical novelty of the upper bound. In the exploration phase, Jin et al. [2020] set reward to be [...] To our knowledge, this idea is new in the literature. For example, for the hard instance in [Du et al. 2020], only a single state-action pair has non-zero reward. Moreover, we focus on the reward-free setting while Du et al. [2020] focused on the standard RL setting. Below we address specific concerns from each reviewer.
Aug-16-2025, 13:18:20 GMT