Response to Reviewers

Neural Information Processing Systems 

We thank all the reviewers for their time and valuable feedback.

The purpose of Thm 3.2 is to clarify how the kernel Bellman loss is related to the Bellman error. We tend to think of it as characterizing "the kernel whose kernel norm equals our kernel Bellman loss". Meanwhile, we believe that we can develop results similar to our Corollary 3.3 to explicitly clarify this concrete relation, and we will discuss this extensively in the revision.

Regarding the comment that "the properties of the empirical loss are not shown": we agree with the reviewer's comments on the issue of biasedness. However, the analysis for the non-IID case is quite technical and could distract from this work's main focus. We therefore prefer to study it in a separate work that focuses on statistical guarantees and uncertainty estimation.

They play orthogonal roles, so it is not easy to say which is more important.

"Kernel-Based Reinforcement Learning" is not the same as the more general kernel methods used in our paper and other related work. It is mentioned in the paper as a related work, and we will make the distinction explicit.

Thm 3.2 is meant to clarify how the kernel Bellman loss is related to the Bellman error. We will consider reformulating Theorem 3.2 into a "dual kernel" statement.

Fig 2(d) is similar, but plots the (Bellman-error, K-loss) and (Bellman-error, L2-loss) pairs.
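To make the comparison in Fig 2(d) concrete, the two quantities plotted against the Bellman error can be sketched as follows. This is a minimal numerical sketch, not the paper's implementation: the RBF kernel, its bandwidth, and the function names are assumptions; the K-loss is taken to be the quadratic form of the Bellman residuals under the kernel matrix, and the L2-loss the plain mean squared Bellman residual.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    # Gaussian RBF kernel K(x, y); the bandwidth choice is hypothetical.
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * bandwidth ** 2))

def bellman_losses(residuals, states, bandwidth=1.0):
    """Sketch of the two losses compared in Fig 2(d).

    residuals: Bellman residuals delta(x_i) at sampled states x_i
    states:    the sampled states x_i, one per row
    """
    n = len(residuals)
    # Kernel matrix K_ij = K(x_i, x_j)
    K = np.array([[rbf_kernel(states[i], states[j], bandwidth)
                   for j in range(n)] for i in range(n)])
    # K-loss: (1/n^2) * sum_ij delta_i K_ij delta_j  (assumed form)
    k_loss = float(residuals @ K @ residuals) / n**2
    # L2-loss: plain mean squared Bellman residual
    l2_loss = float(np.mean(residuals ** 2))
    return k_loss, l2_loss
```

Because the kernel matrix of a positive-definite kernel is positive semidefinite, the K-loss is always nonnegative, while residuals of opposite sign at nearby states can make it much smaller than the L2-loss.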
