Reviews: Privacy-Preserving Q-Learning with Functional Noise in Continuous Spaces

Neural Information Processing Systems 

The definition of two neighboring reward functions is provided in Theorem 5. The authors do not clearly explain the motivation for guaranteeing privacy of the reward function. It would help if the authors explained why privacy of the reward function is necessary in some real application settings. What is the reason for adding noise as in lines 19-20 of Algorithm 1? The definitions of g_k[B][2] in line 4 and g_a[:][1] in line 15 are not given.