Neural Information Processing Systems

Experiment Details: Figure 3 shows the general architecture of FQF. By 'inefficient hyperparameter' we mean the atom locations (uniformly distributed between -10 and 10) in C51, quantiles [...] FQF requires only the number of quantiles. We leave the theoretical analysis and comparison between IQN and FQF to future work. We will check whether we can figure out why and explain it in detail. As shown in our toy case, FQF does achieve a better distribution approximation.
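To illustrate what "atom locations" means for C51, here is a minimal sketch, assuming the commonly used setting of 51 atoms on a uniform grid over [-10, 10]. The function name `c51_support` and its parameter names are illustrative and are not taken from the authors' code.

```python
import numpy as np

def c51_support(v_min=-10.0, v_max=10.0, num_atoms=51):
    """Return a fixed, uniformly spaced support for a categorical return distribution.

    v_min, v_max, and num_atoms are all hand-chosen hyperparameters
    (assumed defaults: -10, 10, 51).
    """
    return np.linspace(v_min, v_max, num_atoms)

atoms = c51_support()
delta_z = atoms[1] - atoms[0]  # spacing between neighbouring atoms

# With a probability vector over the atoms, the Q-value is the expectation.
probs = np.full(atoms.shape, 1.0 / atoms.size)  # placeholder: uniform probabilities
q_value = float(np.sum(probs * atoms))
```

The point of the sketch is only that the support is fixed in advance by three numbers; the network learns probabilities on that grid, not the grid itself.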


Reviews: Fully Parameterized Quantile Function for Distributional Reinforcement Learning

POST-REBUTTAL: I thank the authors for their detailed response. My main concern was the level of experimental detail provided in the submission, and I'm pleased that the authors have committed to including more of the details implicitly contained within the code in the paper itself. My overall recommendation remains the same; I think the paper should be published, and the strong Atari results will be of interest fairly widely. However, there were a few parts of the response I wasn't convinced by: (1) "(D) Inefficient Hyperparameter": I don't agree with the authors' claim that, e.g., QR-DQN requires more hyperparameters than FQF (it seems to me that both algorithmically require the number of quantiles, and the standard hyperparameters associated with network architecture and training beyond that).
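The comparison in point (1) can be made concrete with a small sketch. It assumes that QR-DQN places its quantile fractions at fixed midpoints determined by the number of quantiles N, and that an FQF-style method derives fractions from network outputs via a softmax followed by a cumulative sum. The function names and the `logits` input are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

def qr_dqn_fractions(num_quantiles):
    """Fixed midpoint fractions tau_hat_i = (2i + 1) / (2N) for i = 0..N-1."""
    i = np.arange(num_quantiles)
    return (2 * i + 1) / (2 * num_quantiles)

def proposed_fractions(logits):
    """Monotone fractions 0 = tau_0 < ... < tau_N = 1 from N unnormalized scores.

    `logits` stands in for the output of a learned proposal network (assumption).
    """
    weights = np.exp(logits - np.max(logits))  # numerically stable softmax
    weights /= weights.sum()
    return np.concatenate([[0.0], np.cumsum(weights)])

fixed = qr_dqn_fractions(4)                 # depends only on N
learned = proposed_fractions(np.zeros(4))   # also sized by N; values depend on the scores
```

In both functions the only distribution-specific setting supplied by the user is the number of quantiles, which is the reviewer's observation; the difference is whether the fraction values are fixed or produced by the model.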