Review for NeurIPS paper: Improving Neural Network Training in Low Dimensional Random Bases

Neural Information Processing Systems 

Weaknesses: What I found most worrying about this paper is that the FPD CIFAR-10 results do not seem to be consistent with the FPD paper [23]. In [23], FPD appears to achieve 90% of the original performance with a 20-fold reduction in parameters for the LeNet model (Table 1 in [23]), while Table 1 of this manuscript reports only 60% of the performance with only a 10-fold reduction in parameters. Similarly, [23] mentions that ResNet appears to be more parameter-efficient than the LeNet architecture, which indicates that FPD should generally work much better in this case. This makes me wonder whether there is some underlying issue in the authors' implementation. If so, it is possible that, once the FPD baseline is fixed, the observed improvement of the RBD method would no longer hold.