Review for NeurIPS paper: Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods

Neural Information Processing Systems 

After a discussion with the reviewers, I converged towards recommending to accept this submission. The reviewers raised the following aspects: 1) The perspective is novel, and has interesting potential. Re 1: all reviewers agree that this is a pro for the paper and should be considered its main strength. The authors agree (rebuttal, lines 23-25). Re 2: R3 believes that questioning the approximations is a valid point. However, as the authors argue, they have provided sufficient empirical evidence for mini-batch Gaussianity in appendix B, and Gaussianity is sometimes assumed without further justification in other Bayesian inference applications as well, simply to keep the computations tractable.