Review for NeurIPS paper: A Dynamical Central Limit Theorem for Shallow Neural Networks

Weaknesses:

- Proposition 2.1 is tangential and not new in either content or proof technique; much the same was shown in, e.g., [Mei, Misiakiewicz, Montanari '19] and in other works building on it.
- The proofs of Propositions 3.1 and 3.2, the most meaningful results, are simple calculations using the Mean Value Theorem and Duhamel's principle, respectively.
- Theorem 3.3 is a lot of work for a result that is not particularly interesting: it is asymptotic in both n and t, so it yields no insight into the dynamics, nor into any relationship between n and t.
- Moreover, the result is not truly dimension-free, as the authors claim. Since \psi is positively homogeneous, the dimension enters implicitly through the moments of f, so the dimension in fact appears in the variance bound exactly as one would expect.
- Finally (and examination of the experimental results makes this clearer), it is not meaningful to reason about optimization time, finite or asymptotic, without reference to a discretization scheme. The experiments refer to "epochs", yet no optimization algorithm is specified that relates them to the flows analyzed in the theory.
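To make the dimension-dependence point concrete, here is a sketch under illustrative assumptions (ReLU units and isotropic Gaussian inputs; this is not necessarily the paper's exact setup, only the standard positively homogeneous case):

```latex
% Illustration (assumed setting): \psi(x;\theta) = \max(\langle\theta, x\rangle, 0)
% with data x \sim N(0, I_d). By positive homogeneity in \theta,
% \psi(x;\theta) = \|\theta\|\,\max(\langle\theta/\|\theta\|, x\rangle, 0),
% and \langle u, x\rangle \sim N(0,1) for any unit vector u, so
\[
  \mathbb{E}_x\big[\psi(x;\theta)^2\big]
  = \|\theta\|^2\, \mathbb{E}\big[\max(g,0)^2\big]
  = \tfrac{1}{2}\,\|\theta\|^2,
  \qquad g \sim N(0,1).
\]
% If the coordinates of \theta are O(1), then \|\theta\|^2 = O(d), so the
% second moment of the unit output -- and hence the moments of f appearing
% in the variance bound -- carry a factor of d. The bound is therefore
% dimension-free only in form, not in substance.
```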