Review for NeurIPS paper: Identifying Learning Rules From Neural Network Observables

Neural Information Processing Systems 

Weaknesses: This paper has stuck with me, and I want to emphasize just how interesting I find it. I am very much in favor of it, but the following weaknesses are holding me back from backing its acceptance. Broadly, I need more convincing on three points: (1) that discrimination is not trivially due to differences in learning algorithm performance, (2) how learning algorithm and architecture can ever be dissociated in model organisms, and (3) *why* differences at the level of weights (Fig. 3) would be indicative of different learning algorithms in a way that cannot be deduced from first principles (related to point 2). I am suspicious that the ability to discriminate between learning algorithms is driven by differences in their performance on ImageNet. While it's not obvious to me *how* this would work, it seems plausible that learning algorithm differences e.g. I think that the authors need to control for this.
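To make the kind of control I have in mind concrete, here is a minimal sketch. It is my own illustration, not the authors' pipeline: the function name `loo_accuracy_from_performance`, the nearest-class-mean classifier, and the toy numbers are all hypothetical. The idea is to ask how well the learning rule can be predicted from task performance *alone*. If that baseline is already near ceiling, discrimination from the observables may simply be reading out performance.

```python
import numpy as np

def loo_accuracy_from_performance(perf, labels):
    """Leave-one-out accuracy of a nearest-class-mean classifier that sees
    only one scalar performance value per trained model.

    Hypothetical control, not the paper's method. Assumes every learning
    rule (class) has at least two models, so a class is never empty after
    one model is held out.
    """
    perf = np.asarray(perf, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    correct = 0
    for i in range(len(perf)):
        keep = np.ones(len(perf), dtype=bool)
        keep[i] = False  # hold out model i
        # Mean performance of each learning rule, excluding the held-out model.
        means = np.array([perf[keep & (labels == c)].mean() for c in classes])
        # Predict the rule whose mean performance is closest to model i's.
        pred = classes[np.argmin(np.abs(means - perf[i]))]
        correct += int(pred == labels[i])
    return correct / len(perf)

# Toy, made-up accuracies for two placeholder rules "A" and "B".
labels = ["A", "A", "A", "B", "B", "B"]
confounded = [0.70, 0.71, 0.69, 0.30, 0.31, 0.29]  # rules differ in performance
matched = [0.50, 0.60, 0.55, 0.50, 0.60, 0.55]     # rules matched in performance

acc_confounded = loo_accuracy_from_performance(confounded, labels)
acc_matched = loo_accuracy_from_performance(matched, labels)
```

In the confounded toy case, performance alone identifies the rule perfectly. In the matched case it does not, and any remaining discriminability from the observables could not be attributed to performance. A comparable analysis on the actual models, or a restriction to performance-matched checkpoints, would address my concern.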