Review for NeurIPS paper: Deep reconstruction of strange attractors from time series

Neural Information Processing Systems 

The paper considers the setting in which the observed time series is governed by a dynamical system. However, when the problem is cast into a machine learning setup for general time series analysis, this distinction is sometimes lost. This may be a point to mention in the broader impacts section: in many applications it is not known whether the time series data of interest are governed by a dynamical system.

I have some concerns about this claim:

a. Could the authors provide more evidence that the learning rate should not be considered a hyperparameter ("essentially one governing hyperparameter" in line 316)? After all, in lines 189-190 the learning rate is listed as a parameter that is tuned.