concerns below (due to space constraints, we focus on the main concerns):

Neural Information Processing Systems 

We thank the reviewers for their detailed reviews and constructive feedback.

It is not known how tight any of these bounds are. We will clarify this point in the final version.

Red lines are GD while blue lines are NGD (Hessian-free). Solid lines are training curves while dashed lines are testing curves.