Neural Information Processing Systems

We prove that early learning and memorization are fundamental phenomena in high-dimensional classification tasks, even in simple linear models, and give a theoretical explanation in this setting.


Clarify Technical Contributions (R3 / R4): 2 Gradient Estimation

Neural Information Processing Systems

We thank all reviewers for their detailed, constructive feedback and suggestions. Table B (below) demonstrates this empirically. Gumbel-Softmax has) with significantly less training time and resource consumption. These experiments show that when trained with Gumbel-CRF, the AR decoder outperforms REINFORCE. We will clarify this in the paper.




e038453073d221a4f32d0bab94ca7cee-AuthorFeedback.pdf

Neural Information Processing Systems

R1.1: concerns on compared methods and datasets. The results are shown in Figure i. We will add detailed discussions. We will highlight them in these two tables. In our implementation, we simply use the "negative" loss, i.e., R4.1: compare with robust deep learning.