Appendix Potential Negative Societal Impacts

Apr-25-2026, 19:26:26 GMT–Neural Information Processing Systems

C.3 Other Differences

Beyond the points discussed above, our work differs from Daniely [12] in two further respects. First, they analyze SGD, whereas we analyze a constrained optimization problem and projected SGD. This may be why we obtain a stronger bound on the width. In the experiments in Section 5, we observe that SGD performs badly when the width is small (see the leftmost column of Figure 4(b)). We therefore suspect that an algorithmic change is needed to train networks this narrow, given the training difficulty, and we indeed propose a new method for training narrow nets. Second, they consider datasets with binary labels in {+1, -1}, while our results apply to arbitrary labels. Moreover, their proof appears to depend heavily on the labels being in {+1, -1}, and seems hard to generalize to general labels.
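To make the distinction between SGD and projected SGD concrete, here is a minimal sketch in Python. It is not the method proposed in this paper: the constraint set (an L2 ball of radius `radius` around the initial weights `w0`), the linear model, the squared loss, and all function names are assumptions chosen only for illustration. The single difference from plain SGD is the projection applied after every gradient step.

```python
import math
import random


def project_onto_ball(w, center, radius):
    """Euclidean projection of w onto {v : ||v - center||_2 <= radius}."""
    diff = [wi - ci for wi, ci in zip(w, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm <= radius:
        return list(w)
    scale = radius / norm
    return [ci + scale * d for ci, d in zip(center, diff)]


def projected_sgd(xs, ys, w0, radius, lr=0.05, epochs=50, seed=0):
    """Projected SGD for a linear model with squared loss (illustrative only)."""
    rng = random.Random(seed)
    w = list(w0)
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for i in indices:
            pred = sum(wj * xj for wj, xj in zip(w, xs[i]))
            err = pred - ys[i]
            # Gradient of 0.5 * err**2 with respect to w is err * x.
            w = [wj - lr * err * xj for wj, xj in zip(w, xs[i])]
            # Projection step: the only change relative to plain SGD.
            w = project_onto_ball(w, w0, radius)
    return w


# Toy data with arbitrary real-valued labels (not restricted to {+1, -1}).
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ys = [2.0, -3.0, -1.0]
w0 = [0.0, 0.0]
w = projected_sgd(xs, ys, w0, radius=1.0)
dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(w, w0)))
```

In this toy setup the iterate `w` never leaves the ball around `w0`, so `dist` stays at or below `radius`, whereas plain SGD on the same data would be free to move arbitrarily far from the initialization.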

artificial intelligence, machine learning, training regime


Genre:
- Research Report > New Finding (0.48)

Industry:
- Social Sector (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (1.00)
  - Representation & Reasoning > Optimization (0.68)
