Global Convergence Analysis of Local SGD for Two-layer Neural Network without Overparameterization

Neural Information Processing Systems 

Local SGD, a cornerstone algorithm in federated learning, is widely used for training deep neural networks and has been shown to have strong empirical performance. A theoretical understanding of this performance on nonconvex loss landscapes is currently lacking. Analyzing the global convergence of SGD is challenging because the stochastic gradient noise depends on the model parameters. Indeed, many prior works narrow their focus to GD and rely on injecting artificial noise to enable convergence to a local or global optimum.
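For readers unfamiliar with the algorithm under analysis, the following is a minimal, illustrative sketch of Local SGD on a two-layer ReLU network with synthetic data: each of K workers runs H local SGD steps on its own data shard, after which the server averages the worker parameters. All names, hyperparameters, and the data-generating process here are assumptions for illustration, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative hyperparameters: K workers, H local steps per round.
K, H, rounds, lr, batch = 4, 10, 50, 0.05, 32
d, m, n = 10, 32, 1024                       # input dim, hidden width, samples

# Synthetic regression data, split evenly across workers.
X = rng.normal(size=(n, d))
y = np.tanh(X @ rng.normal(size=d))          # arbitrary smooth target
shards = np.array_split(rng.permutation(n), K)

def init():
    # Two-layer network f(x) = a^T relu(W x).
    return {"W": rng.normal(size=(m, d)) / np.sqrt(d),
            "a": rng.normal(size=m) / np.sqrt(m)}

def grads(p, xb, yb):
    # Gradients of the squared loss 0.5 * mean((f(x) - y)^2).
    h = np.maximum(xb @ p["W"].T, 0.0)       # (b, m) hidden activations
    err = h @ p["a"] - yb                    # (b,) residuals
    ga = h.T @ err / len(yb)
    gW = ((err[:, None] * (h > 0) * p["a"]).T @ xb) / len(yb)
    return {"W": gW, "a": ga}

server = init()
for r in range(rounds):
    # Broadcast: each worker starts the round from the server model.
    workers = [{k: v.copy() for k, v in server.items()} for _ in range(K)]
    for k in range(K):
        for _ in range(H):                   # H local SGD steps on shard k
            idx = rng.choice(shards[k], size=batch)
            g = grads(workers[k], X[idx], y[idx])
            for key in workers[k]:
                workers[k][key] -= lr * g[key]
    # Communication round: average worker parameters into the server model.
    for key in server:
        server[key] = np.mean([p[key] for p in workers], axis=0)

h = np.maximum(X @ server["W"].T, 0.0)
print("final train MSE:", np.mean((h @ server["a"] - y) ** 2))
```

Note that the stochastic gradient noise in this sketch arises from the mini-batch sampling at the current iterate, so its distribution changes with the model parameters; this is the parameter-dependent noise that the abstract identifies as the main obstacle to a global convergence analysis.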
