Implicit Regularization of Decentralized Gradient Descent for Sparse Regression
Neural Information Processing Systems
We consider learning a sparse model from linear measurements taken by a network of agents. Unlike existing decentralized methods built on LASSO regression with explicit \ell_1-norm regularization, we exploit the implicit regularization of a decentralized optimization method applied to an over-parameterized, nonconvex least-squares formulation with no penalty term. Our first result shows that, despite the nonconvexity, if the network connectivity is good, the well-known decentralized gradient descent (DGD) algorithm with small initialization and early stopping computes the statistically optimal solution. We further provide sufficient conditions on the initialization scale, step size, network connectivity, and stopping time that guarantee convergence. Our result recovers the convergence rate of gradient descent in the centralized setting, showing that the rate is tight.
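To make the described procedure concrete, below is a minimal numerical sketch of DGD with small initialization and early stopping on an over-parameterized least-squares objective. The Hadamard factorization theta = u*u - v*v, the ring topology, the Metropolis-style mixing weights, and all parameter values are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch: decentralized gradient descent (DGD) on an over-parameterized
# sparse regression problem, with small initialization and early stopping.
# Topology, weights, and hyperparameters are hypothetical choices for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes: n agents, m local measurements, dimension d, sparsity s.
n_agents, m, d, s = 10, 20, 100, 5
theta_star = np.zeros(d)
theta_star[rng.choice(d, s, replace=False)] = rng.normal(size=s)

A = [rng.normal(size=(m, d)) / np.sqrt(m) for _ in range(n_agents)]
y = [A_i @ theta_star for A_i in A]

# Doubly stochastic mixing matrix for a ring graph (lazy Metropolis weights).
W = np.eye(n_agents) / 2
for i in range(n_agents):
    W[i, (i + 1) % n_agents] = 0.25
    W[i, (i - 1) % n_agents] = 0.25

alpha = 1e-6   # small initialization scale
eta = 0.1      # step size
T = 3000       # iteration budget (early stopping below)

# Each agent holds its own copy of (u, v); its local estimate is theta_i = u_i*u_i - v_i*v_i.
U = alpha * np.ones((n_agents, d))
V = alpha * np.ones((n_agents, d))

for t in range(T):
    # 1) Consensus step: mix local variables with neighbors.
    U, V = W @ U, W @ V
    # 2) Local gradient step on f_i(u, v) = ||A_i (u*u - v*v) - y_i||^2 / (2m).
    for i in range(n_agents):
        theta_i = U[i] * U[i] - V[i] * V[i]
        g = A[i].T @ (A[i] @ theta_i - y[i]) / m
        U[i] -= eta * 2.0 * g * U[i]
        V[i] += eta * 2.0 * g * V[i]
    # Early stopping heuristic (illustrative): stop once the averaged residual
    # drops below a hypothetical threshold.
    theta_bar = np.mean(U * U - V * V, axis=0)
    if np.mean([np.linalg.norm(A_i @ theta_bar - y_i) for A_i, y_i in zip(A, y)]) < 1e-3:
        break

print("estimation error:", np.linalg.norm(theta_bar - theta_star))
```

The factorization theta = u*u - v*v is one common way to obtain implicit sparse regularization from plain gradient descent; with small initialization, the iterates stay near sparse solutions, and early stopping prevents overfitting to noise.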