Distributed Learning over Arbitrary Topology: Linear Speed-Up with Polynomial Transient Time

Mar-20-2025–arXiv.org Artificial Intelligence

We study a distributed learning problem in which $n$ agents, each with potentially heterogeneous local data, collaboratively minimize the sum of their local cost functions via peer-to-peer communication. We propose a novel algorithm, Spanning Tree Push-Pull (STPP), which employs two spanning trees extracted from a general communication graph to distribute both model parameters and stochastic gradients. Unlike prior approaches that rely heavily on spectral gap properties, STPP leverages a more flexible topological characterization, enabling robust information flow and efficient updates. Theoretically, we prove that STPP achieves linear speedup and polynomial transient iteration complexity, up to $O(n^7)$ for smooth nonconvex objectives and $\tilde{O}(n^3)$ for smooth strongly convex objectives, under arbitrary network topologies. Moreover, compared with existing methods, STPP converges faster on sparse and non-regular topologies (e.g., the directed ring) and reduces communication overhead on dense networks (e.g., the static exponential graph). These results significantly advance the state of the art, especially when $n$ is large. Numerical experiments further demonstrate the strong performance of STPP and confirm the practical relevance of its theoretical convergence rates across various common graph architectures. Our code is available at https://anonymous.4open.science/r/SpanningTreePushPull-5D3E.
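To make the "two spanning trees extracted from a general communication graph" idea concrete, here is a minimal sketch in Python. The abstract does not say how STPP builds its trees or which tree carries which quantity, so everything specific here is an assumption made for illustration: the use of breadth-first search, the shared root node, and the helper names `bfs_tree`, `reverse_graph`, and `two_spanning_trees`. One tree follows edges outward from the root (so the root can reach every agent), the other follows edges toward the root (so every agent can reach the root).

```python
from collections import deque


def bfs_tree(adj, root):
    """Return {node: parent} for a BFS spanning tree of `adj` rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def reverse_graph(adj):
    """Return the graph with every directed edge u -> v flipped to v -> u."""
    rev = {u: [] for u in adj}
    for u, outs in adj.items():
        for v in outs:
            rev.setdefault(v, []).append(u)
    return rev


def two_spanning_trees(adj, root=0):
    """Extract two spanning trees sharing a root (hypothetical helper).

    out_tree[v] is the node that forwards information from the root to v.
    in_tree[v] is the next hop from v toward the root.
    """
    out_tree = bfs_tree(adj, root)
    # BFS on the reversed graph: parent[v] = u means the original graph
    # has the edge v -> u, so u is v's next hop toward the root.
    in_tree = bfs_tree(reverse_graph(adj), root)
    n = len(adj)
    if len(out_tree) != n or len(in_tree) != n:
        raise ValueError("root must reach, and be reachable from, every node")
    return out_tree, in_tree


# Directed ring on 4 agents: 0 -> 1 -> 2 -> 3 -> 0.
ring = {i: [(i + 1) % 4] for i in range(4)}
out_tree, in_tree = two_spanning_trees(ring, root=0)
```

On the directed ring both trees degenerate into paths: information leaves the root along 0 -> 1 -> 2 -> 3 and returns along 1 -> 2 -> 3 -> 0. The sketch covers only the graph-extraction step and does not reproduce the STPP update rule itself.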

artificial intelligence, machine learning, polynomial transient time algorithm

Country:
- Asia > China
  - Hong Kong (0.04)
  - Guangdong Province > Shenzhen (0.04)

Genre:
- Research Report (0.50)

Industry:
- Education (0.34)

Technology:
- Information Technology
  - Communications > Networks (0.91)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Machine Learning > Statistical Learning
      - Gradient Descent (0.35)
