Efficient-Adam: Communication-Efficient Distributed Adam

Chen, Congliang, Shen, Li, Liu, Wei, Luo, Zhi-Quan

Aug-24-2023–arXiv.org Artificial Intelligence

Distributed adaptive stochastic gradient methods have been widely used for large-scale nonconvex optimization, such as training deep learning models. However, their communication complexity for finding $\varepsilon$-stationary points has rarely been analyzed in the nonconvex setting. In this work, we present Efficient-Adam, a novel communication-efficient distributed Adam in the parameter-server model for stochastic nonconvex optimization. Specifically, we incorporate a two-way quantization scheme into Efficient-Adam to reduce the communication cost between the workers and the server. At the same time, we adopt a two-way error feedback strategy to reduce the biases that the two-way quantization introduces on both the server and the workers. In addition, we establish the iteration complexity of Efficient-Adam for a class of quantization operators, and further characterize the communication complexity between the server and the workers required to reach an $\varepsilon$-stationary point. Finally, we apply Efficient-Adam to a toy stochastic convex optimization problem and to training deep learning models on real-world vision and language tasks. Extensive experiments, together with the theoretical guarantee, justify the merits of Efficient-Adam.
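The two-way quantization with two-way error feedback described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the Adam moment updates are omitted, the scaled-sign quantizer is only one assumed member of the class of quantization operators, and the names `scaled_sign`, `compress_with_feedback`, `worker_updates`, `worker_errors`, and `server_error` are hypothetical. It only shows how each side adds back the residual of its previous quantization before compressing the next message.

```python
from typing import List, Tuple


def scaled_sign(x: List[float]) -> List[float]:
    """Assumed example quantizer: keep only signs, with one shared scale mean(|x|)."""
    if not x:
        return []
    scale = sum(abs(v) for v in x) / len(x)
    return [scale if v >= 0 else -scale for v in x]


def compress_with_feedback(
    update: List[float], error: List[float]
) -> Tuple[List[float], List[float]]:
    """Quantize (update + carried error); return (message, new carried error)."""
    corrected = [u + e for u, e in zip(update, error)]
    message = scaled_sign(corrected)
    # The residual lost to quantization is stored and re-injected next round.
    new_error = [c - m for c, m in zip(corrected, message)]
    return message, new_error


# Hypothetical single round with two workers and one server.
worker_updates = [[0.5, -1.5, 1.0], [1.5, -0.5, -1.0]]
worker_errors = [[0.0, 0.0, 0.0] for _ in worker_updates]
server_error = [0.0, 0.0, 0.0]

# Uplink: each worker sends a quantized update and keeps its own residual.
messages = []
for i, update in enumerate(worker_updates):
    msg, worker_errors[i] = compress_with_feedback(update, worker_errors[i])
    messages.append(msg)

# Downlink: the server averages the messages, then quantizes the broadcast
# with its own error feedback.
average = [sum(col) / len(messages) for col in zip(*messages)]
broadcast, server_error = compress_with_feedback(average, server_error)
```

In this sketch every transmitted vector carries only signs plus one scale, and the quantization error never disappears: it is carried forward in `worker_errors` and `server_error`, which is the mechanism the abstract credits with reducing the bias of the quantized updates.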

artificial intelligence, efficient-adam, machine learning


Country:
- North America
  - United States > Oregon
    - Multnomah County > Portland (0.04)
  - Canada > Ontario
    - Toronto (0.14)
- Asia > China
  - Guangdong Province > Shenzhen (0.05)
  - Hong Kong (0.04)
  - Beijing > Beijing (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
