CSER: Communication-efficient SGD with Error Reset

Neural Information Processing Systems

In recent years, the sizes of both machine-learning models and datasets have been increasing rapidly. To accelerate the training, it is common to distribute the computation on multiple machines.





CSER: Communication-efficient SGD with Error Reset

Xie, Cong, Zheng, Shuai, Koyejo, Oluwasanmi, Gupta, Indranil, Li, Mu, Lin, Haibin

arXiv.org Machine Learning

The scalability of distributed Stochastic Gradient Descent (SGD) is today limited by communication bottlenecks. We propose a novel SGD variant: Communication-efficient SGD with Error Reset, or CSER. The key idea in CSER is twofold: first, a new technique called "error reset" that adapts arbitrary compressors for SGD, producing bifurcated local models with periodic resets of the resulting local residual errors; second, partial synchronization of both the gradients and the models, leveraging the advantages of each. We prove the convergence of CSER for smooth non-convex problems. Empirical results show that, when combined with highly aggressive compressors, the CSER algorithms: i) cause no loss of accuracy, and ii) accelerate training by nearly $10\times$ for CIFAR-100 and by $4.5\times$ for ImageNet.
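The abstract's core mechanism, error-compensated compression with a locally kept residual that is periodically reset, can be sketched for a single worker as follows. This is a minimal illustration, not the paper's exact algorithm: `topk_compress` stands in for an arbitrary compressor, and the learning rate and function names are assumptions for the example.

```python
import numpy as np

def topk_compress(x, k):
    """Keep the k largest-magnitude entries of x, zeroing the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def error_reset_step(grad, error, k, lr=0.1):
    """One worker-side step of error-compensated compression:
    fold the accumulated residual into the fresh gradient,
    transmit only the compressed part, and keep what the
    compressor dropped as the new local residual (simplified
    single-worker view of the 'error reset' idea)."""
    corrected = grad + error              # add residual from earlier rounds
    sent = topk_compress(corrected, k)    # the part that goes over the wire
    new_error = corrected - sent          # residual retained locally
    return lr * sent, new_error
```

In a full CSER run the residual would additionally be reset on the paper's periodic synchronization schedule; this sketch only shows the per-step compensation.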


Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations

Basu, Debraj, Data, Deepesh, Karakus, Can, Diggavi, Suhas

arXiv.org Machine Learning

The communication bottleneck has been identified as a significant issue in distributed optimization of large-scale learning models. Recently, several approaches to mitigate this problem have been proposed, including different forms of gradient compression and computing local models that are mixed iteratively. In this paper we propose the \emph{Qsparse-local-SGD} algorithm, which combines aggressive sparsification with quantization and local computation, along with error compensation achieved by keeping track of the difference between the true and compressed gradients. We propose both synchronous and asynchronous implementations of \emph{Qsparse-local-SGD}. We analyze the convergence of \emph{Qsparse-local-SGD} in the \emph{distributed} setting for smooth non-convex and convex objective functions. We demonstrate that \emph{Qsparse-local-SGD} converges at the same rate as vanilla distributed SGD for many important classes of sparsifiers and quantizers. We use \emph{Qsparse-local-SGD} to train ResNet-50 on ImageNet, and show that it yields significant savings over the state of the art in the number of bits transmitted to reach a target accuracy.
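The composition the abstract describes, sparsification followed by quantization with an error-compensation memory, can be illustrated with a small sketch. This is an assumed, simplified scheme for exposition (top-k plus a scaled-sign quantizer), not the paper's exact compressor family; all function names here are hypothetical.

```python
import numpy as np

def sign_quantize(x):
    """Scaled sign quantizer: transmit sign(x) scaled by the
    mean magnitude of the nonzero entries."""
    nz = np.abs(x[x != 0])
    scale = nz.mean() if nz.size else 0.0
    return scale * np.sign(x)

def qsparse_compress(grad, memory, k):
    """Compose top-k sparsification with quantization, keeping the
    compression error in a local memory (error compensation), in the
    spirit of Qsparse-local-SGD; illustrative only."""
    corrected = grad + memory                 # error-compensated gradient
    idx = np.argsort(np.abs(corrected))[-k:]  # indices of k largest entries
    sparse = np.zeros_like(corrected)
    sparse[idx] = corrected[idx]              # sparsify
    quantized = sign_quantize(sparse)         # then quantize the survivors
    new_memory = corrected - quantized        # residual carried to next round
    return quantized, new_memory
```

Only the k indices and one scale need to be communicated per round; the residual `new_memory` is what makes the scheme converge at the vanilla-SGD rate despite the aggressive compression.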