AITopics | elastic averaging sgd

Deep learning with Elastic Averaging SGD

Neural Information Processing Systems | Aug-12-2025, 23:28:57 GMT

We study the problem of stochastic optimization for deep learning in the parallel computing environment under communication constraints. A new algorithm is proposed in this setting where the communication and coordination of work among concurrent processes (local workers) is based on an elastic force which links the parameters they compute with a center variable stored by the parameter server (master). The algorithm enables the local workers to perform more exploration, i.e., it allows the local variables to fluctuate further from the center variable by reducing the amount of communication between the local workers and the master. We empirically demonstrate that in the deep learning setting, due to the existence of many local optima, allowing more exploration can lead to improved performance. We propose synchronous and asynchronous variants of the new algorithm.
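The elastic coupling described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' implementation: the function name `easgd_sync`, the toy quadratic objective, the noise level, and all default values are assumptions made for illustration, and `tau` is a hypothetical communication period standing in for "reducing the amount of communication". Workers are simulated sequentially rather than run in parallel.

```python
import numpy as np


def easgd_sync(grad, x0, num_workers=4, eta=0.05, rho=1.0, tau=1,
               steps=500, seed=0):
    """Toy elastic-averaging SGD, simulated in one process.

    grad(x, rng) must return a stochastic gradient at x.
    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    center = np.array(x0, dtype=float)           # center variable (master)
    workers = [center.copy() for _ in range(num_workers)]
    alpha = eta * rho                            # strength of the elastic pull

    for t in range(steps):
        # Each worker evaluates a stochastic gradient at its own parameters.
        grads = [grad(x, rng) for x in workers]

        if t % tau == 0:
            # Elastic exchange: workers are pulled toward the center,
            # and the center is pulled toward the workers.
            diffs = [x - center for x in workers]
            workers = [x - alpha * d for x, d in zip(workers, diffs)]
            center = center + alpha * sum(diffs)

        # Local SGD step using the gradients computed above.
        workers = [x - eta * g for x, g in zip(workers, grads)]

    return center, workers


# Toy problem (assumption): f(x) = 0.5 * ||x - target||^2 with Gaussian noise.
target = np.array([1.0, -2.0, 0.5])


def noisy_grad(x, rng):
    return (x - target) + 0.1 * rng.standard_normal(x.shape)


center, workers = easgd_sync(noisy_grad, np.zeros(3))
print("distance of center from target:", np.linalg.norm(center - target))
```

In this sketch, a larger `tau` means workers exchange with the center less often and can drift further from it between exchanges, which is the exploration effect the abstract refers to.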

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems | Feb-8-2025, 03:24:05 GMT

This paper studies general-purpose training algorithms for deep learning and proposes a family of algorithms called elastic averaging SGD. The idea is novel and the paper is of very high quality. The paper focuses on training large-scale deep learning models under communication constraints. This problem is difficult because non-convex problems, such as those in deep learning, have many local optima. The optimization problem is formulated as a global variable consensus problem so that local workers do not fall into different local optima, and its gradient update rules are then reinterpreted in terms of elastic forces between the local and global parameters.
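The consensus formulation the review mentions can be written out as follows. The notation is not given on this page and is an assumption here: p local workers with parameters x^i, a center variable x̃, a penalty coefficient ρ, a learning rate η, and a stochastic gradient g^i_t of the loss f at worker i. Taking a gradient step on the penalized objective gives updates in which the term ρ(x^i − x̃) acts as the elastic force.

```latex
\begin{align}
  % Global variable consensus objective with a quadratic penalty
  \min_{x^1,\dots,x^p,\,\tilde{x}} \;
    & \sum_{i=1}^{p} \Big( \mathbb{E}\big[ f(x^i, \xi^i) \big]
      + \frac{\rho}{2}\, \lVert x^i - \tilde{x} \rVert^2 \Big) \\
  % Local worker update: gradient step plus pull toward the center
  x^i_{t+1} &= x^i_t - \eta \big( g^i_t(x^i_t) + \rho\,(x^i_t - \tilde{x}_t) \big) \\
  % Center update: pulled toward the local workers
  \tilde{x}_{t+1} &= \tilde{x}_t + \eta \sum_{i=1}^{p} \rho\,(x^i_t - \tilde{x}_t)
\end{align}
```

The two update rules carry the elastic term with opposite signs: each worker moves toward the center, and the center moves toward the workers.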

Deep learning with Elastic Averaging SGD

Zhang, Sixin; Choromanska, Anna E.; LeCun, Yann

Neural Information Processing Systems | Feb-14-2020, 06:59:31 GMT
