AITopics | gosgd

GoSGD: Distributed Optimization for Deep Learning with Gossip Exchange

Blot, Michael, Picard, David, Cord, Matthieu

arXiv.org Machine Learning, Apr-4-2018

Since deep convolutional neural networks (CNNs) were introduced by [1] and [2], computer vision tasks, and image classification in particular, have improved greatly in the years following [3]. CNN performance benefits substantially from large collections of annotated images such as [4] or [5]. CNNs are trained by optimizing a loss function with gradient descent steps computed on random mini-batches, following [6]. This method, called stochastic gradient descent (SGD), has proved very efficient for training neural networks in general. However, current CNN architectures are extremely deep, such as the 100-layer ResNet of [7], and contain many parameters (around 60M for AlexNet [3] and 130M for VGG [8]). These architectures involve long gradient computation times, which makes training on large datasets very slow. Computation on GPUs accelerates training but requires large local memory caches. Nevertheless, mini-batch optimization appears well suited to distributing the training over several threads. Many methods have been proposed, such as [9, 10], which distribute the batches over different threads, called workers, that periodically exchange information via a central thread to synchronize their models.
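The centralized scheme described above (workers running mini-batch SGD locally and periodically synchronizing through a central thread) can be illustrated with a minimal sketch. This is not the GoSGD algorithm or the method of [9, 10]; it is a toy, single-process simulation under stated assumptions: a scalar least-squares model, synchronization by plain averaging, and hypothetical names (`sgd_step`, `train_with_central_sync`, `sync_every`) that do not come from the paper.

```python
import random


def sgd_step(w, batch, lr):
    """One mini-batch SGD step for the 1-D least-squares loss 0.5 * (w*x - y)**2."""
    grad = sum((w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * grad


def train_with_central_sync(data, n_workers=4, steps=200, batch_size=8,
                            lr=0.1, sync_every=10, seed=0):
    """Simulate workers that train locally and periodically average their models.

    Assumption: the "central thread" simply averages the worker models and
    broadcasts the result; real systems may use other update rules.
    """
    rng = random.Random(seed)
    # Each worker holds its own copy of the (scalar) model.
    workers = [0.0] * n_workers
    for step in range(1, steps + 1):
        # Every worker draws its own random mini-batch and updates locally.
        workers = [sgd_step(w, rng.sample(data, batch_size), lr) for w in workers]
        if step % sync_every == 0:
            # Central step: average the worker models and broadcast the average.
            central = sum(workers) / n_workers
            workers = [central] * n_workers
    return sum(workers) / n_workers


# Toy noise-free data generated from y = 3x, so the optimum is w = 3.
data = [(k / 10.0, 3.0 * k / 10.0) for k in range(-20, 21)]
w = train_with_central_sync(data)
print(w)
```

In this sketch the central averaging step is the only point where workers communicate, so `sync_every` trades communication cost against how far the worker models can drift apart between synchronizations.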

communication, gosgd, persyn, (12 more...)

arXiv.org Machine Learning

1804.01852

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > France > Île-de-France > Yvelines > Cergy-Pontoise (0.04)
Europe > France > Île-de-France > Val-d'Oise > Cergy-Pontoise (0.04)
(2 more...)

Genre: Research Report (0.52)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.92)
