Advances in Asynchronous Parallel and Distributed Optimization

Mahmoud Assran, Arda Aytekin, Hamid Feyzmahdavian, Mikael Johansson, Michael Rabbat

arXiv.org (Machine Learning)

Since the slowing of Moore's law, parallel and distributed computing have become a primary means of solving large computational problems. Much of the work on parallel and distributed optimization over the past decade has been motivated by machine learning applications, where the goal of fitting a predictive model to a dataset is formulated as an optimization problem: find the model parameters that provide the best predictive performance. Over the same period, advances in machine learning have been enabled by the availability of ever larger datasets and the ability to use larger models, resulting in optimization problems that may involve billions of free parameters and billions of data samples [1-3].

There are two general scenarios in which the use of parallel computing resources naturally arises. In one scenario, the data is available in one central location (e.g., a data center), and the aim is to use parallel computing to train a model faster than would be possible with serial methods. The ideal outcome is a parallel method that achieves linear scaling, where the time to reach a solution of a given quality decreases proportionally to the number of processors used; i.e., doubling the number of parallel processors halves the compute time. Unlike serial methods, however, parallel optimization methods generally require coordination or communication among the processors.
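To make the model-fitting problem concrete, the following is a minimal sketch of the standard empirical-risk-minimization formulation such work typically targets; the symbols w, d, n, ell, and (x_i, y_i) are illustrative notation, not necessarily the paper's own:

\[
\min_{w \in \mathbb{R}^d} \; F(w) \;=\; \frac{1}{n} \sum_{i=1}^{n} \ell(w;\, x_i, y_i),
\]

where w collects the d free model parameters, (x_i, y_i) is the i-th of n data samples, and \ell is a loss function measuring prediction error on a single sample. Under the same illustrative notation, linear scaling corresponds to T_p \approx T_1 / p, where T_p denotes the time for p processors to reach a solution of the target quality.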
