DUAL-LOCO: Distributing Statistical Estimation Using Random Projections

Heinze, Christina, McWilliams, Brian, Meinshausen, Nicolai

arXiv.org Machine Learning 

We present Dual-Loco, a communicationefficient algorithm for distributed statistical estimation. Dual-Loco assumes that the data is distributed across workers according to the features rather than the samples. It requires only a single round of communication where low-dimensional random projections are used to approximate the dependencies between features available to different workers. We show that Dual-Loco has bounded approximation error which only depends weakly on the number of workers. We compare Dual-Loco against a state-of-theart distributed optimization method on a variety of real world datasets and show that it obtains better speedups while retaining good accuracy. In particular, Dual-Loco allows for fast cross validation as only part of the algorithm depends on the regularization parameter.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found