DUAL-LOCO: Distributing Statistical Estimation Using Random Projections
Christina Heinze, Brian McWilliams, Nicolai Meinshausen
We present Dual-Loco, a communication-efficient algorithm for distributed statistical estimation. Dual-Loco assumes that the data is distributed across workers by feature rather than by sample. It requires only a single round of communication, in which low-dimensional random projections are used to approximate the dependencies between features held by different workers. We show that Dual-Loco has a bounded approximation error that depends only weakly on the number of workers. We compare Dual-Loco against a state-of-the-art distributed optimization method on a variety of real-world datasets and show that it obtains better speedups while retaining good accuracy. In particular, Dual-Loco allows for fast cross-validation, as only part of the algorithm depends on the regularization parameter.
Jan-8-2016
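The feature-wise setup described in the abstract can be illustrated with a small single-machine simulation. This is only a sketch under stated assumptions, not the authors' implementation: it assumes a ridge regression objective solved in closed form, Gaussian random projections, and that each worker appends the compressed blocks of the other workers to its own raw features. The names `ridge`, `feature_partitioned_ridge`, `n_workers`, and `proj_dim` are hypothetical and chosen for illustration.

```python
import numpy as np


def ridge(X, y, lam):
    """Closed-form minimizer of (1/n) * ||y - X b||^2 + lam * ||b||^2."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ y)


def feature_partitioned_ridge(X, y, n_workers, proj_dim, lam, seed=0):
    """Simulate workers that each hold a disjoint block of feature columns.

    Assumption: Gaussian projections and a ridge loss; the actual method
    may use a different projection and solver.
    """
    rng = np.random.default_rng(seed)
    blocks = np.array_split(np.arange(X.shape[1]), n_workers)

    # Single communication round: every worker shares only a
    # low-dimensional random projection of its own feature block.
    compressed = []
    for idx in blocks:
        P = rng.standard_normal((len(idx), proj_dim)) / np.sqrt(proj_dim)
        compressed.append(X[:, idx] @ P)

    beta = np.zeros(X.shape[1])
    for k, idx in enumerate(blocks):
        # Local design: raw local features plus the other workers'
        # compressed features, which stand in for the missing columns.
        others = [c for j, c in enumerate(compressed) if j != k]
        X_bar = np.hstack([X[:, idx]] + others)
        # Keep only the coefficients that belong to the raw local features.
        beta[idx] = ridge(X_bar, y, lam)[: len(idx)]
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 30))
    y = X @ rng.standard_normal(30) + 0.1 * rng.standard_normal(200)
    beta_hat = feature_partitioned_ridge(X, y, n_workers=3, proj_dim=8, lam=0.1)
    beta_full = ridge(X, y, 0.1)
    print("relative difference to full ridge:",
          np.linalg.norm(beta_hat - beta_full) / np.linalg.norm(beta_full))
```

In this sketch the compression step does not involve `lam`, so the projected blocks could be computed once and reused while only the local solves are repeated for each candidate regularization value. That mirrors, in a simplified way, the abstract's remark that only part of the algorithm depends on the regularization parameter.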