Communication-Efficient Algorithms for Statistical Optimization

Yuchen Zhang, Martin J. Wainwright, John C. Duchi

Neural Information Processing Systems 

We study two communication-efficient algorithms for distributed statistical optimization on large-scale data. The first algorithm is an averaging method that distributes the $N$ data samples evenly to $m$ machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves mean-squared error that decays as $\mathcal{O}(N^{-1}+(N/m)^{-2})$. Whenever $m \le \sqrt{N}$, this guarantee matches the best possible rate achievable by a centralized algorithm having access to all $N$ samples. The second algorithm is a novel method, based on an appropriate form of the bootstrap. Requiring only a single round of communication, it has mean-squared error that decays as $\mathcal{O}(N^{-1}+(N/m)^{-3})$, and so is more robust to the amount of parallelization. We complement our theoretical results with experiments on large-scale problems from the Microsoft Learning to Rank dataset.
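To make the two procedures concrete, below is a minimal Python sketch of the averaging step and a bootstrap-corrected variant of the kind the abstract describes. The solver `local_minimize`, the example `least_squares`, and the subsampling fraction `r` are illustrative assumptions, not part of the paper; the combination rule $(\bar\theta_1 - r\,\bar\theta_2)/(1-r)$ reflects one common statement of the subsampled bias correction and should be checked against the paper's own definitions.

```python
import numpy as np

def avgm(data_splits, local_minimize):
    """Average mixture sketch: each of the m machines minimizes the
    empirical risk on its own subset, and the m local estimates are
    averaged. `local_minimize` is a hypothetical local solver."""
    local_estimates = [local_minimize(subset) for subset in data_splits]
    return np.mean(local_estimates, axis=0)

def savgm(data_splits, local_minimize, r=0.1, seed=0):
    """Bootstrap-corrected variant (sketch): each machine also solves on
    a subsample containing a fraction r of its local data; combining the
    two averages is intended to cancel the leading-order bias of plain
    averaging. The combination rule below is an assumption, not a quote
    of the paper's method."""
    rng = np.random.default_rng(seed)
    full, sub = [], []
    for subset in data_splits:
        n = len(subset)
        full.append(local_minimize(subset))
        idx = rng.choice(n, size=max(1, int(r * n)), replace=False)
        sub.append(local_minimize(subset[idx]))
    return (np.mean(full, axis=0) - r * np.mean(sub, axis=0)) / (1 - r)

# Illustrative local solver: least squares on rows laid out as [X | y].
def least_squares(subset):
    X, y = subset[:, :-1], subset[:, -1]
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Usage: split N synthetic samples evenly across m "machines".
rng = np.random.default_rng(1)
N, m, d = 10_000, 10, 5
theta_true = rng.normal(size=d)
X = rng.normal(size=(N, d))
y = X @ theta_true + 0.1 * rng.normal(size=N)
splits = np.array_split(np.column_stack([X, y]), m)
print(avgm(splits, least_squares))
print(savgm(splits, least_squares, r=0.1))
```

Note that both routines need only one round of communication: each machine ships its local estimate(s) to a coordinator, which forms the final parameter by simple arithmetic.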
