Communication-Efficient Distributed Estimator for Generalized Linear Models with a Diverging Number of Covariates

Zhou, Ping, Yu, Zhen, Ma, Jingyi, Tian, Maozai

Jan-17-2020–arXiv.org Machine Learning

Distributed statistical inference has recently attracted immense attention. Herein, we study the asymptotic efficiency of the maximum likelihood estimator (MLE), the one-step MLE, and the aggregated estimating equation estimator for generalized linear models with a diverging number of covariates. Then a novel method is proposed to obtain an asymptotically efficient estimator for large-scale distributed data by two rounds of communication between local machines and the central server. The assumption on the number of machines in this paper is more relaxed and thus practical for real-world applications. Simulations and a case study demonstrate the satisfactory finite-sample performance of the proposed estimators.

Keywords: Generalized linear models, Large-scale distributed data, Asymptotic efficiency, One-step MLE, Diverging p

MSC: 62J12

1. Introduction

In modern times, large-scale data sets have become increasingly common, and they are often stored across multiple machines. Since the communication cost between machines is considerably higher than the cost of conducting statistical analysis on a single machine (Jaggi et al., 2014; Smith et al., 2018), it is inefficient to calculate a global estimator by transmitting the local data to a central machine. Further, applying traditional iterative algorithms in a distributed system, such as the Fisher-scoring algorithm for the maximum likelihood estimator (MLE) in generalized linear models (GLMs), requires multiple rounds of communication, which incur exorbitant costs. Therefore, communication-efficient distributed algorithms must be developed to accommodate the new features of modern data sets.
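To make the contrast between iterating Fisher scoring across machines and using only two rounds of communication concrete, the following is a minimal sketch of a generic one-step distributed estimator for logistic regression. It is an illustration under stated assumptions, not the method proposed in the paper, whose details are not given in this excerpt. The function names (`local_mle`, `local_score_and_info`, `two_round_estimate`), the simulated data, and the choice of a simple average as the initial estimator are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_mle(X, y, iters=25):
    # Newton iterations on a single machine; for the canonical logit link
    # this coincides with Fisher scoring.
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = sigmoid(X @ beta)
        score = X.T @ (y - mu)
        info = X.T @ (X * (mu * (1.0 - mu))[:, None])
        beta = beta + np.linalg.solve(info, score)
    return beta

def local_score_and_info(X, y, beta):
    # Score vector and Fisher information of one shard, evaluated at beta.
    mu = sigmoid(X @ beta)
    return X.T @ (y - mu), X.T @ (X * (mu * (1.0 - mu))[:, None])

def two_round_estimate(shards):
    # Round 1 (assumed scheme): each machine sends its local MLE,
    # and the server averages them into an initial estimator.
    beta_bar = np.mean([local_mle(X, y) for X, y in shards], axis=0)
    # Round 2: the server broadcasts beta_bar; each machine returns its
    # score and information at beta_bar; the server takes one Newton step.
    parts = [local_score_and_info(X, y, beta_bar) for X, y in shards]
    score = sum(s for s, _ in parts)
    info = sum(i for _, i in parts)
    return beta_bar + np.linalg.solve(info, score)

# Simulated data (hypothetical sizes): n observations split over K machines.
rng = np.random.default_rng(0)
n, p, K = 20000, 5, 10
beta_true = np.array([0.5, -0.5, 0.25, 0.0, 1.0])
X = rng.normal(size=(n, p))
y = rng.binomial(1, sigmoid(X @ beta_true)).astype(float)
shards = list(zip(np.array_split(X, K), np.array_split(y, K)))

beta_two_round = two_round_estimate(shards)
beta_global = local_mle(X, y)  # full-data MLE, for comparison only
```

In this sketch each machine transmits only a vector of length p in the first round and a vector plus a p-by-p matrix in the second, rather than its raw data. Comparing `beta_two_round` with `beta_global` gives a rough sense of how close a single aggregated Newton step can get to the full-data MLE in a well-behaved setting.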


Country:
- Asia
  - Middle East > Jordan (0.04)
  - China > Beijing
    - Beijing (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.54)
  - Machine Learning
    - Statistical Learning (0.68)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.54)
