A debiased distributed estimation for sparse partially linear models in diverging dimensions

Aug-17-2017–arXiv.org Machine Learning

In a big-data setting, the storage and analysis of data can no longer be performed on a single machine, so dividing the data into many sub-samples becomes a critical step for any numerical algorithm. Distributed statistical estimation and distributed optimization have received increasing attention in recent years, and a flurry of research on very large scale problems has emerged, such as Mcdonald et al. (2009); Zhang et al. (2013, 2015); Rosenblatt et al. (2016) and the references therein. In general, distributed algorithms can be classified into two families: data parallelism and task parallelism. Data parallelism distributes the data across parallel computing nodes or machines, whereas task parallelism distributes different tasks across those nodes. We are only concerned with data parallelism in this paper. In particular, we primarily consider distributed estimation for partially linear models using the standard divide-and-conquer strategy. Divide-and-conquer is a simple and communication-efficient way of handling big data and is commonly used in the statistical learning literature. To be precise, the whole data set is randomly allocated among m machines, a local estimator is computed independently on each machine, and the central node then averages the local solutions into a global estimate. Partially linear models (PLM) (Hardle and Liang, 2007; Heckman, 1986), the leading example of semiparametric models, are an important class of tools for modeling complex data, as they retain model interpretability and flexibility simultaneously.
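The divide-and-conquer scheme described above (random allocation among m machines, independent local estimation, averaging at the central node) can be illustrated with a minimal sketch. It is not the paper's debiased estimator for sparse partially linear models: as a stand-in, the local estimator here is assumed to be ordinary least squares on a purely linear model, the "machines" are simulated by array blocks in one process, and the function name `divide_and_conquer_ols`, the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np


def divide_and_conquer_ols(X, y, m, seed=0):
    """Average of per-machine least-squares fits (illustrative stand-in only)."""
    rng = np.random.default_rng(seed)
    # Randomly allocate the n observations among m simulated machines.
    shuffled = rng.permutation(X.shape[0])
    blocks = np.array_split(shuffled, m)
    # Each machine computes its local estimator independently of the others.
    local_estimates = [
        np.linalg.lstsq(X[block], y[block], rcond=None)[0] for block in blocks
    ]
    # The central node averages the local solutions into a global estimate.
    return np.mean(local_estimates, axis=0)


# Synthetic linear data (assumed for illustration; not from the paper).
rng = np.random.default_rng(1)
n, p, m = 2000, 5, 10
beta = np.arange(1.0, p + 1.0)
X = rng.normal(size=(n, p))
y = X @ beta + 0.1 * rng.normal(size=n)

beta_hat = divide_and_conquer_ols(X, y, m)
print(np.round(beta_hat, 2))
```

Only one p-dimensional vector per machine is sent to the central node, which is what makes the scheme communication-efficient. Plain averaging of this kind is reasonable for nearly unbiased local estimators such as OLS; with penalized local estimators the bias does not average out, which is presumably why the title refers to a debiased estimator.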



