How Many Machines Can We Use in Parallel Computing for Kernel Ridge Regression?

Liu, Meimei, Shang, Zuofeng, Cheng, Guang

May-24-2018–arXiv.org Machine Learning

This paper attempts to solve a basic problem in distributed statistical inference: how many machines can we use in parallel computing? In kernel ridge regression, we address this question in two important settings: nonparametric estimation and hypothesis testing. Specifically, we find a range for the number of machines under which optimal estimation/testing is achievable. The empirical process method we employ provides a unified framework that allows us to handle various regression problems (such as thin-plate splines and nonparametric additive regression) under different settings (such as univariate, multivariate and diverging-dimensional designs). It is worth noting that the upper bounds on the number of machines are proven to be unimprovable (up to a logarithmic factor) in two important cases: smoothing spline regression and Gaussian RKHS regression. Our theoretical findings are backed by thorough numerical studies.
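To make the question about the number of machines concrete, the following is a minimal sketch of a divide-and-conquer kernel ridge regression estimator. It assumes the standard scheme of splitting the sample at random across s machines, fitting kernel ridge regression on each subsample, and averaging the local predictions; the abstract does not spell out the estimator, so this is an illustration rather than the authors' implementation. The function names, the Gaussian kernel, the bandwidth, and the regularization value are all hypothetical choices made for the example.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=0.2):
    """Gaussian kernel matrix between rows of a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def fit_krr(x, y, lam, bandwidth=0.2):
    """Fit kernel ridge regression on one (sub)sample.

    Solves (K + n * lam * I) alpha = y, the minimizer of
    (1/n) * sum (y_i - f(x_i))^2 + lam * ||f||^2 over the RKHS.
    """
    n = x.shape[0]
    gram = gaussian_kernel(x, x, bandwidth)
    alpha = np.linalg.solve(gram + n * lam * np.eye(n), y)
    return x, alpha


def predict_krr(model, x_new, bandwidth=0.2):
    """Evaluate a fitted local estimator at new points."""
    x_train, alpha = model
    return gaussian_kernel(x_new, x_train, bandwidth) @ alpha


def dc_krr_predict(x, y, x_new, n_machines, lam, bandwidth=0.2, seed=0):
    """Average the predictions of KRR fits on n_machines random subsamples.

    Assumption: each "machine" is simulated by one block of a random
    partition of the data; no actual parallelism is used here.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(x.shape[0])
    blocks = np.array_split(shuffled, n_machines)
    local_preds = [
        predict_krr(fit_krr(x[b], y[b], lam, bandwidth), x_new, bandwidth)
        for b in blocks
    ]
    return np.mean(local_preds, axis=0)
```

With `n_machines=1` the sketch reduces to ordinary kernel ridge regression on the full sample. Increasing `n_machines` shrinks each linear solve, and the paper's question is how large this number can be before the averaged estimator loses statistical optimality.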

Country:
- North America > United States > Indiana (0.46)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Performance Analysis > Accuracy (0.61)
  - Statistical Learning > Regression (0.48)
