The Directional Bias Helps Stochastic Gradient Descent to Generalize in Kernel Regression Models
Yiling Luo, Xiaoming Huo, Yajun Mei
arXiv.org Artificial Intelligence
Stochastic gradient descent (SGD) is a popular optimization algorithm with a wide range of applications, including generalized linear models in statistics and deep neural networks in machine learning. One main advantage of SGD is its computational scalability, which comes from its low cost per iteration. Recent work indicates that SGD may also lead to outcomes with nice statistical properties under the linear regression framework; see [19]. In this paper, we study the statistical properties of SGD under nonparametric regression models. We focus on the Reproducing Kernel Hilbert Space (RKHS) model, which is popular in both the statistics and machine learning communities and is often simply referred to as the "kernel trick"; see [2, 27].
Apr-29-2022