Nyström Approximation for Sparse Kernel Methods: Theoretical Analysis and Empirical Evaluation
Xu, Zenglin (University of Electronic Science and Technology of China) | Jin, Rong (Michigan State University) | Shen, Bin (Purdue University) | Zhu, Shenghuo (Alibaba Group)
Kernel methods (Schölkopf and Smola 2002; Xu et al. 2009) have received a lot of attention in recent studies of machine learning. These methods project data into high-dimensional or even infinite-dimensional spaces via kernel mapping functions. Despite the strong generalization ability induced by kernel methods, they usually suffer from the high computational complexity of calculating the kernel matrix (also called the Gram matrix). Although low-rank decomposition techniques (e.g., Cholesky decomposition (Fine and Scheinberg 2002; Bach and Jordan 2005)) and truncating methods (e.g., kernel tapering (Shen, Xu, and Allebach 2014; Furrer, Genton, and Nychka 2006)) can accelerate the calculation of the kernel matrix, they still need to compute the [...] However, if kernels are not low rank, Nyström approximations can usually lead to suboptimal performance. To alleviate the strong assumption made in seeking the approximation bounds, we adopt the more general assumption that the design matrix K satisfies the restricted isometry property (Koltchinskii 2011). In particular, the new assumption obeys the restricted eigenvalue condition (Koltchinskii 2011; Bickel, Ritov, and Tsybakov 2009), which has been shown to be more general than several other similar assumptions used in the sparsity literature (Candes and Tao 2007; Donoho, Elad, and Temlyakov 2006; Zhang and Huang 2008). Based on the restricted eigenvalue condition, we provide error bounds for kernel approximation and the recovery rate in sparse kernel regression.
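The excerpt contrasts direct low-rank decompositions of the Gram matrix with the Nyström method. The sketch below illustrates the standard Nyström construction K ≈ K_nm K_mm^+ K_nm^T from m sampled landmark columns; it is a minimal illustration assuming an RBF kernel and uniform landmark sampling, and the function names and parameters are illustrative rather than taken from the paper.

```python
# Minimal sketch of the Nyström low-rank approximation of a kernel matrix.
# Assumptions (not from the paper's code): RBF kernel, uniform sampling of
# m landmark columns, pseudo-inverse of the landmark block.
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gram matrix with entries K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def nystrom_approximation(X, m, gamma=0.5, seed=None):
    """Approximate the n x n kernel matrix as K_nm @ pinv(K_mm) @ K_nm.T,
    touching only m landmark columns (O(n * m) kernel evaluations)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)   # uniform landmark sampling
    K_nm = rbf_kernel(X, X[idx], gamma)          # n x m block
    K_mm = K_nm[idx]                             # m x m landmark block
    return K_nm @ np.linalg.pinv(K_mm) @ K_nm.T  # rank <= m approximation

if __name__ == "__main__":
    X = np.random.default_rng(0).standard_normal((500, 10))
    K = rbf_kernel(X, X)
    K_hat = nystrom_approximation(X, m=50, seed=0)
    # Spectral-norm error; small when K is close to low rank, which is the
    # regime the excerpt says standard Nyström bounds rely on.
    print(np.linalg.norm(K - K_hat, 2))
```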
Mar-6-2015
- Country:
- Asia > Middle East > Jordan (0.25)
- North America > United States (0.68)
- Genre:
- Research Report (0.48)