A General Scoring Rule for Randomized Kernel Approximation with Application to Canonical Correlation Analysis

Yinsong Wang, Shahin Shahrampour

arXiv.org Machine Learning 

Random features have been widely used for kernel approximation in large-scale machine learning. A number of recent studies have explored data-dependent sampling of features, modifying the stochastic oracle from which random features are drawn. While the techniques proposed in this line of work improve the approximation, each applies only to a specific learning task. In this paper, we propose a general scoring rule for sampling random features, which can be employed for various applications with some adjustments. We first observe that our method can recover a number of data-dependent sampling methods (e.g., leverage scores and energy-based sampling). We then restrict our attention to a ubiquitous problem in statistics and machine learning, namely Canonical Correlation Analysis (CCA). We provide a principled guide for finding the distribution that maximizes the canonical correlations, resulting in a novel data-dependent method for sampling features. Numerical experiments verify that our algorithm consistently outperforms other sampling techniques in the CCA task.
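The abstract stops short of stating the scoring rule itself, so the sketch below should be read as a generic illustration of the pipeline it describes, not as the paper's method: approximate a kernel with random Fourier features, resample the features under a data-dependent score (here, ridge leverage scores, one of the samplers the abstract says the general rule recovers), and run CCA on the reduced feature maps. All function names, the Gaussian-kernel choice, and the parameter values are assumptions made for this sketch.

```python
import numpy as np

def random_fourier_features(X, D, sigma=1.0, seed=None):
    """Random Fourier features for the Gaussian kernel k(x, y) = exp(-||x-y||^2 / (2 sigma^2)):
    z(x) = sqrt(2/D) * cos(W^T x + b), with W ~ N(0, sigma^{-2} I) and b ~ U[0, 2*pi]."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(X.shape[1], D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def ridge_leverage_scores(Z, lam=1e-3):
    """Ridge leverage score of each feature (column of Z), via the push-through
    identity: diag(Z^T (Z Z^T + lam I)^{-1} Z) = diag((Z^T Z + lam I)^{-1} Z^T Z)."""
    G = Z.T @ Z
    tau = np.diag(np.linalg.solve(G + lam * np.eye(G.shape[0]), G))
    return np.clip(tau, 1e-12, None)

def resample_features(Z, m, scores, seed=None):
    """Keep m features drawn proportionally to their scores, with 1/sqrt(m p_j)
    column reweighting so that E[Z_s Z_s^T] = Z Z^T (unbiased Gram estimate)."""
    rng = np.random.default_rng(seed)
    p = scores / scores.sum()
    idx = rng.choice(Z.shape[1], size=m, replace=True, p=p)
    return Z[:, idx] / np.sqrt(m * p[idx])

def cca_top_correlations(Zx, Zy, k=2, eps=1e-6):
    """Top-k canonical correlations between two feature maps: singular values
    of the whitened cross-covariance Cxx^{-1/2} Cxy Cyy^{-1/2}."""
    Zx = Zx - Zx.mean(axis=0)
    Zy = Zy - Zy.mean(axis=0)
    n = Zx.shape[0]
    Cxx = Zx.T @ Zx / n + eps * np.eye(Zx.shape[1])
    Cyy = Zy.T @ Zy / n + eps * np.eye(Zy.shape[1])
    Cxy = Zx.T @ Zy / n
    Lx, Ly = np.linalg.cholesky(Cxx), np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Ly, np.linalg.solve(Lx, Cxy).T).T  # Lx^{-1} Cxy Ly^{-T}
    return np.linalg.svd(M, compute_uv=False)[:k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))                                      # view 1
    Y = X @ rng.normal(size=(5, 4)) + 0.3 * rng.normal(size=(500, 4))  # correlated view 2
    Zx = random_fourier_features(X, D=400, seed=1)
    Zy = random_fourier_features(Y, D=400, seed=2)
    # Data-dependent resampling: keep 100 of the 400 features per view.
    Zx_s = resample_features(Zx, 100, ridge_leverage_scores(Zx), seed=3)
    Zy_s = resample_features(Zy, 100, ridge_leverage_scores(Zy), seed=4)
    print("top canonical correlations:", cca_top_correlations(Zx_s, Zy_s, k=2))
```

The 1/sqrt(m p_j) reweighting in resample_features is the standard importance-sampling correction that keeps the approximated Gram matrix unbiased in expectation; the paper's actual CCA-specific scoring rule would take the place of ridge_leverage_scores in this pipeline.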
