Sparse GCA and Thresholded Gradient Descent

Jul-1-2021–arXiv.org Machine Learning

With the advent of big data acquisition technology, it has become increasingly important to integrate information across multiple datasets collected on a common set of subjects. Canonical correlation analysis (CCA), first proposed by Hotelling [20], is a widely used statistical tool for integrating information from two datasets: it seeks a linear combination of the variables within each dataset such that the correlation between the two combinations is maximized. However, recent advances in fields such as multi-omics and multimodal brain imaging present new challenges, since scientists can now often collect more than two datasets on the same set of subjects. To tackle these challenges, we turn to a useful generalization of CCA called generalized correlation analysis (GCA) [23], which aims to explore linear relationships across multiple data sources. Kettenring [23] proposed five techniques for generalized correlation analysis of multiple datasets; each corresponds to maximizing a different objective function of covariances and correlations, subject to certain normalization constraints.
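To make the two-dataset case concrete, the following is a minimal sketch of classical (dense, non-sparse) CCA for the leading pair of directions, computed by whitening each dataset and taking an SVD of the whitened cross-covariance. It is not the thresholded gradient descent method of the paper. The function name `first_canonical_pair` and the small ridge term `reg` are assumptions introduced here for illustration and numerical stability.

```python
import numpy as np


def first_canonical_pair(X, Y, reg=1e-8):
    """Return (a, b, rho): leading canonical directions and their correlation.

    X is n-by-p and Y is n-by-q, with rows indexed by the same n subjects.
    `reg` is a hypothetical ridge term added so the Cholesky factors exist.
    """
    # Center each dataset so that cross-products are covariances.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Cholesky factors: Sxx = Lx Lx^T and Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T

    # The top singular triple of M gives the maximal correlation.
    U, s, Vt = np.linalg.svd(M)

    # Map back to the original coordinates: a = Lx^{-T} u, b = Ly^{-T} v,
    # so that a^T Sxx a = 1 and b^T Syy b = 1.
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0])
    return a, b, s[0]


# Usage on synthetic data that shares one latent factor (illustrative only).
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 1))
X_demo = z @ rng.standard_normal((1, 3)) + 0.5 * rng.standard_normal((500, 3))
Y_demo = z @ rng.standard_normal((1, 4)) + 0.5 * rng.standard_normal((500, 4))
a_demo, b_demo, rho_demo = first_canonical_pair(X_demo, Y_demo)
```

The returned `rho` is the sample correlation between `X @ a` and `Y @ b`, up to the effect of the ridge term. With more than two datasets there is no single correlation to maximize, which is why GCA admits several different objective functions.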

Country:
- North America > United States
  - Pennsylvania (0.04)
  - California > Santa Clara County
    - Palo Alto (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)

Genre:
- Workflow (0.46)
- Research Report (0.40)

Industry:
- Health & Medicine (0.74)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (0.93)
  - Machine Learning > Statistical Learning
    - Gradient Descent (0.41)