Graph Canonical Correlation Analysis

Park, Hongju, Bai, Shuyang, Ye, Zhenyao, Lee, Hwiyoung, Ma, Tianzhou, Chen, Shuo

arXiv.org Machine Learning 

CCA considers the following maximization problem:
\[
\max_{a,b}\; a^\top X^\top Y b \quad \text{subject to} \quad a^\top X^\top X a = 1 \ \text{and}\ b^\top Y^\top Y b = 1,
\]
where the vectors $a$ and $b$ and the attained correlation are called the canonical vectors and the canonical correlation, respectively, if they attain the above maximum. In classical canonical correlation analysis, the canonical vectors $a$ and $b$ include nonzero loadings for all $X$ and $Y$ variables. However, in a high-dimensional setting with $p, q \gg n$, the goal is to identify which subsets of $X$ are associated with which subsets of $Y$ and to estimate the strength of those associations, because the canonical correlation computed from the full dataset is inflated by estimation bias caused by overfitting. To ensure sparsity, shrinkage methods are commonly used. For example, Witten et al. (2009) propose sparse canonical correlation analysis (sCCA). The sCCA criterion can in general be expressed as
\[
\max_{a,b}\; a^\top X^\top Y b \quad \text{subject to} \quad a^\top X^\top X a \leq 1,\; b^\top Y^\top Y b \leq 1,\; P_1(a) \leq k_1,\; P_2(b) \leq k_2,
\]
where $P_1$ and $P_2$ are convex penalty functions imposed on $a$ and $b$ with positive constants $k_1$ and $k_2$, respectively. A representative choice is the $\ell_1$ penalty, with $P_1(a) = \|a\|_1$ and $P_2(b) = \|b\|_1$. sCCA forces some loadings of the canonical vectors to zero and thus selects only subsets of correlated $X$ and $Y$ variables. However, sCCA methods may neither fully recover the correlated $X$ and $Y$ pairs nor capture the multivariate-to-multivariate linkage patterns (see Figure 3), because the $\ell_1$ shrinkage tends to select only a small subset of the associated $X$ and $Y$ variables.
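To make the sCCA criterion concrete, the following is a minimal Python sketch of a sparse CCA fit via alternating soft-thresholded power iterations, in the spirit of Witten et al. (2009). It is illustrative only: it assumes column-standardized data, uses fixed soft-thresholding levels `lam_a` and `lam_b` in place of the binary search that meets the exact $\ell_1$ budgets $k_1$, $k_2$, and the function names and simulated data are hypothetical rather than the authors' implementation.

```python
# Illustrative sparse CCA sketch (not the paper's implementation).
import numpy as np


def soft_threshold(v, lam):
    """Elementwise soft-thresholding operator S(v, lam)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def sparse_cca(X, Y, lam_a=0.3, lam_b=0.3, n_iter=100):
    """Alternately update canonical vectors a, b with l1 shrinkage.

    X: (n, p), Y: (n, q), assumed column-centered and scaled.
    Returns unit-norm sparse canonical vectors a, b and the resulting
    sample canonical correlation.
    """
    n = X.shape[0]
    C = X.T @ Y / n                    # p x q sample cross-covariance
    # Initialize b from the leading right singular vector of C.
    b = np.linalg.svd(C, full_matrices=False)[2][0]
    a = np.zeros(X.shape[1])
    for _ in range(n_iter):
        a = soft_threshold(C @ b, lam_a)
        if np.linalg.norm(a) > 0:
            a /= np.linalg.norm(a)
        b = soft_threshold(C.T @ a, lam_b)
        if np.linalg.norm(b) > 0:
            b /= np.linalg.norm(b)
    rho = np.corrcoef(X @ a, Y @ b)[0, 1]
    return a, b, rho


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, q = 100, 50, 40
    # Latent signal shared by the first 5 columns of X and of Y.
    z = rng.standard_normal(n)
    X = rng.standard_normal((n, p))
    Y = rng.standard_normal((n, q))
    X[:, :5] += z[:, None]
    Y[:, :5] += z[:, None]
    X = (X - X.mean(0)) / X.std(0)
    Y = (Y - Y.mean(0)) / Y.std(0)
    a, b, rho = sparse_cca(X, Y)
    print("nonzero loadings in a:", np.flatnonzero(a))
    print("nonzero loadings in b:", np.flatnonzero(b))
    print("sample canonical correlation:", round(rho, 3))
```

In this toy simulation the $\ell_1$ shrinkage recovers sparse loadings concentrated on the correlated columns, which also illustrates the limitation noted above: when many $X$ and $Y$ variables share the association, the thresholding may retain only a subset of them rather than the full multivariate-to-multivariate linkage.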