On Bayesian sparse canonical correlation analysis via Rayleigh quotient framework
Canonical correlation analysis is a statistical technique -dating back at least to [1] - that is used to maximally correlate multiple datasets for joint analysis. The technique has become a fundamental tool in biomedical research where technological advances have led to a huge number of multi-omic datasets ([2]; [3]; [4]). Over the past two decades, limited sample sizes, growing dimensionality, and the search for meaningful biological interpretations, have led to the development of sparse canonical correlation analysis ([2]), where a sparsity assumption is imposed on the canonical correlation vectors. This work falls under the topic of the Bayesian estimation of sparse canonical corrlation vectors. Model-based approaches to canonical correlation analysis were developed in the mid 2000's (see e.g., [5]), and paved the way for a Bayesian treatment of canonical correlation analysis ([6];[7]) and sparse canonical correlation analysis ([8]). However an serious shortcoming of such a Bayesian treatment is that this approach naturally requires a complete specification of the joint distribution of the data, so as to specify the likelihood function. This requirement is a serious limitation in many applications, where the data generating process is poorly understood, for example, image data.
Oct-16-2020