Cooperative learning for multi-view analysis
Daisy Yi Ding, Balasubramanian Narasimhan, Robert Tibshirani
New technologies in biomedicine allow us to generate and collect data of various modalities, including genomics, epigenomics, transcriptomics, and proteomics (Figure 1A). Integrating heterogeneous features measured on a single set of observations provides a unique opportunity to gain a comprehensive understanding of an outcome of interest. It offers the potential to make discoveries that remain hidden when a single modality is analyzed alone, and to predict the outcome more accurately (Kristensen et al. 2014, Ritchie et al. 2015, Gligorijević et al. 2016, Karczewski & Snyder 2018, Ma et al. 2020). While "multi-view data analysis" can mean different things, we use it here in the context of supervised learning, where the goal is to fuse different data views to model an outcome of interest. To give a concrete example, assume that a researcher wants to predict cancer outcomes from RNA expression and DNA methylation measurements for a set of patients. The researcher suspects that (1) both data views may have prognostic value, and (2) the two views share some underlying relationship with each other, since DNA methylation regulates gene expression and can repress the expression of tumor suppressor genes or promote the expression of oncogenes. Should the researcher use both data views for downstream prediction, or only one of them?
Jan-6-2022
- Country:
- North America > United States (0.46)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.92)