Prediction approaches for partly missing multi-omics covariate data: A literature review and an empirical comparison study

Hornung, Roman, Ludwigs, Frederik, Hagenberg, Jonas, Boulesteix, Anne-Laure

arXiv.org Artificial Intelligence 

The generation of various types of omics data is becoming increasingly rapid and cost-effective. As a consequence, so-called multi-omics data are becoming more widely available, that is, high-dimensional molecular data of several types, such as genomic, transcriptomic, or proteomic data, measured for the same patients. In the last few years, several approaches that use these data for patient outcome prediction have been developed (see Hornung and Wright (2019) for an extensive literature review). Nevertheless, doubts have recently emerged as to whether multi-omics data offer any benefit over simple clinical models (Herrmann et al., 2020). Regardless of their usefulness for prediction, multi-omics data from different sources that are used for the same prediction problem often do not, for various reasons, feature exactly the same types of data. Most importantly, the data for which predictions should be obtained, that is, the test data, often do not contain the same data types as the data available for obtaining the prediction rule, that is, the training data (Krautenbacher et al., 2019). The training data are also frequently composed of subsets originating from different sources (e.g.
