Cross-Cluster Weighted Forests

Ramchandran, Maya, Mukherjee, Rajarshi, Parmigiani, Giovanni

arXiv.org Machine Learning 

Datasets containing natural clusters or batch effects are common in most biological applications, which calls for prediction algorithms that can adapt to heterogeneity in the distribution of the features. Numerous learning algorithms have been developed for the setting in which the covariate-outcome relationship varies across clusters, including mixed-effects regression, sequential ensembling approaches, the mixture of experts framework, and dynamic co-clustering learning algorithms [1] [2] [3]. This setting is analogous to the multi-study framework formalized by Patil and Parmigiani (2018), in which separate clusters can be thought of as individual studies [4]. Multi-study learning handles multiple training studies that measure the same outcome and many of the same covariates by training one learner on each study and combining these learners into an ensemble that forms the final predictor. Several learning algorithms have been shown to be highly effective in this scheme, including regularized regression, neural networks, and Random Forest.
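The ensembling idea described above (one learner per study or cluster, combined into a single predictor) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `fit_line`, `fit_ensemble`, and `predict` are hypothetical, the toy data are made up, a one-covariate least-squares line stands in for the Random Forest learners the abstract mentions, and equal weights are used as the simplest possible combination rule. It is not the authors' implementation or weighting scheme.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x on a single study; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def fit_ensemble(studies):
    """Train one learner per study (or cluster), each on that study's data only."""
    return [fit_line(xs, ys) for xs, ys in studies]


def predict(models, x, weights=None):
    """Weighted average of the per-study predictions (equal weights by default)."""
    if weights is None:
        weights = [1.0 / len(models)] * len(models)
    return sum(w * (a + b * x) for w, (a, b) in zip(weights, models))


# Made-up toy data: two studies whose covariate-outcome relationships differ.
studies = [
    ([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]),   # generated from y = 1 + 2x
    ([0.0, 1.0, 2.0, 3.0], [0.0, 4.0, 8.0, 12.0]),  # generated from y = 4x
]

models = fit_ensemble(studies)
prediction = predict(models, 2.0)  # average of the two single-study predictions
```

Passing an explicit `weights` list to `predict` lets some studies count more than others; how such weights should be chosen is precisely what a cross-study weighting scheme has to decide, and this sketch leaves that open.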
