Cross-Cluster Weighted Forests
Ramchandran, Maya; Mukherjee, Rajarshi; Parmigiani, Giovanni
Datasets containing natural clusters or batch effects are ubiquitous in biological applications, creating a need for prediction algorithms that can handle heterogeneity in the distribution of the features. Numerous learning algorithms have been developed for the setting in which the covariate-outcome relationship varies across clusters, including mixed-effects regression, sequential ensembling approaches, the mixture of experts framework, and dynamic co-clustering learning algorithms [1] [2] [3]. This setting is analogous to the multi-study framework formalized by Patil and Parmigiani (2018), in which separate clusters can be thought of as individual studies [4]. Multi-study learning handles multiple training studies that measure the same outcome and many of the same covariates by training one learner on each study and combining these learners into an ensemble that forms the final predictor. Several learning algorithms have been shown to be highly effective in this scheme, including regularized regression, neural networks, and Random Forest.
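The ensembling scheme described above, one learner per study (or cluster) combined into a single predictor, can be illustrated with a minimal sketch. This is not the authors' method. It assumes a one-covariate least-squares line as a stand-in for each single-study learner and, unless weights are supplied, equal weights across learners. The names `fit_line`, `fit_multi_study`, and `predict_ensemble` and the toy data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit with one covariate; returns (intercept, slope).

    Stand-in for any single-study learner (assumption, not from the source).
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def fit_multi_study(studies):
    """Train one learner per study (or cluster), each on its own data only."""
    return [fit_line(xs, ys) for xs, ys in studies]


def predict_ensemble(models, x, weights=None):
    """Combine the per-study predictions into a weighted sum.

    Equal weights are an illustrative default; the source does not
    specify a weighting scheme in this section.
    """
    if weights is None:
        weights = [1.0 / len(models)] * len(models)
    return sum(w * (a + b * x) for w, (a, b) in zip(weights, models))


# Toy data: two "studies" whose covariate-outcome relationships differ.
studies = [
    ([0.0, 1.0, 2.0], [0.0, 2.0, 4.0]),  # generated from y = 2x
    ([0.0, 1.0, 2.0], [1.0, 5.0, 9.0]),  # generated from y = 1 + 4x
]
models = fit_multi_study(studies)
prediction = predict_ensemble(models, 3.0)
```

Passing a `weights` list to `predict_ensemble` changes how much each study-specific learner contributes. Any other learner could replace `fit_line` without changing the overall structure.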
May-17-2021