Cross-Cluster Weighted Forests

Ramchandran, Maya, Mukherjee, Rajarshi, Parmigiani, Giovanni

May 17, 2021 – arXiv.org Machine Learning

Datasets containing natural clusters or batch effects are common in biological applications, which calls for prediction algorithms that can handle heterogeneity in the distribution of the features. Numerous learning algorithms address the setting in which the covariate-outcome relationship varies across clusters, including mixed-effects regression, sequential ensembling approaches, the mixture of experts framework, and dynamic co-clustering learning algorithms [1] [2] [3]. This setting is analogous to the multi-study framework formalized by Patil and Parmigiani (2018), in which separate clusters can be thought of as individual studies [4]. Multi-study learning handles multiple training studies that measure the same outcome and many of the same covariates by training one learner per study and combining these learners into an ensemble that serves as the final predictor. Several learning algorithms have been shown to be highly effective in this scheme, including regularized regression, neural networks, and Random Forest.
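
The multi-study scheme described in the abstract, one learner per study combined into a single predictor, can be illustrated with a minimal sketch. This is not the authors' method: it assumes a one-feature ridge regression as a stand-in for the learners named above, and it assumes equal ensemble weights by default, since the abstract does not say how learners are weighted. The names `fit_ridge_1d` and `fit_multi_study_ensemble` are hypothetical.

```python
from statistics import fmean


def fit_ridge_1d(xs, ys, penalty=1.0):
    """Fit y ~ intercept + slope * x with an L2 penalty on the slope.

    Stand-in single-study learner (assumption); any regressor could be used.
    """
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + penalty)
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def fit_multi_study_ensemble(studies, weights=None, penalty=1.0):
    """Train one learner per study and return a weighted-average predictor.

    `studies` is a list of (xs, ys) pairs, one pair per study or cluster.
    Equal weights are an assumption made for illustration only.
    """
    learners = [fit_ridge_1d(xs, ys, penalty) for xs, ys in studies]
    if weights is None:
        weights = [1.0 / len(learners)] * len(learners)
    if len(weights) != len(learners):
        raise ValueError("need exactly one weight per study")

    def predict(x):
        return sum(w * f(x) for w, f in zip(weights, learners))

    return predict


# Two toy studies whose covariate-outcome relationships differ.
studies = [
    ([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0]),  # y = 2x
    ([0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0]),  # y = x + 1
]
predict = fit_multi_study_ensemble(studies, penalty=0.0)
print(predict(2.0))
```

Passing `weights=[1.0, 0.0]` would reduce the ensemble to the first study's learner, which shows how the choice of weights controls how much each study contributes to the final prediction.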
