unsupervised random forest
Reviews: When do random forests fail?
This is a fairly serious omission, and casual readers would remember the wrong conclusions. This must be fixed for publication, but I think it would be straightforward to fix.

Officially, NIPS reviewers are not required to look at the supplementary material. With only three weeks to review six manuscripts, I was not able to make the time to do so during my reviewing. So I worry that publishing this work would mean publishing results without sufficient peer review.

DETAILED COMMENTS

* p. 1: I'm not sure it is accurate to say that deep, unsupervised trees grown with no subsampling are a common setup for learning random forests. This setup appears in Geurts et al. (2006) as a special case, sometimes in mass estimation [1, 2], and sometimes in Wei Fan's random decision tree papers [3-6]. I don't think these are used very much.
Federated unsupervised random forest for privacy-preserving patient stratification
Pfeifer, Bastian, Sirocchi, Christel, Bloice, Marcus D., Kreuzthaler, Markus, Urschler, Martin
In the realm of precision medicine, effective patient stratification and disease subtyping demand innovative methodologies tailored for multi-omics data. Clustering techniques applied to multi-omics data have become instrumental in identifying distinct subgroups of patients, enabling a finer-grained understanding of disease variability. This work establishes a powerful framework for advancing precision medicine through unsupervised random-forest-based clustering and federated computing. We introduce a novel multi-omics clustering approach utilizing unsupervised random forests. The unsupervised nature of the random forest enables the determination of cluster-specific feature importance, unraveling key molecular contributors to distinct patient groups. Moreover, our methodology is designed for federated execution, a crucial aspect in the medical domain where privacy concerns are paramount. We have validated our approach on machine learning benchmark data sets as well as on cancer data from The Cancer Genome Atlas (TCGA). Our method is competitive with the state of the art in terms of disease subtyping, while substantially improving cluster interpretability. Experiments indicate that local clustering performance can be improved through federated computing.
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
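
As a rough illustration of how a forest can be used for clustering without labels, the sketch below grows totally random trees (random feature, random threshold, no subsampling) and scores each pair of samples by the fraction of trees in which they share a leaf. This is a generic technique and not the method of Pfeifer et al., whose splitting rule and federated protocol are not described in the abstract above. The function names (`random_partition`, `forest_affinity`, `mean_affinity`), the depth limit, and the toy data are all assumptions made for the example.

```python
import random


def random_partition(rows, idx, rng, max_depth):
    """Split sample indices recursively using random features and thresholds.

    Returns the leaves of one totally random tree as lists of sample indices.
    No labels are used, and every tree sees all samples (no subsampling).
    """
    if max_depth == 0 or len(idx) < 2:
        return [idx]
    feature = rng.randrange(len(rows[0]))
    values = [rows[i][feature] for i in idx]
    lo, hi = min(values), max(values)
    if lo == hi:  # feature is constant in this node: stop splitting
        return [idx]
    threshold = rng.uniform(lo, hi)
    left = [i for i in idx if rows[i][feature] < threshold]
    right = [i for i in idx if rows[i][feature] >= threshold]
    if not left or not right:  # degenerate split: keep the node as a leaf
        return [idx]
    return (random_partition(rows, left, rng, max_depth - 1)
            + random_partition(rows, right, rng, max_depth - 1))


def forest_affinity(rows, n_trees=200, max_depth=4, seed=0):
    """Fraction of trees in which each pair of samples lands in the same leaf."""
    rng = random.Random(seed)
    n = len(rows)
    affinity = [[0.0] * n for _ in range(n)]
    for _ in range(n_trees):
        for leaf in random_partition(rows, list(range(n)), rng, max_depth):
            for i in leaf:
                for j in leaf:
                    affinity[i][j] += 1.0 / n_trees
    return affinity


def mean_affinity(affinity, group_a, group_b):
    """Average affinity over all pairs (i, j), i in group_a, j in group_b, i != j."""
    pairs = [(i, j) for i in group_a for j in group_b if i != j]
    return sum(affinity[i][j] for i, j in pairs) / len(pairs)


# Hypothetical toy data: two well-separated groups of ten 2-D points each.
data_rng = random.Random(1)
group_a = list(range(10))
group_b = list(range(10, 20))
rows = ([[data_rng.uniform(0, 1), data_rng.uniform(0, 1)] for _ in group_a]
        + [[data_rng.uniform(9, 10), data_rng.uniform(9, 10)] for _ in group_b])

affinity = forest_affinity(rows)
within = mean_affinity(affinity, group_a, group_a)
between = mean_affinity(affinity, group_a, group_b)
print("mean within-group affinity:", within)
print("mean between-group affinity:", between)
```

The resulting affinity matrix could then be handed to any affinity-based clustering routine, for example hierarchical or spectral clustering. The depth limit keeps leaves large enough for the affinities to be informative; fully grown trees would leave almost every sample alone in its leaf.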