Scalable prediction of acute myeloid leukemia using high-dimensional machine learning and blood transcriptomics

Acute myeloid leukemia (AML) is a severe, mostly fatal hematopoietic malignancy. We asked whether transcriptome-based machine learning could predict AML status without requiring expert input. Using 12,029 samples from 105 different studies, we present a large-scale study of machine learning-based prediction of AML that addresses key questions about combining machine learning with transcriptomics and about the practical use of this combination. We find that data-driven, high-dimensional approaches – in which multivariate signatures are learned directly from genome-wide data with no prior knowledge – are accurate and robust. Importantly, these approaches are highly scalable with low marginal cost, essentially matching human expert annotation in a near-automated workflow.
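To make the idea of learning a multivariate signature directly from genome-wide data more concrete, the following is a minimal sketch and not the study's actual pipeline. It assumes an L1-penalised logistic regression as a stand-in for a high-dimensional classifier (the abstract does not name the algorithm used) and runs on purely synthetic "expression" data in which only a handful of features carry signal. All names, sizes, and hyperparameters (`make_synthetic`, `fit_l1_logistic`, `lam`, `lr`, the 120 simulated genes) are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic(n_samples, n_genes, n_informative, shift, rng):
    """Simulated data: only the first n_informative 'genes' differ between classes."""
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)  # 1 = hypothetical case, 0 = hypothetical control
        sign = 1.0 if label == 1 else -1.0
        row = [rng.gauss(0.0, 1.0) + (sign * shift if j < n_informative else 0.0)
               for j in range(n_genes)]
        X.append(row)
        y.append(label)
    return X, y


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, lr=0.1, n_iter=150):
    """L1-penalised logistic regression fitted by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals (predicted probability minus label) under the current model.
        resid = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
                 for row, yi in zip(X, y)]
        # Gradient of the mean log-loss with respect to weights and intercept.
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        grad_b = sum(resid) / n
        # Gradient step followed by soft-thresholding, which zeroes weak features.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def accuracy(w, b, X, y):
    preds = [1 if b + sum(wj * xj for wj, xj in zip(w, row)) > 0 else 0 for row in X]
    return sum(int(p == t) for p, t in zip(preds, y)) / len(y)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
N_GENES, N_INFORMATIVE = 120, 5  # more features than training samples
X_train, y_train = make_synthetic(80, N_GENES, N_INFORMATIVE, 1.5, rng)
X_test, y_test = make_synthetic(40, N_GENES, N_INFORMATIVE, 1.5, rng)

w, b = fit_l1_logistic(X_train, y_train)
selected = [j for j, wj in enumerate(w) if wj != 0.0]  # the learned "signature"
test_acc = accuracy(w, b, X_test, y_test)
print(f"held-out accuracy: {test_acc:.2f}; genes in signature: {len(selected)} of {N_GENES}")
```

The point of the sketch is that no feature is pre-selected by an expert: the penalty alone decides which features enter the signature. Real transcriptomic data would additionally call for normalisation, handling of study and platform effects, and evaluation across independent studies, none of which is modelled here.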