To assess whether semisupervised natural language processing (NLP) of text clinical radiology reports could provide useful automated diagnosis categorization for ground truth labeling, overcoming manual labeling bottlenecks in the machine learning pipeline. In this retrospective study, 1503 text cardiac MRI reports (from 2016 to 2019) were manually annotated by clinicians for five diagnoses: normal, dilated cardiomyopathy (DCM), hypertrophic cardiomyopathy (HCM), myocardial infarction (MI), and myocarditis.