Cross-Modal Data Programming Enables Rapid Medical Machine Learning

Jared Dunnmon, Alexander Ratner, Nishith Khandwala, Khaled Saab, Matthew Markert, Hersh Sagreiya, Roger Goldman, Christopher Lee-Messer, Matthew Lungren, Daniel Rubin, Christopher Ré

arXiv.org (Machine Learning)

Department of Biomedical Data Science, Stanford University, Stanford, California, USA

Labeling training datasets has become a key barrier to building medical machine learning models. One strategy is to generate training labels programmatically, for example by applying natural language processing pipelines to text reports associated with imaging studies. We propose cross-modal data programming, which generalizes this intuitive strategy in a theoretically grounded way that enables simpler, clinician-driven input, reduces required labeling time, and improves with additional unlabeled data. In this approach, clinicians generate training labels for models defined over a target modality (e.g. the image or time series) by writing rules over an auxiliary modality (e.g. the associated text report). The resulting technical challenge consists of estimating the accuracies and correlations of these rules; we extend a recent unsupervised generative modeling technique to handle this cross-modal setting in a provably consistent way. Across four applications in radiography, computed tomography, and electroencephalography, and using only several hours of clinician time, our approach matches or exceeds the efficacy of physician-months of hand-labeling with statistical significance, demonstrating a fundamentally faster and more flexible way of building machine learning models in medicine.

In addition to being extremely costly, hand-labeled training sets are inflexible: given a new classification schema, imaging system, patient population, or other change in the data distribution or modeling task, the training set generally needs to be relabeled from scratch. One response in the broader machine learning community is the increasing use of weak supervision approaches, in which training data is labeled in noisier, higher-level, often programmatic ways rather than manually by experts. We broadly characterize these methods as cross-modal weak supervision approaches, in which the strategy is to programmatically extract labels from an auxiliary modality (e.g. the unstructured text reports accompanying an imaging study) that are then used as training labels for a model defined over the target modality (e.g. the images themselves). These methods follow the intuition that programmatically extracting labels from the auxiliary modality can be far faster and easier than hand-labeling, or deriving labels from, the target modality directly.
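To make the workflow concrete, below is a minimal sketch of the cross-modal pattern using the open-source Snorkel library, which implements the data programming framework this paper builds on. The dataframe column name (report_text), the keyword rules, and the example reports are hypothetical illustrations, not taken from the paper: clinician-authored labeling functions vote on the auxiliary modality (report text), a generative label model estimates their accuracies and denoises their votes into probabilistic labels, and those labels would then supervise a model over the target modality (the images themselves).

import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier
from snorkel.labeling.model import LabelModel

# Label schema for a hypothetical two-class task; -1 means a rule abstains.
ABSTAIN, NORMAL, ABNORMAL = -1, 0, 1

# Clinician-authored rules over the auxiliary modality (report text).
# The keyword patterns are illustrative only.
@labeling_function()
def lf_fracture_mentioned(x):
    return ABNORMAL if "fracture" in x.report_text.lower() else ABSTAIN

@labeling_function()
def lf_explicitly_normal(x):
    text = x.report_text.lower()
    return NORMAL if "no acute" in text or "unremarkable" in text else ABSTAIN

@labeling_function()
def lf_negated_fracture(x):
    return NORMAL if "no fracture" in x.report_text.lower() else ABSTAIN

reports = pd.DataFrame({"report_text": [
    "Transverse fracture of the distal radius.",
    "No fracture or dislocation. Unremarkable study.",
    "Lungs clear. No acute cardiopulmonary abnormality.",
]})

# Apply every rule to every report, yielding an (n_reports x n_rules) vote matrix.
lfs = [lf_fracture_mentioned, lf_explicitly_normal, lf_negated_fracture]
L_train = PandasLFApplier(lfs=lfs).apply(df=reports)

# Generative label model: estimates rule accuracies and correlations without
# ground truth and combines the noisy votes into probabilistic training labels.
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train, n_epochs=500, seed=123)
probabilistic_labels = label_model.predict_proba(L=L_train)

# These labels would supervise a discriminative model over the *target*
# modality, e.g. a CNN trained on the images paired with each report.
print(probabilistic_labels)

The key cross-modal point is that the rules never touch the images: they operate only on the cheap-to-query text, and in practice would be applied over thousands of unlabeled report-image pairs, with only the resulting probabilistic labels flowing to the image model.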
