Prediction Focused Topic Models for Electronic Health Records
Jason Ren, Russell Kunes, Finale Doshi-Velez
Electronic Health Record (EHR) data can be represented as discrete counts over a high-dimensional set of possible procedures, diagnoses, and medications. Supervised topic models present an attractive option for incorporating EHR data as features into a prediction problem: given a patient's record, we estimate a set of latent factors that are predictive of the response variable. However, existing methods for supervised topic modeling struggle to balance prediction quality and coherence of the latent factors. We introduce a novel approach, the prediction-focused topic model, that uses the supervisory signal to retain only features that improve, or do not hinder, prediction performance. By removing features with irrelevant signal, the topic model is able to learn task-relevant, interpretable topics. We demonstrate on an EHR dataset and a movie review dataset that, compared to existing approaches, prediction-focused topic models are able to learn much more coherent topics while maintaining competitive predictions.
Nov-15-2019
- Genre:
- Research Report (0.41)
- Industry:
- Health & Medicine
- Therapeutic Area > Neurology (1.00)
- Health Care Technology > Medical Record (1.00)