TimEHR: Image-based Time Series Generation for Electronic Health Records
Hojjat Karami, Mary-Anne Hartley, David Atienza, Anisoara Ionescu
Electronic health records (EHRs) chart patients' interactions with the health system and contain critical information for improving services and supporting research. Data from these systems are routinely incorporated into machine learning and statistical models for clinical decision support, including diagnostic and prognostic prediction, health monitoring, and evaluation of treatment response [1]. However, access to large-scale EHR datasets is difficult and is governed by strict privacy and security regulations (e.g., HIPAA and GDPR), so many models are built on single-center data and carry a high risk of poor generalizability [2]. Traditional anonymization approaches can be complex and costly, and they often compromise the data's statistical integrity while failing to provide robust privacy guarantees [3, 4]. Synthetic data is thus emerging as a promising way to balance privacy against statistical utility [5, 6]. Generative models, particularly Generative Adversarial Networks (GANs) [7], have shown great potential in producing distribution-preserving synthetic EHR data.