 cumulative incidence


New-Onset Diabetes Assessment Using Artificial Intelligence-Enhanced Electrocardiography

Jethani, Neil, Puli, Aahlad, Zhang, Hao, Garber, Leonid, Jankelson, Lior, Aphinyanaphongs, Yindalon, Ranganath, Rajesh

arXiv.org Artificial Intelligence

Undiagnosed diabetes is present in 21.4% of adults with diabetes. Diabetes can remain asymptomatic and undetected due to limitations in screening rates. To address this issue, questionnaires, such as the American Diabetes Association (ADA) Risk test, have been recommended for use by physicians and the public. Based on evidence that blood glucose concentration can affect cardiac electrophysiology, we hypothesized that an artificial intelligence (AI)-enhanced electrocardiogram (ECG) could identify adults with new-onset diabetes. We trained a neural network to estimate HbA1c using a 12-lead ECG and readily available demographics. We retrospectively assembled a dataset comprising patients with paired ECG and HbA1c data. The population of patients who receive both an ECG and HbA1c may be a biased sample of the complete outpatient population, so we adjusted the importance placed on each patient to generate a more representative pseudo-population. We found that ECG-based assessment outperforms the ADA Risk test, achieving a higher area under the curve (0.80 vs. 0.68) and a higher positive predictive value (13% vs. 9%); the ECG-based positive predictive value is 2.6 times the prevalence of diabetes in the cohort. The AI-enhanced ECG significantly outperforms electrophysiologist interpretation of the ECG, suggesting that the task is beyond current clinical capabilities. Given the prevalence of ECGs in clinics and via wearable devices, such a tool would make precise, automated diabetes assessment widely accessible.
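The reweighting idea in this abstract can be illustrated with a small sketch. It assumes each patient carries a non-negative importance weight (for example an inverse probability of being included in the paired ECG/HbA1c sample); the abstract does not say how the weights were computed, so the function name `weighted_ppv` and the toy numbers below are purely hypothetical. The last lines only restate the abstract's own arithmetic: a positive predictive value of 13% that is 2.6 times the prevalence implies a prevalence of about 5%.

```python
def weighted_ppv(y_true, y_pred, weights):
    """Positive predictive value in a reweighted pseudo-population.

    y_true, y_pred: 0/1 labels and 0/1 predictions per patient.
    weights: non-negative importance weight per patient (assumed given).
    """
    # Total weight of patients flagged positive by the model.
    flagged = sum(w for p, w in zip(y_pred, weights) if p == 1)
    # Total weight of flagged patients who truly have the condition.
    true_pos = sum(
        w for y, p, w in zip(y_true, y_pred, weights) if p == 1 and y == 1
    )
    return true_pos / flagged if flagged > 0 else float("nan")


# Toy data (hypothetical): four patients, the first one is up-weighted.
labels = [1, 0, 1, 0]
preds = [1, 1, 0, 1]
ppv_unweighted = weighted_ppv(labels, preds, [1, 1, 1, 1])
ppv_weighted = weighted_ppv(labels, preds, [2, 1, 1, 1])

# Arithmetic taken from the abstract: PPV 13% is 2.6 times the prevalence.
implied_prevalence = 0.13 / 2.6
```

With uniform weights the function reduces to the ordinary positive predictive value, so the weighted and unweighted figures can be compared directly to see how much the sample selection matters.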


Adaptive Sequential Surveillance with Network and Temporal Dependence

Malenica, Ivana, Coyle, Jeremy R., van der Laan, Mark J., Petersen, Maya L.

arXiv.org Machine Learning

Strategic test allocation plays a major role in the control of both emerging and existing pandemics (e.g., COVID-19, HIV). Widespread testing supports effective epidemic control by (1) reducing transmission via identifying cases, and (2) tracking outbreak dynamics to inform targeted interventions. However, infectious disease surveillance presents unique statistical challenges. For instance, the true outcome of interest, one's positive infectious status, is often a latent variable. In addition, the presence of both network and temporal dependence reduces the data to a single observation. As testing entire populations regularly is neither efficient nor feasible, standard approaches to testing recommend simple rule-based testing strategies (e.g., symptom-based testing, contact tracing), without taking into account individual risk. In this work, we study an adaptive sequential design involving n individuals over a period of τ time-steps, which allows for unspecified dependence among individuals and across time. Our causal target parameter is the mean latent outcome we would have obtained after one time-step, if, starting at time t given the observed past, we had carried out a stochastic intervention that maximizes the outcome under a resource constraint. We propose an Online Super Learner for adaptive sequential surveillance that learns the optimal choice of testing strategies over time while adapting to the current state of the outbreak. Relying on a series of working models, the proposed method learns across samples, through time, or both, based on the underlying (unknown) structure in the data. We present an identification result for the latent outcome in terms of the observed data, and demonstrate the superior performance of the proposed strategy in a simulation modeling a residential university environment during the COVID-19 pandemic.
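To make the resource constraint concrete, here is a deliberately simplified sketch of risk-ranked test allocation under a fixed budget. It is not the Online Super Learner from the abstract, and it is deterministic rather than a stochastic intervention: it assumes some model has already produced a per-person risk estimate, and it simply tests the `budget` highest-risk individuals. The function name `allocate_tests` and the example numbers are hypothetical.

```python
def allocate_tests(risk, budget):
    """Pick which individuals to test when only `budget` tests are available.

    risk: estimated infection risk per individual (assumed already computed).
    budget: maximum number of tests at this time-step.
    Returns the sorted indices of the individuals selected for testing.
    """
    # Rank individuals from highest to lowest estimated risk.
    ranked = sorted(range(len(risk)), key=lambda i: risk[i], reverse=True)
    # Spend the budget on the top of the ranking.
    return sorted(ranked[:budget])


# Hypothetical risk estimates for four individuals, two tests available.
selected = allocate_tests([0.1, 0.7, 0.3, 0.9], budget=2)
```

In an adaptive design this step would be repeated at every time-step, with the risk estimates refreshed from the test results observed so far.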


Differentially Private Survival Function Estimation

Gondara, Lovedeep, Wang, Ke

arXiv.org Machine Learning

Survival function estimation is used in many disciplines, but it is most common in medical analytics in the form of the Kaplan-Meier estimator. Sensitive data (patient records) is used in the estimation without any explicit control on the information leakage, which is a significant privacy concern. We propose a first differentially private estimator of the survival function and show that it can be easily extended to provide differentially private confidence intervals and test statistics without spending any extra privacy budget. We further provide extensions for differentially private estimation of the competing risk cumulative incidence function. Using nine real-life clinical datasets, we provide empirical evidence that our proposed method provides good utility while simultaneously providing strong privacy guarantees.
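For readers unfamiliar with the pieces involved, the sketch below computes a Kaplan-Meier curve from per-bin event and censoring counts, and shows one generic way to privatize it: add Laplace noise to the counts and treat the curve as post-processing. This is an illustration only; the abstract does not describe the paper's construction, and the function names, the discretized time bins, and the choice to clamp noisy counts at zero are all assumptions made here.

```python
import random


def kaplan_meier(event_counts, censor_counts):
    """Kaplan-Meier survival estimate S(t_j) for each time bin j.

    event_counts[j]: events observed in bin j.
    censor_counts[j]: subjects censored in bin j.
    """
    at_risk = sum(event_counts) + sum(censor_counts)
    survival, s = [], 1.0
    for d, c in zip(event_counts, censor_counts):
        if at_risk > 0:
            # Multiply by the conditional probability of surviving bin j.
            s *= 1.0 - min(d, at_risk) / at_risk
        survival.append(s)
        at_risk -= d + c  # events and censored subjects leave the risk set
    return survival


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(counts, epsilon, rng):
    """Perturb each count with Laplace noise of scale 1/epsilon.

    Assumption: each subject contributes to exactly one count, so adding or
    removing one subject changes the counts by at most 1 in total.
    """
    return [max(0.0, c + laplace_noise(1.0 / epsilon, rng)) for c in counts]


events, censored = [1, 1, 0, 1], [0, 1, 0, 0]
exact_curve = kaplan_meier(events, censored)

rng = random.Random(0)
noisy = noisy_counts(events + censored, epsilon=1.0, rng=rng)
private_curve = kaplan_meier(noisy[: len(events)], noisy[len(events):])
```

Because the private curve is computed only from the noisy counts, any further statistic derived from those same counts needs no additional noise, which is in the spirit of the abstract's remark about not spending extra privacy budget.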


Ensemble Prediction of Time to Event Outcomes with Competing Risks: A Case Study of Surgical Complications in Crohn's Disease

Sachs, Michael C, Discacciati, Andrea, Everhov, Åsa, Olén, Ola, Gabriel, Erin E

arXiv.org Machine Learning

Motivating study and statistical approaches

Crohn's disease (CD) is a chronic debilitating condition characterized by periods of inflammatory activity in the bowel that cause symptoms such as abdominal pain, diarrhea, and weight loss. Pharmacologic treatment for CD includes medications such as steroids, immunomodulating drugs, and biological therapy. Despite these available medications, many people with CD are escalated to surgical interventions, ranging from small to extensive resections of the bowel or colon (Gomollón et al., 2016). Previous studies have estimated that up to 50% of patients with CD undergo surgery within 10 years after diagnosis; however, surgical rates have decreased over time, possibly due to the introduction of modern treatments such as thiopurines and anti-TNF (Lakatos et al., 2012; Ramadas et al., 2010). The aim of this study is to determine whether clinical and demographic characteristics observed at the time of diagnosis can be used to predict the occurrence of major abdominal surgery within 5 years, with the goal of personalized disease management.