Artificial intelligence will play a pivotal role in the future of health care, medical experts say, but so far, the industry has been unable to fully leverage this tool. A Yale study has illuminated the limitations of these analytics when applied to traditional medical databases -- suggesting that the key to unlocking their value may be in the way datasets are prepared. Machine learning techniques are well-suited for processing complex, high-dimensional data or identifying nonlinear patterns, which provide researchers and clinicians with a framework to generate new insights. But the study suggests that achieving the potential of artificial intelligence will require improving the data quality of electronic health records (EHR). "Our study found that advanced methods that have revolutionized predictions outside healthcare did not meaningfully improve prediction of mortality in a large national registry. These registries that rely on manually abstracted data within a restricted number of fields may, therefore, not be capturing many patient features that have implications for their outcomes," said Rohan Khera, MD, MS, the first author of the new study published in JAMA Cardiology.
The wide adoption of Electronic Health Records (EHR) has resulted in large amounts of clinical data becoming available, which promises to support service delivery and advance clinical and informatics research. Deep learning techniques have demonstrated strong performance in predictive analytic tasks using EHRs, yet they typically lack transparency or explainability of model results and require cumbersome pre-processing. Moreover, EHRs contain heterogeneous and multi-modal data points such as text, numbers and time series, which further hinder visualisation and interpretability. This paper proposes a deep learning framework to: 1) encode patient pathways from EHRs into images, 2) highlight important events within pathway images, and 3) enable more complex predictions with additional intelligibility. The proposed method relies on a deep attention mechanism for visualisation of the predictions and allows predicting multiple sequential outcomes.
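As a rough illustration of what encoding a patient pathway as an image could look like, the sketch below places events on a grid with one row per event type and one column per time step. The abstract does not describe the actual encoding, so the function name `pathway_to_image`, the grid layout, and the example event names are all assumptions made for illustration.

```python
def pathway_to_image(events, event_types, n_steps):
    """Turn a list of (time_step, event_type) pairs into a 2D count grid.

    Rows correspond to event types and columns to time steps. This layout
    is a hypothetical choice, not the encoding used in the paper.
    """
    row_of = {name: i for i, name in enumerate(event_types)}
    image = [[0] * n_steps for _ in event_types]
    for step, name in events:
        # Silently skip events outside the grid or of unknown type.
        if 0 <= step < n_steps and name in row_of:
            image[row_of[name]][step] += 1
    return image


# Made-up pathway: one admission at step 0, two lab tests at step 2.
example = pathway_to_image(
    [(0, "admission"), (2, "lab_test"), (2, "lab_test")],
    ["admission", "lab_test"],
    n_steps=3,
)
```

A grid of this kind could then be passed to an image model, and an attention map over the same grid would point back to specific event types and time steps.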
For example, doctors often diagnose angina and heart attacks based on symptoms that men experience more commonly than women. Women are consequently underdiagnosed for heart disease. An algorithm designed to help doctors detect cardiac conditions that is trained on historical diagnostic data could learn to focus on men's symptoms and not on women's, which would exacerbate the problem of underdiagnosing women. Also, AI discrimination can be rooted in erroneous assumptions, as in the case of the high-risk care program algorithm. In another instance, electronic health records software company Epic built an AI-based tool to help medical offices identify patients who are likely to miss appointments.
Medical systems in general, and patient treatment decisions and outcomes in particular, are affected by bias based on gender and other demographic elements. As language models are increasingly applied to medicine, there is a growing interest in building algorithmic fairness into processes impacting patient care. Much of the work addressing this question has focused on biases encoded in language models -- statistical estimates of the relationships between concepts derived from distant reading of corpora. Building on this work, we investigate how word choices made by healthcare practitioners and language models interact with regard to bias. We identify and remove gendered language from two clinical-note datasets and describe a new debiasing procedure using BERT-based gender classifiers. We show minimal degradation in health condition classification tasks for low to medium levels of bias removal via data augmentation. Finally, we compare the bias semantically encoded in the language models with the bias empirically observed in health records. This work outlines an interpretable approach for using data augmentation to identify and reduce the potential for bias in natural language processing pipelines.
Recent years have witnessed the rapid accumulation of massive electronic medical records (EMRs), which strongly support intelligent medical services such as drug recommendation. However, prior work mainly follows traditional recommendation strategies such as collaborative filtering, which usually treat individual drugs as mutually independent, while the latent interactions among drugs, e.g., synergistic or antagonistic effects, have been largely ignored. To that end, in this paper, we aim to develop a new paradigm for drug package recommendation that considers the interaction effects among drugs, where these effects may be affected by patient conditions. Specifically, we first design a pre-training method based on neural collaborative filtering to obtain the initial embeddings of patients and drugs. Then, the drug interaction graph is initialized based on medical records and domain knowledge. Along this line, we propose a new Drug Package Recommendation (DPR) framework with two variants, DPR on Weighted Graph (DPR-WG) and DPR on Attributed Graph (DPR-AG), in which the interactions are described as signed weights or attribute vectors, respectively. In detail, a mask layer is utilized to capture the impact of patient condition, and graph neural networks (GNNs) are leveraged for the final graph induction task to embed the package. Extensive experiments on a real-world data set from a first-rate hospital demonstrate the effectiveness of our DPR framework compared with several competitive baseline methods, and further support the heuristic study for the drug package generation task with adequate performance.
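To make the idea of signed interaction weights concrete, the sketch below sums pairwise weights over a drug package, with positive values standing for synergy and negative values for antagonism, and applies an optional per-pair mask as a stand-in for patient condition. It is not the DPR model: the paper learns its mask and embeds the package with a GNN, whereas here the function `package_interaction_score`, the weights, and the mask are hand-written and hypothetical.

```python
from itertools import combinations


def package_interaction_score(package, weights, mask=None):
    """Sum signed pairwise interaction weights over all drug pairs.

    `weights` maps a sorted (drug_a, drug_b) tuple to a signed weight.
    `mask` optionally maps the same keys to a multiplier that scales an
    interaction for a given patient; missing pairs default to 1.0.
    Both dictionaries are illustrative placeholders.
    """
    total = 0.0
    for a, b in combinations(sorted(package), 2):
        weight = weights.get((a, b), 0.0)
        scale = 1.0 if mask is None else mask.get((a, b), 1.0)
        total += scale * weight
    return total


# Placeholder drugs and weights, not real pharmacology.
weights = {("A", "B"): 1.0, ("B", "C"): -0.5}
unmasked = package_interaction_score(["A", "B", "C"], weights)
masked = package_interaction_score(["A", "B", "C"], weights, mask={("B", "C"): 0.0})
```

Masking the ("B", "C") pair removes its negative contribution, which mirrors in a very simplified way how a patient-specific mask could change the effective interaction graph.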
Artificial intelligence (AI) technology has been increasingly used in the implementation of advanced Clinical Decision Support Systems (CDSS). Research has demonstrated the potential usefulness of AI-powered CDSS (AI-CDSS) in clinical decision making scenarios. However, post-adoption user perception and experience remain understudied, especially in developing countries. Through observations and interviews with 22 clinicians from 6 rural clinics in China, this paper reports the various tensions between the design of an AI-CDSS ("Brilliant Doctor") and the rural clinical context, such as the misalignment with local context and workflow, the technical limitations and usability barriers, as well as issues related to transparency and trustworthiness of AI-CDSS. Despite these tensions, all participants expressed positive attitudes toward the future of AI-CDSS, especially acting as "a doctor's AI assistant" to realize a Human-AI Collaboration future in clinical settings. Finally, we draw on our findings to discuss implications for designing AI-CDSS interventions for rural clinical contexts in developing countries.
Despite the explosion of interest in healthcare AI research, the reproducibility and benchmarking of those research works are often limited due to the lack of standard benchmark datasets and diverse evaluation metrics. To address this reproducibility challenge, we develop PyHealth, an open-source Python toolbox for developing various predictive models on healthcare data. PyHealth consists of a data preprocessing module, a predictive modeling module, and an evaluation module. The target users of PyHealth are both computer science researchers and healthcare data scientists. With PyHealth, they can conduct complex machine learning pipelines on healthcare datasets with fewer than ten lines of code. The data preprocessing module enables the transformation of complex healthcare datasets such as longitudinal electronic health records, medical images, continuous signals (e.g., electrocardiogram), and clinical notes into machine learning friendly formats. The predictive modeling module provides more than 30 machine learning models, including established ensemble trees and deep neural network-based approaches, via a unified but extendable API designed for both researchers and practitioners. The evaluation module provides various evaluation strategies (e.g., cross-validation and train-validation-test split) and predictive model metrics. With robustness and scalability in mind, best practices such as unit testing, continuous integration, code coverage, and interactive examples are introduced in the library's development.
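The two evaluation strategies named above, train-validation-test split and cross-validation, can be sketched in plain Python as follows. This deliberately does not use PyHealth's API, which the abstract does not show; the function names, default fractions, and seed are assumptions chosen for illustration.

```python
import random


def train_val_test_split(n, val_frac=0.1, test_frac=0.2, seed=0):
    """Shuffle indices 0..n-1 and cut them into train, validation, test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)
    n_val = round(n * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test


def k_fold(n, k=5, seed=0):
    """Yield (train, held_out) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, held_out in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, held_out


train, val, test = train_val_test_split(100)
folds = list(k_fold(10, k=5))
```

For longitudinal records, splitting would normally be done over patient identifiers rather than individual visits so that one patient never appears on both sides of a split; the indices here can stand for either.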
Deep learning models have demonstrated superior performance in several application problems, such as image classification and speech processing. However, creating a deep learning model using health record data requires addressing certain privacy challenges that bring unique concerns to researchers working in this domain. One effective way to handle such private data issues is to generate realistic synthetic data that can provide practically acceptable data quality and, correspondingly, acceptable model performance. To tackle this challenge, we develop a differentially private framework for synthetic data generation using Rényi differential privacy. Our approach builds on convolutional autoencoders and convolutional generative adversarial networks to preserve some of the critical characteristics of the generated synthetic data. In addition, our model can also capture the temporal information and feature correlations that might be present in the original data. We demonstrate that our model outperforms existing state-of-the-art models under the same privacy budget using several publicly available benchmark medical datasets in both supervised and unsupervised settings.
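For readers unfamiliar with Rényi differential privacy (RDP), the sketch below shows how a privacy budget is typically tracked with it, using two standard results: a Gaussian mechanism with noise scale sigma satisfies (alpha, alpha * sensitivity^2 / (2 * sigma^2))-RDP, and an (alpha, eps)-RDP guarantee implies (eps + log(1/delta) / (alpha - 1), delta)-differential privacy. The abstract does not state which noise mechanism or accounting the authors use, so the Gaussian mechanism, the function names, and the parameter values are assumptions, and the sketch ignores any amplification from subsampling.

```python
import math


def gaussian_rdp(alpha, sigma, sensitivity=1.0):
    """RDP epsilon of order alpha for one Gaussian-mechanism release."""
    return alpha * sensitivity ** 2 / (2 * sigma ** 2)


def rdp_to_dp(alpha, rdp_eps, delta):
    """Convert an (alpha, rdp_eps)-RDP guarantee to (epsilon, delta)-DP."""
    return rdp_eps + math.log(1 / delta) / (alpha - 1)


def composed_epsilon(steps, sigma, delta, alphas=range(2, 65)):
    """Epsilon after `steps` releases, optimised over a grid of orders.

    RDP composes additively at a fixed order, so the per-step cost is
    multiplied by the number of steps before conversion.
    """
    return min(
        rdp_to_dp(a, steps * gaussian_rdp(a, sigma), delta) for a in alphas
    )


# Illustrative values only: 10 versus 20 noisy steps at the same sigma.
eps_10 = composed_epsilon(steps=10, sigma=5.0, delta=1e-5)
eps_20 = composed_epsilon(steps=20, sigma=5.0, delta=1e-5)
```

Comparing models "under the same privacy budget" then amounts to holding the resulting epsilon and delta fixed while varying the generator.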
Background and Objectives: Clinical Practice Guidelines (CPGs) represent the foremost methodology for sharing state-of-the-art research findings in the healthcare domain with medical practitioners to limit practice variations, reduce clinical cost, improve the quality of care, and provide evidence-based treatment. However, extracting relevant knowledge from the plethora of CPGs is not feasible for already burdened healthcare professionals, leading to large gaps between clinical findings and real practices. It is therefore imperative that state-of-the-art computing research, especially machine learning, be used to provide artificial intelligence based solutions for extracting knowledge from CPGs and reducing the gap between healthcare research/guidelines and practice. Methods: This research presents a novel methodology for knowledge extraction from CPGs to reduce the gap and turn the latest research findings into clinical practice. First, our system classifies the CPG sentences into four classes (condition-action, condition-consequences, action, and not-applicable) based on the information presented in a sentence. We use deep learning with a state-of-the-art word embedding, the improved word vectors technique, in the classification process. Second, it identifies qualifier terms in the classified sentences, which assist in recognizing the condition and action phrases in a sentence. Finally, the condition and action phrases are processed and transformed into a plain rule in the format If Condition(s) Then Action. Results: We evaluate the methodology on guidelines from three different domains: Hypertension, Rhinosinusitis, and Asthma. The deep learning model classifies the CPG sentences with an accuracy of 95%. Rule extraction was validated by a user-centric approach, which achieved Jaccard coefficients of 0.6, 0.7, and 0.4 against the rules extracted by three human experts, respectively.
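The last two steps of the evaluation can be illustrated with a small sketch: formatting extracted phrases as an If Condition(s) Then Action rule, and scoring agreement between two rule sets with the Jaccard coefficient, |A ∩ B| / |A ∪ B|. The helper names and the placeholder rules are hypothetical, and exact string matching between rules is an assumption; the abstract does not say how the authors decided that two rules match.

```python
def to_rule(conditions, action):
    """Format condition phrases and an action phrase as a plain rule."""
    return "If " + " and ".join(conditions) + " Then " + action


def jaccard(a, b):
    """Jaccard coefficient of two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Placeholder rules, not taken from any guideline.
system_rules = {
    to_rule(["condition X"], "action 1"),
    to_rule(["condition Y", "condition Z"], "action 2"),
    to_rule(["condition W"], "action 3"),
}
expert_rules = {
    to_rule(["condition Y", "condition Z"], "action 2"),
    to_rule(["condition W"], "action 3"),
    to_rule(["condition V"], "action 4"),
}
agreement = jaccard(system_rules, expert_rules)
```

With two shared rules out of four distinct ones, the two sets agree at 0.5, which gives a sense of scale for the reported coefficients.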