Application of machine learning for hematological diagnosis

arXiv.org Machine Learning

Quick and accurate medical diagnosis is crucial for the successful treatment of a disease. Using machine learning algorithms, we have built two models to predict a hematologic disease based on laboratory blood test results. In one predictive model we used all available blood test parameters, and in the other a reduced set that is usually measured upon patient admission. Both models produced good results, with prediction accuracies of 0.88 and 0.86 when considering the list of the five most probable diseases, and 0.59 and 0.57 when considering only the most probable disease. The models did not differ significantly from each other, which indicates that the reduced set of parameters contains a relevant fingerprint of a disease, expanding the utility of the model for use by general practitioners and indicating that there is more information in blood test results than physicians recognize. In the clinical test we showed that the accuracy of our predictive models was on a par with the ability of hematology specialists. Our study is the first to show that a machine learning predictive model based on blood tests alone can be successfully applied to predict hematologic diseases, and it could open up unprecedented possibilities in medical diagnosis.
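
The two accuracy figures quoted above differ only in how many ranked candidates count as a hit. The sketch below illustrates that distinction with a top-k accuracy function. It is a minimal illustration, not the authors' evaluation code: the function name `top_k_accuracy`, the disease labels, and the scores are all hypothetical.

```python
def top_k_accuracy(scores, true_labels, k):
    """Fraction of cases whose true label is among the k highest-scored labels."""
    hits = 0
    for case_scores, truth in zip(scores, true_labels):
        # Rank candidate diseases from most to least probable.
        ranked = sorted(case_scores, key=case_scores.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Made-up scores for three hypothetical patients (not data from the study).
scores = [
    {"A": 0.6, "B": 0.3, "C": 0.1},
    {"A": 0.5, "B": 0.4, "C": 0.1},
    {"A": 0.2, "B": 0.3, "C": 0.5},
]
true_labels = ["A", "B", "A"]

top1 = top_k_accuracy(scores, true_labels, k=1)  # only the most probable disease counts
top2 = top_k_accuracy(scores, true_labels, k=2)  # any of the two most probable counts
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

Because a top-k hit includes every top-1 hit, top-k accuracy can never be lower than top-1 accuracy for the same predictions, which is consistent with the 0.88 versus 0.59 gap reported in the abstract.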


The accuracy vs. coverage trade-off in patient-facing diagnosis models

arXiv.org Machine Learning

In these online tools, patients input their initial symptoms and then proceed to answer a series of questions that the system deems relevant to those symptoms. The output of these online tools is a differential diagnosis (a ranked list of diseases) that helps educate patients on possible relevant health conditions. Online symptom checkers are powered by underlying diagnosis models or engines similar to those used for advising physicians in "clinical decision support tools"; the main difference is that, in the latter, the resulting differential diagnosis is not shared directly with the patient but is used by a physician for professional evaluation. Diagnosis models must have high accuracy while covering a large space of symptoms and diseases to be useful to patients and physicians. Accuracy is critically important, as incorrect diagnoses can give patients unnecessary cause for concern.


Predicting the Severity of Breast Masses with Data Mining Methods

arXiv.org Machine Learning

Mammography is the most effective and available tool for breast cancer screening. However, the low positive predictive value of breast biopsy resulting from mammogram interpretation leads to approximately 70% unnecessary biopsies with benign outcomes. Data mining algorithms could be used to help physicians decide whether to perform a breast biopsy on a suspicious lesion seen in a mammogram image or to perform a short-term follow-up examination instead. In this research paper, three data mining classification algorithms, Decision Tree (DT), Artificial Neural Network (ANN), and Support Vector Machine (SVM), are analyzed on the mammographic masses data set. The purpose of this study is to increase the ability of physicians to determine the severity (benign or malignant) of a mammographic mass lesion from BI-RADS attributes and the patient's age. The whole data set is split 70:30 into training and test sets, and the performance of the classification algorithms is compared through three statistical measures: sensitivity, specificity, and classification accuracy. The accuracies of DT, ANN, and SVM on the test samples are 78.12%, 80.56%, and 81.25%, respectively. Our analysis shows that, of these three classification models, SVM predicts the severity of breast cancer with the lowest error rate and highest accuracy.
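
The three measures named above all derive from the counts in a binary confusion matrix. The following sketch shows one way to compute them, assuming labels of 1 for malignant and 0 for benign. The function name `binary_metrics` and the example labels are hypothetical and are not taken from the paper's data set.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity, and accuracy from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),  # share of malignant cases detected
        "specificity": tn / (tn + fp),  # share of benign cases correctly cleared
        "accuracy": (tp + tn) / len(pairs),
    }


# Made-up test-set labels (1 = malignant, 0 = benign), for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this setting, low specificity corresponds to benign masses flagged as malignant, which is the source of the unnecessary biopsies the abstract describes.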


Using Deep Learning and Explainable Artificial Intelligence in Patients' Choices of Hospital Levels

arXiv.org Artificial Intelligence

In countries that enable patients to choose their own providers, a common problem is that patients do not make rational decisions and hence fail to use healthcare resources efficiently. This might cause problems such as overwhelming tertiary facilities with patients who have mild conditions, thus limiting their capacity to treat acute and critical patients. To address such maldistributed patient volume, it is essential to oversee patients' choices before further evaluation of a policy or resource allocation. This study used nationwide insurance data, accumulated possible features discussed in existing literature, and used a deep neural network to predict patients' choices of hospital levels. This study also used explainable artificial intelligence methods to interpret the contribution of features for the general public and individuals. In addition, we explored the effectiveness of changing data representations. The results showed that the model was able to predict with high area under the receiver operating characteristic curve (AUC) (0.90), accuracy (0.90), sensitivity (0.94), and specificity (0.97) with highly imbalanced labels. Generally, social approval of the provider by the general public (positive or negative) and the number of practicing physicians per ten thousand people in the provider's area are listed as the most influential features. Changing the data representation improved prediction. Deep learning methods can process highly imbalanced data and achieve high accuracy. The influential features affect the general public and individuals differently. Addressing the sparsity and discrete nature of insurance data leads to better prediction. Applications using deep learning technology are promising in health policy making. More work is required to interpret models and put them into practice.
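
AUC is often reported alongside accuracy for imbalanced labels because it depends only on how positives are ranked relative to negatives. The sketch below computes AUC from its pairwise-ranking definition, assuming binary labels and real-valued scores. The function name `roc_auc` and the example data are hypothetical and unrelated to the insurance data used in the study.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up imbalanced example: 2 positives among 10 cases.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.2, 0.1, 0.45, 0.05]

auc = roc_auc(y_true, scores)
print(f"AUC: {auc:.4f}")
```

The nested loop is quadratic in the number of cases, so it suits small illustrations only; rank-based formulations give the same value more efficiently on large data sets.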


Short-term Mortality Prediction for Elderly Patients Using Medicare Claims Data

arXiv.org Machine Learning

Risk prediction is central to both clinical medicine and public health. While many machine learning models have been developed to predict mortality, they are rarely applied in the clinical literature, where classification tasks typically rely on logistic regression. One reason for this is that existing machine learning models often seek to optimize predictions by incorporating features that are not present in the databases readily available to providers and policy makers, limiting generalizability and implementation. Here we tested a number of machine learning classifiers for prediction of six-month mortality in a population of elderly Medicare beneficiaries, using an administrative claims database of the kind available to the majority of health care payers and providers. We show that machine learning classifiers substantially outperform current widely-used methods of risk prediction but only when used with an improved feature set incorporating insights from clinical medicine, developed for this study. Our work has applications to supporting patient and provider decision making at the end of life, as well as population health-oriented efforts to identify patients at high risk of poor outcomes.