Babak Ehteshami Bejnordi, from the Radboud University Medical Center in Nijmegen, Netherlands, and colleagues compared the performance of automated deep learning algorithms for detecting metastases in hematoxylin and eosin-stained tissue sections of lymph nodes of women with breast cancer with pathologists' diagnoses in a diagnostic setting. The researchers found that the area under the receiver operating characteristic curve (AUC) ranged from 0.556 to 0.994 for the algorithms. The lesion-level true-positive fraction achieved by the top-performing algorithm was comparable to that of the pathologist without a time constraint, at a mean of 0.0125 false positives per normal whole-slide image. Daniel Shu Wei Ting, M.D., Ph.D., from the Singapore National Eye Center, and colleagues assessed the performance of a deep learning system (DLS) for detecting referable diabetic retinopathy and related eye diseases using 494,661 retinal images. The researchers found that the AUC of the DLS for referable diabetic retinopathy was 0.936, and sensitivity and specificity were 90.5 and 91.6 percent, respectively.
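Both studies report the area under the receiver operating characteristic curve as their headline metric. A minimal sketch of how AUC is computed from a model's scores, using scikit-learn; the labels and scores below are toy values for illustration, not data from either study:

```python
# Sketch: AUC of a binary classifier's scores, as used to evaluate the
# metastasis- and retinopathy-detection systems described above.
from sklearn.metrics import roc_auc_score

# 1 = positive case (e.g. slide contains metastasis), 0 = negative
y_true = [0, 0, 1, 1, 0, 1, 1, 0]        # hypothetical ground truth
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.3]  # hypothetical model scores

# AUC equals the probability that a random positive is scored above
# a random negative; here 15 of 16 positive/negative pairs are ordered
# correctly, giving 0.9375.
auc = roc_auc_score(y_true, y_score)
print(round(auc, 4))
```

An AUC of 1.0 would mean every positive case outranks every negative one; 0.5 is chance-level ranking.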
The Centers for Disease Control and Prevention (CDC) coordinates a labor-intensive process to measure the prevalence of autism spectrum disorder (ASD) among children in the United States. Random forest methods have shown promise in speeding up this process, but they lag behind human classification accuracy by about 5 percent. We explore whether newer document classification algorithms can close this gap. We applied 6 supervised learning algorithms to predict whether children meet the case definition for ASD based solely on the words in their evaluations. We compared the algorithms' performance across 10 random train-test splits of the data, and then combined our top 3 classifiers to estimate the Bayes error rate in the data. Across the 10 train-test cycles, the random forest, neural network, and support vector machine with Naive Bayes features (NB-SVM) each achieved slightly more than 86.5 percent mean accuracy. The Bayes error rate is estimated at approximately 12 percent, meaning that the model error for even the simplest of our algorithms, the random forest, is below 2 percent. NB-SVM produced significantly more false positives than false negatives. The random forest performed as well as newer models like the NB-SVM and the neural network. NB-SVM may not be a good candidate for use in a fully automated surveillance workflow due to increased false positives. More sophisticated algorithms, like hierarchical convolutional neural networks, would not perform substantially better due to characteristics of the data. Deep learning models performed similarly to traditional machine learning methods at predicting the clinician-assigned case status for CDC's autism surveillance system. While deep learning methods had limited benefit in this task, they may have applications in other surveillance systems.
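The evaluation protocol above, comparing several text classifiers over repeated random train-test splits, can be sketched as follows. The toy corpus, labels, and the two models shown are placeholders for illustration; the study used children's clinical evaluations, the ASD case definition, and six algorithms:

```python
# Sketch: compare document classifiers across 10 random train-test
# splits, as in the evaluation protocol described above.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy documents and case labels (1 = meets case definition)
docs = ["meets criteria social deficits", "typical development noted",
        "repetitive behaviors observed", "no concerns at this visit"] * 10
labels = [1, 0, 1, 0] * 10

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "linear_svm": LinearSVC(),
}

scores = {name: [] for name in models}
for seed in range(10):  # 10 random train-test cycles
    X_tr, X_te, y_tr, y_te = train_test_split(
        docs, labels, test_size=0.25, random_state=seed, stratify=labels)
    for name, clf in models.items():
        # bag-of-words (TF-IDF) features feed each classifier
        pipe = make_pipeline(TfidfVectorizer(), clf)
        pipe.fit(X_tr, y_tr)
        scores[name].append(pipe.score(X_te, y_te))

for name, accs in scores.items():
    print(name, round(float(np.mean(accs)), 3))
```

Averaging accuracy over the 10 splits, rather than reporting a single split, reduces the variance of the comparison between models.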
In this paper, we present an experimental study for the classification of perceived human stress using non-invasive physiological signals. These include electroencephalography (EEG), galvanic skin response (GSR), and photoplethysmography (PPG). We conducted experiments consisting of steps including data acquisition, feature extraction, and perceived human stress classification. The physiological data of 28 participants were acquired in an open-eye condition for a duration of three minutes. Four different features are extracted in the time domain from the EEG, GSR, and PPG signals, and classification is performed using multiple classifiers, including support vector machine, Naive Bayes, and multi-layer perceptron (MLP). The best classification accuracy of 75% is achieved using the MLP classifier. Our experimental results show that our proposed scheme outperforms existing perceived stress classification methods in which no stress inducers are used.
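The pipeline above, extracting time-domain features from raw 1-D physiological signals and classifying them with an MLP, can be sketched as below. The synthetic signals, labels, and the four features shown are illustrative assumptions; the study's actual feature set and recordings may differ:

```python
# Sketch: time-domain feature extraction from 1-D signals, followed by
# MLP classification, mirroring the pipeline described above.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def time_domain_features(signal):
    """Four illustrative time-domain features for one channel."""
    return [signal.mean(),
            signal.std(),
            np.abs(np.diff(signal)).mean(),   # mean absolute first difference
            signal.max() - signal.min()]      # peak-to-peak amplitude

# Synthetic "relaxed" (0) vs "stressed" (1) recordings; the stressed
# class is given higher signal variability as a toy assumption.
X, y = [], []
for label, scale in [(0, 1.0), (1, 2.5)]:
    for _ in range(30):
        sig = rng.normal(0.0, scale, size=180)  # e.g. 3 min at 1 Hz
        X.append(time_domain_features(sig))
        y.append(label)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X, y)
print(round(clf.score(X, y), 2))
```

In practice the same four features would be computed per channel (EEG, GSR, PPG) and concatenated into one feature vector per participant.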
Background: Palliative care refers to a set of programs for patients who suffer from life-limiting illnesses. These programs aim to guarantee a minimum level of quality of life (QoL) for the last stage of life. They are currently based on clinical evaluation of the risk of one-year mortality. Objectives: The main objective of this work is to develop and validate machine learning-based models to predict the death (exitus) of a patient within the next year using data gathered at hospital admission. Methods: Five machine learning techniques were applied in our study to develop predictive models: Support Vector Machines, K-neighbors Classifier, Gradient Boosting Classifier, Random Forest, and Multilayer Perceptron. All models were trained and evaluated using the retrospective dataset. The evaluation was performed with five metrics computed by a resampling strategy: Accuracy, the area under the ROC curve (AUC ROC), Specificity, Sensitivity, and the Balanced Error Rate (BER). Results: All models for forecasting one-year mortality achieved an AUC ROC from 0.858 to 0.911. Specifically, the Gradient Boosting Classifier was the best model, producing an AUC ROC of 0.911 (95% CI, 0.911 to 0.912), a sensitivity of 0.858 (95% CI, 0.856 to 0.860), a specificity of 0.807 (95% CI, 0.806 to 0.808), and a BER of 0.168 (95% CI, 0.167 to 0.169). Conclusions: The analysis of common information at hospital admission combined with machine learning techniques produced models with competitive discriminative power. Our models match the best results reported in the state of the art. These results demonstrate that they can be used as accurate, data-driven inclusion criteria for palliative care.
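A minimal sketch of the evaluation above: train a Gradient Boosting Classifier and report AUC ROC, sensitivity, specificity, and the balanced error rate. The synthetic dataset stands in for the retrospective admission data, and a single train-test split stands in for the study's resampling strategy:

```python
# Sketch: Gradient Boosting Classifier evaluated with the four
# discrimination metrics named above, on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for admission features and one-year mortality labels
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]
pred = clf.predict(X_te)

auc = roc_auc_score(y_te, proba)
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
sensitivity = tp / (tp + fn)          # true-positive rate
specificity = tn / (tn + fp)          # true-negative rate
ber = 1 - (sensitivity + specificity) / 2  # balanced error rate

print(round(auc, 3), round(sensitivity, 3),
      round(specificity, 3), round(ber, 3))
```

To obtain confidence intervals like those reported, the same metrics would be recomputed over many bootstrap resamples of the test set and summarized by their percentiles.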
In this work, we present a comparison of a shallow and a deep learning architecture for the automated segmentation of white matter lesions in MR images of multiple sclerosis patients. In particular, we train and test both methods on early-stage disease patients to verify their performance in challenging conditions, more similar to a clinical setting than what is typically provided in multiple sclerosis segmentation challenges. Furthermore, we evaluate a prototype naive combination of the two methods, which refines the final segmentation. All methods were trained on 32 patients, and the evaluation was performed on a pure test set of 73 cases. Results show a low lesion-wise false-positive rate (30%) for the deep learning architecture, whereas the shallow architecture yields the best Dice coefficient (63%) and volume difference (19%). Combining both shallow and deep architectures further improves the lesion-wise metrics (69% and 26% lesion-wise true- and false-positive rates, respectively).
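The voxel-wise metrics reported above, the Dice coefficient and the relative volume difference between a predicted and a reference lesion mask, can be sketched as follows. The tiny 2-D toy masks are illustrative only; real evaluations operate on 3-D MR volumes:

```python
# Sketch: Dice coefficient and relative volume difference between two
# binary segmentation masks, as used in the evaluation described above.
import numpy as np

def dice(pred, ref):
    """Dice overlap: 2|A∩B| / (|A| + |B|), in [0, 1]."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    inter = np.logical_and(pred, ref).sum()
    return 2 * inter / (pred.sum() + ref.sum())

def volume_difference(pred, ref):
    """Absolute volume difference relative to the reference volume."""
    return abs(int(pred.sum()) - int(ref.sum())) / ref.sum()

ref = np.zeros((8, 8), dtype=bool)
ref[2:5, 2:5] = True            # 9-voxel reference lesion
pred = np.zeros((8, 8), dtype=bool)
pred[3:6, 2:5] = True           # prediction shifted one row, also 9 voxels

# The masks overlap on 6 voxels, so Dice = 12/18 ≈ 0.667; the volumes
# are equal, so the relative volume difference is 0.
print(round(dice(pred, ref), 3), round(volume_difference(pred, ref), 3))
```

Lesion-wise true- and false-positive rates, by contrast, are computed over connected components (individual lesions) rather than voxels, which is why they can move independently of the Dice coefficient.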