Accuracy
Machine Learning to Predict Mortality and Critical Events in a Cohort of Patients With COVID-19 in New York City: Model Development and Validation
Background: COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking. Objective: The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points. Methods: We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19โpositive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n 1514) before or on May 1, externally validated on patients from four other hospitals (n 2201) before or on May 1, and prospectively validated on all patients after May 1 (n 383). Finally, we established model interpretability to identify and rank variables that drive model predictions. Results: Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days.
Using Machine Learning for Decreasing State Uncertainty in Planning
Krivic, Senka (Kings College london) | Cashmore, Michael | Magazzeni, Daniele | Szedmak, Sandor | Piater, Justus
We present a novel approach for decreasing state uncertainty in planning prior to solving the planning problem. This is done by making predictions about the state based on currently known information, using machine learning techniques. For domains where uncertainty is high, we define an active learning process for identifying which information, once sensed, will best improve the accuracy of predictions. We demonstrate that an agent is able to solve problems with uncertainties in the state with less planning effort compared to standard planning techniques. Moreover, agents can solve problems for which they could not find valid plans without using predictions. Experimental results also demonstrate that using our active learning process for identifying information to be sensed leads to gathering information that improves the prediction process.
A decision-making tool to fine-tune abnormal levels in the complete blood count tests
Avalos-Fernandez, Marta, Touchais, Helene, Henriquez-Henriquez, Marcela
The complete blood count (CBC) performed by automated hematology analyzers is one of the most ordered laboratory tests. It is a first-line tool for assessing a patient's general health status, or diagnosing and monitoring disease progression. When the analysis does not fit an expected setting, technologists manually review a blood smear using a microscope. The International Consensus Group for Hematology Review published in 2005 a set of criteria for reviewing CBCs. Commonly, adjustments are locally needed to account for laboratory resources and populations characteristics. Our objective is to provide a decision support tool to identify which CBC variables are associated with higher risks of abnormal smear and at which cutoff values. We propose a cost-sensitive Lasso-penalized additive logistic regression combined with stability selection. Using simulated and real CBC data, we demonstrate that our tool correctly identify the true cutoff values, provided that there is enough available data in their neighbourhood.
Interpretable and synergistic deep learning for visual explanation and statistical estimations of segmentation of disease features from medical images
Ghosal, Sambuddha, Shah, Pratik
Deep learning (DL) models for disease classification or segmentation from medical images are increasingly trained using transfer learning (TL) from unrelated natural world images. However, shortcomings and utility of TL for specialized tasks in the medical imaging domain remain unknown and are based on assumptions that increasing training data will improve performance. We report detailed comparisons, rigorous statistical analysis and comparisons of widely used DL architecture for binary segmentation after TL with ImageNet initialization (TII-models) with supervised learning with only medical images(LMI-models) of macroscopic optical skin cancer, microscopic prostate core biopsy and Computed Tomography (CT) DICOM images. Through visual inspection of TII and LMI model outputs and their Grad-CAM counterparts, our results identify several counter intuitive scenarios where automated segmentation of one tumor by both models or the use of individual segmentation output masks in various combinations from individual models leads to 10% increase in performance. We also report sophisticated ensemble DL strategies for achieving clinical grade medical image segmentation and model explanations under low data regimes. For example; estimating performance, explanations and replicability of LMI and TII models described by us can be used for situations in which sparsity promotes better learning. A free GitHub repository of TII and LMI models, code and more than 10,000 medical images and their Grad-CAM output from this study can be used as starting points for advanced computational medicine and DL research for biomedical discovery and applications.
Building an Automated and Self-Aware Anomaly Detection System
Chakraborty, Sayan, Shah, Smit, Soltani, Kiumars, Swigart, Anna, Yang, Luyao, Buckingham, Kyle
Organizations rely heavily on time series metrics to measure and model key aspects of operational and business performance. The ability to reliably detect issues with these metrics is imperative to identifying early indicators of major problems before they become pervasive. It can be very challenging to proactively monitor a large number of diverse and constantly changing time series for anomalies, so there are often gaps in monitoring coverage, disabled or ignored monitors due to false positive alarms, and teams resorting to manual inspection of charts to catch problems. Traditionally, variations in the data generation processes and patterns have required strong modeling expertise to create models that accurately flag anomalies. In this paper, we describe an anomaly detection system that overcomes this common challenge by keeping track of its own performance and making changes as necessary to each model without requiring manual intervention. We demonstrate that this novel approach outperforms available alternatives on benchmark datasets in many scenarios.
Long-Term Pipeline Failure Prediction Using Nonparametric Survival Analysis
Weeraddana, Dilusha, MallawaArachchi, Sudaraka, Warnakula, Tharindu, Li, Zhidong, Wang, Yang
Australian water infrastructure is more than a hundred years old, thus has begun to show its age through water main failures. Our work concerns approximately half a million pipelines across major Australian cities that deliver water to houses and businesses, serving over five million customers. Failures on these buried assets cause damage to properties and water supply disruptions. We applied Machine Learning techniques to find a cost-effective solution to the pipe failure problem in these Australian cities, where on average 1500 of water main failures occur each year. To achieve this objective, we construct a detailed picture and understanding of the behaviour of the water pipe network by developing a Machine Learning model to assess and predict the failure likelihood of water main breaking using historical failure records, descriptors of pipes and other environmental factors. Our results indicate that our system incorporating a nonparametric survival analysis technique called "Random Survival Forest" outperforms several popular algorithms and expert heuristics in long-term prediction. In addition, we construct a statistical inference technique to quantify the uncertainty associated with the long-term predictions.
Supervised PCA: A Multiobjective Approach
Ritchie, Alexander, Balzano, Laura, Scott, Clayton
Methods for supervised principal component analysis (SPCA) aim to incorporate label information into principal component analysis (PCA), so that the extracted features are more useful for a prediction task of interest. Prior work on SPCA has focused primarily on optimizing prediction error, and has neglected the value of maximizing variance explained by the extracted features. We propose a new method for SPCA that addresses both of these objectives jointly, and demonstrate empirically that our approach dominates existing approaches, i.e., outperforms them with respect to both prediction error and variation explained. Our approach accommodates arbitrary supervised learning losses and, through a statistical reformulation, provides a novel low-rank extension of generalized linear models.
Automatic Detection of Influential Actors in Disinformation Networks
Smith, Steven T., Kao, Edward K., Mackin, Erika D., Shah, Danelle C., Simek, Olga, Rubin, Donald B.
The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IO). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and a novel network causal inference approach to quantify the impact of individual actors in spreading IO narratives. We demonstrate its capability on real-world hostile IO campaigns with Twitter datasets collected during the 2017 French presidential elections, and known IO accounts disclosed by Twitter over a broad range of IO campaigns (May 2007-February 2020), over 50 thousand accounts, 17 countries, and different account types including both trolls and bots. Our system detects IO accounts with 96% precision, 79% recall, and 96% area-under-the-PR-curve, maps out salient network communities, and discovers high-impact accounts that escape the lens of traditional impact statistics based on activity counts and network centrality. Results are corroborated with independent sources of known IO accounts from U.S. Congressional reports, investigative journalism, and IO datasets provided by Twitter.
Differentially Private Synthetic Data: Applied Evaluations and Enhancements
Rosenblatt, Lucas, Liu, Xiaoyan, Pouyanfar, Samira, de Leon, Eduardo, Desai, Anuj, Allen, Joshua
Machine learning practitioners frequently seek to leverage the most informative available data, without violating the data owner's privacy, when building predictive models. Differentially private data synthesis protects personal details from exposure, and allows for the training of differentially private machine learning models on privately generated datasets. But how can we effectively assess the efficacy of differentially private synthetic data? In this paper, we survey four differentially private generative adversarial networks for data synthesis. We evaluate each of them at scale on five standard tabular datasets, and in two applied industry scenarios. Our results suggest some synthesizers are more applicable for different privacy budgets, and we further demonstrate complicating domain-based tradeoffs in selecting an approach. We offer experimental learning on applied machine learning scenarios with private internal data to researchers and practioners alike. In addition, we propose QUAIL, an ensemble-based modeling approach to generating synthetic data. We examine QUAIL's tradeoffs, and note circumstances in which it outperforms baseline differentially private supervised learning models under the same budget constraint. Maintaining an individual's privacy is a major concern when collecting sensitive information from groups or organizations. A formalization of privacy, known as differential privacy, has become the gold standard with which to protect information from malicious agents (Dwork et al., TAMC 2008).
Resource Constrained Dialog Policy Learning via Differentiable Inductive Logic Programming
Zhou, Zhenpeng, Beirami, Ahmad, Crook, Paul, Shah, Pararth, Subba, Rajen, Geramifard, Alborz
Motivated by the needs of resource constrained dialog policy learning, we introduce dialog policy via differentiable inductive logic (DILOG). We explore the tasks of one-shot learning and zero-shot domain transfer with DILOG on SimDial and MultiWoZ. Using a single representative dialog from the restaurant domain, we train DILOG on the SimDial dataset and obtain 99 % in-domain test accuracy. We also show that the trained DILOG zero-shot transfers to all other domains with 99 % accuracy, proving the suitability of DILOG to slot-filling dialogs. We further extend our study to the MultiWoZ dataset achieving 90 % inform and success metrics. We also observe that these metrics are not capturing some of the shortcomings of DILOG in terms of false positives, prompting us to measure an auxiliary Action F1 score. We show that DILOG is 100x more data efficient than state-of-the-art neural approaches on MultiWoZ while achieving similar performance metrics. We conclude with a discussion on the strengths and weaknesses of DILOG.