Performance Analysis
Interpretable and synergistic deep learning for visual explanation and statistical estimations of segmentation of disease features from medical images
Ghosal, Sambuddha, Shah, Pratik
Deep learning (DL) models for disease classification or segmentation from medical images are increasingly trained using transfer learning (TL) from unrelated natural world images. However, the shortcomings and utility of TL for specialized tasks in the medical imaging domain remain unknown and are based on assumptions that increasing training data will improve performance. We report detailed comparisons and rigorous statistical analysis of widely used DL architecture for binary segmentation after TL with ImageNet initialization (TII-models) versus supervised learning with only medical images (LMI-models) of macroscopic optical skin cancer, microscopic prostate core biopsy and Computed Tomography (CT) DICOM images. Through visual inspection of TII and LMI model outputs and their Grad-CAM counterparts, our results identify several counterintuitive scenarios where automated segmentation of one tumor by both models or the use of individual segmentation output masks in various combinations from individual models leads to a 10% increase in performance. We also report sophisticated ensemble DL strategies for achieving clinical grade medical image segmentation and model explanations under low data regimes. For example, estimating performance, explanations and replicability of LMI and TII models described by us can be used for situations in which sparsity promotes better learning. A free GitHub repository of TII and LMI models, code and more than 10,000 medical images and their Grad-CAM output from this study can be used as starting points for advanced computational medicine and DL research for biomedical discovery and applications.
Building an Automated and Self-Aware Anomaly Detection System
Chakraborty, Sayan, Shah, Smit, Soltani, Kiumars, Swigart, Anna, Yang, Luyao, Buckingham, Kyle
Organizations rely heavily on time series metrics to measure and model key aspects of operational and business performance. The ability to reliably detect issues with these metrics is imperative to identifying early indicators of major problems before they become pervasive. It can be very challenging to proactively monitor a large number of diverse and constantly changing time series for anomalies, so there are often gaps in monitoring coverage, disabled or ignored monitors due to false positive alarms, and teams resorting to manual inspection of charts to catch problems. Traditionally, variations in the data generation processes and patterns have required strong modeling expertise to create models that accurately flag anomalies. In this paper, we describe an anomaly detection system that overcomes this common challenge by keeping track of its own performance and making changes as necessary to each model without requiring manual intervention. We demonstrate that this novel approach outperforms available alternatives on benchmark datasets in many scenarios.
Long-Term Pipeline Failure Prediction Using Nonparametric Survival Analysis
Weeraddana, Dilusha, MallawaArachchi, Sudaraka, Warnakula, Tharindu, Li, Zhidong, Wang, Yang
Australian water infrastructure is more than a hundred years old and has begun to show its age through water main failures. Our work concerns approximately half a million pipelines across major Australian cities that deliver water to houses and businesses, serving over five million customers. Failures on these buried assets cause damage to properties and water supply disruptions. We applied Machine Learning techniques to find a cost-effective solution to the pipe failure problem in these Australian cities, where on average 1,500 water main failures occur each year. To achieve this objective, we construct a detailed picture and understanding of the behaviour of the water pipe network by developing a Machine Learning model to assess and predict the likelihood of water main failure using historical failure records, descriptors of pipes and other environmental factors. Our results indicate that our system incorporating a nonparametric survival analysis technique called "Random Survival Forest" outperforms several popular algorithms and expert heuristics in long-term prediction. In addition, we construct a statistical inference technique to quantify the uncertainty associated with the long-term predictions.
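As a rough illustration of what "nonparametric survival analysis" means for pipe failure records, the sketch below computes a Kaplan-Meier survival curve. This is a deliberately simpler estimator than the Random Survival Forest named in the abstract, and it is not the authors' model. The function name `kaplan_meier` and the toy ages are hypothetical; a pipe that has not failed by the end of observation is treated as right-censored.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function S(t).

    durations: age (e.g. in years) at failure or at end of observation.
    observed:  True if the pipe failed at that age, False if censored.
    Returns a list of (t, S(t)) pairs, one per distinct failure time.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival, curve = 1.0, []
    for t in event_times:
        # Pipes still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Pipes that actually failed at time t.
        failures = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - failures / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data: five pipes, two of them still intact (censored).
toy_durations = [5, 8, 8, 12, 15]
toy_observed = [True, True, False, True, False]
toy_curve = kaplan_meier(toy_durations, toy_observed)
```

The estimator makes no assumption about the shape of the failure-time distribution, which is the sense in which it is nonparametric; a Random Survival Forest additionally conditions on covariates such as pipe descriptors.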
Supervised PCA: A Multiobjective Approach
Ritchie, Alexander, Balzano, Laura, Scott, Clayton
Methods for supervised principal component analysis (SPCA) aim to incorporate label information into principal component analysis (PCA), so that the extracted features are more useful for a prediction task of interest. Prior work on SPCA has focused primarily on optimizing prediction error, and has neglected the value of maximizing variance explained by the extracted features. We propose a new method for SPCA that addresses both of these objectives jointly, and demonstrate empirically that our approach dominates existing approaches, i.e., outperforms them with respect to both prediction error and variation explained. Our approach accommodates arbitrary supervised learning losses and, through a statistical reformulation, provides a novel low-rank extension of generalized linear models.
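The tension between the two objectives named above can be made concrete by scoring any candidate projection on both. The sketch below is not the authors' method: it only evaluates variance explained and least-squares prediction error for a given orthonormal projection. The function name `evaluate_projection` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def evaluate_projection(X, y, L):
    """Score a projection on both objectives.

    X: (n, d) centered data; y: (n,) centered target;
    L: (d, k) matrix with orthonormal columns.
    Returns (fraction of variance explained, mean squared prediction error).
    """
    Z = X @ L  # extracted features
    variance_explained = float(np.sum(Z ** 2) / np.sum(X ** 2))
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)  # least-squares fit on Z
    mse = float(np.mean((y - Z @ beta) ** 2))
    return variance_explained, mse


# Synthetic data (assumption): a high-variance feature unrelated to y
# and a low-variance feature that determines y exactly.
rng = np.random.default_rng(0)
X = np.column_stack([3.0 * rng.normal(size=500), rng.normal(size=500)])
X -= X.mean(axis=0)
y = X[:, 1].copy()

# Plain PCA direction: top right singular vector of the centered data.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
L_pca = Vt[:1].T
# Label-driven direction: the axis that carries y.
L_label = np.array([[0.0], [1.0]])

pca_var, pca_mse = evaluate_projection(X, y, L_pca)
label_var, label_mse = evaluate_projection(X, y, L_label)
```

On this toy data the PCA direction wins on variance explained while the label-driven direction wins on prediction error, which is the kind of trade-off a multiobjective formulation has to balance.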
Automatic Detection of Influential Actors in Disinformation Networks
Smith, Steven T., Kao, Edward K., Mackin, Erika D., Shah, Danelle C., Simek, Olga, Rubin, Donald B.
The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IO). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and a novel network causal inference approach to quantify the impact of individual actors in spreading IO narratives. We demonstrate its capability on real-world hostile IO campaigns with Twitter datasets collected during the 2017 French presidential elections, and known IO accounts disclosed by Twitter over a broad range of IO campaigns (May 2007-February 2020), over 50 thousand accounts, 17 countries, and different account types including both trolls and bots. Our system detects IO accounts with 96% precision, 79% recall, and 96% area-under-the-PR-curve, maps out salient network communities, and discovers high-impact accounts that escape the lens of traditional impact statistics based on activity counts and network centrality. Results are corroborated with independent sources of known IO accounts from U.S. Congressional reports, investigative journalism, and IO datasets provided by Twitter.
Machine-Learning Dessins d'Enfants: Explorations via Modular and Seiberg-Witten Curves
He, Yang-Hui, Hirst, Edward, Peterken, Toby
Having learnt of the remarkable theorem of Belyĭ [1] which relates the existence of algebraic models of Riemann surfaces to that of analytic properties of rational functions thereon, Grothendieck launched an entire programme [2] by pictorially representing this structure as bipartite graphs (the dessin) drawn on the Riemann surface. He hypothesised dessins d'enfants in their current form as a conceptual representation of the absolute Galois group over the rationals, one of the most mysterious and least understood objects in number theory. Subsequently, he developed a generalisation of Belyĭ's theorem which extends the surfaces considered in the mapping to more general Riemann surfaces. Properties of the mapping are identified with combinatorial invariants of the dessin d'enfant graphs [2] (q.v.
Differentially Private Synthetic Data: Applied Evaluations and Enhancements
Rosenblatt, Lucas, Liu, Xiaoyan, Pouyanfar, Samira, de Leon, Eduardo, Desai, Anuj, Allen, Joshua
Machine learning practitioners frequently seek to leverage the most informative available data, without violating the data owner's privacy, when building predictive models. Differentially private data synthesis protects personal details from exposure, and allows for the training of differentially private machine learning models on privately generated datasets. But how can we effectively assess the efficacy of differentially private synthetic data? In this paper, we survey four differentially private generative adversarial networks for data synthesis. We evaluate each of them at scale on five standard tabular datasets, and in two applied industry scenarios. Our results suggest some synthesizers are more applicable for different privacy budgets, and we further demonstrate complicating domain-based tradeoffs in selecting an approach. We offer experimental learning on applied machine learning scenarios with private internal data to researchers and practitioners alike. In addition, we propose QUAIL, an ensemble-based modeling approach to generating synthetic data. We examine QUAIL's tradeoffs, and note circumstances in which it outperforms baseline differentially private supervised learning models under the same budget constraint. Maintaining an individual's privacy is a major concern when collecting sensitive information from groups or organizations. A formalization of privacy, known as differential privacy, has become the gold standard with which to protect information from malicious agents (Dwork et al., TAMC 2008).
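To make the notion of a privacy budget concrete, here is a minimal sketch of the Laplace mechanism, the textbook way to answer a counting query under epsilon-differential privacy. It is not one of the GAN-based synthesizers surveyed above, nor QUAIL. The function name `laplace_count` and the toy ages are hypothetical. A counting query has sensitivity 1, so the noise scale is 1 / epsilon.

```python
import numpy as np


def laplace_count(values, predicate, epsilon, rng):
    """Release a noisy count of the items satisfying `predicate`.

    Adding or removing one record changes the true count by at most 1
    (sensitivity 1), so Laplace noise with scale 1 / epsilon gives
    epsilon-differential privacy for this single query.
    """
    true_count = sum(1 for v in values if predicate(v))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise


# Hypothetical toy data: ages of eight individuals.
ages = [23, 35, 41, 52, 29, 64, 38, 47]
rng = np.random.default_rng(0)
noisy_over_40 = laplace_count(ages, lambda a: a > 40, epsilon=1.0, rng=rng)
```

A smaller epsilon means a tighter privacy budget and noisier answers, which is the same utility-versus-privacy trade-off the evaluated synthesizers face at larger scale.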
Resource Constrained Dialog Policy Learning via Differentiable Inductive Logic Programming
Zhou, Zhenpeng, Beirami, Ahmad, Crook, Paul, Shah, Pararth, Subba, Rajen, Geramifard, Alborz
Motivated by the needs of resource constrained dialog policy learning, we introduce dialog policy via differentiable inductive logic (DILOG). We explore the tasks of one-shot learning and zero-shot domain transfer with DILOG on SimDial and MultiWoZ. Using a single representative dialog from the restaurant domain, we train DILOG on the SimDial dataset and obtain 99% in-domain test accuracy. We also show that the trained DILOG zero-shot transfers to all other domains with 99% accuracy, proving the suitability of DILOG to slot-filling dialogs. We further extend our study to the MultiWoZ dataset achieving 90% inform and success metrics. We also observe that these metrics are not capturing some of the shortcomings of DILOG in terms of false positives, prompting us to measure an auxiliary Action F1 score. We show that DILOG is 100x more data efficient than state-of-the-art neural approaches on MultiWoZ while achieving similar performance metrics. We conclude with a discussion on the strengths and weaknesses of DILOG.
HHAR-net: Hierarchical Human Activity Recognition using Neural Networks
Fazli, Mehrdad, Kowsari, Kamran, Gharavi, Erfaneh, Barnes, Laura, Doryab, Afsaneh
Activity recognition using built-in sensors in smart and wearable devices provides great opportunities to understand and detect human behavior in the wild and gives a more holistic view of individuals' health and well-being. Numerous computational methods have been applied to sensor streams to recognize different daily activities. However, most methods are unable to capture different layers of activities concealed in human behavior. Also, the performance of the models starts to decrease as the number of activities increases. This research aims at building a hierarchical classification with Neural Networks to recognize human activities based on different levels of abstraction. We evaluate our model on the Extrasensory dataset, a dataset collected in the wild and containing data from smartphones and smartwatches. We use a two-level hierarchy with a total of six mutually exclusive labels, namely "lying down", "sitting", "standing in place", "walking", "running", and "bicycling", divided into "stationary" and "non-stationary". The results show that our model can recognize low-level activities (stationary/non-stationary) with 95.8% accuracy and overall accuracy of 92.8% over six labels. This is 3% above our best performing baseline.
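The two-level scheme described above can be sketched as a routing step: a top-level model picks the group, then a group-specific model picks the activity. The sketch below is not the authors' network. The assignment of the six labels to the two groups is an assumption inferred from their meaning, and the models are stand-in callables that return one score per class.

```python
# Assumed grouping of the six labels (not spelled out in the abstract).
HIERARCHY = {
    "stationary": ["lying down", "sitting", "standing in place"],
    "non-stationary": ["walking", "running", "bicycling"],
}


def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def predict_hierarchical(x, top_model, leaf_models):
    """Route x through the hierarchy and return (group, activity).

    top_model(x) returns one score per group, in HIERARCHY order.
    leaf_models[group](x) returns one score per activity in that group.
    """
    groups = list(HIERARCHY)
    group = groups[argmax(top_model(x))]
    activity = HIERARCHY[group][argmax(leaf_models[group](x))]
    return group, activity


# Stand-in scorers (hypothetical); real ones would be trained networks.
toy_top = lambda x: [0.2, 0.8]
toy_leaves = {
    "stationary": lambda x: [0.6, 0.3, 0.1],
    "non-stationary": lambda x: [0.1, 0.7, 0.2],
}
toy_prediction = predict_hierarchical(None, toy_top, toy_leaves)
```

Because each leaf model only has to separate three labels, a mistake at the top level is the main way an input can end up with a label from the wrong group.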
Energy-based Out-of-distribution Detection
Liu, Weitang, Wang, Xiaoyun, Owens, John D., Li, Yixuan
Determining whether inputs are out-of-distribution (OOD) is an essential building block for safely deploying machine learning models in the open world. However, previous methods relying on the softmax confidence score suffer from overconfident posterior distributions for OOD data. We propose a unified framework for OOD detection that uses an energy score. We show that energy scores better distinguish in- and out-of-distribution samples than the traditional approach using the softmax scores. Unlike softmax confidence scores, energy scores are theoretically aligned with the probability density of the inputs and are less susceptible to the overconfidence issue. Within this framework, energy can be flexibly used as a scoring function for any pre-trained neural classifier as well as a trainable cost function to shape the energy surface explicitly for OOD detection. On a CIFAR-10 pre-trained WideResNet, using the energy score reduces the average FPR (at TPR 95%) by 18.03% compared to the softmax confidence score. With energy-based training, our method outperforms the state-of-the-art on common benchmarks.
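A minimal sketch of the scoring idea follows. The abstract does not give the formula, so this assumes the usual definition of the energy of a classifier's logits f(x) as E(x) = -T * log(sum_i exp(f_i(x) / T)) and returns the negative energy, so that higher means more in-distribution. The function names and the toy logits are illustrative, not taken from the authors' code.

```python
import numpy as np


def energy_score(logits, temperature=1.0):
    """Negative energy T * logsumexp(logits / T); higher = more in-distribution."""
    z = np.asarray(logits, dtype=float) / temperature
    m = z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return temperature * lse


def max_softmax(logits):
    """Softmax confidence baseline: the largest class probability."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return p.max(axis=-1)


# Two toy logit vectors that differ only by a constant shift of 9.
large_logits = np.array([10.0, 0.0, 0.0])
small_logits = np.array([1.0, -9.0, -9.0])

same_confidence = np.isclose(max_softmax(large_logits), max_softmax(small_logits))
energy_gap = energy_score(large_logits) - energy_score(small_logits)
```

Softmax is invariant to adding a constant to every logit, so both toy inputs receive the same confidence, while the energy score keeps the overall logit magnitude and separates them. Thresholding the score then gives a detector that needs no retraining of the classifier.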