Performance Analysis
The Prominence of Artificial Intelligence in COVID-19
Nasim, MD Abdullah Al, Dhali, Aditi, Afrin, Faria, Zaman, Noshin Tasnim, Karim, Nazmul
Since December 2019, a novel virus called COVID-19 has caused an enormous number of casualties. The battle with the novel coronavirus is the most baffling and horrifying since the Spanish Flu of 1918. While front-line doctors and medical researchers have made significant progress in controlling the spread of the highly contagious virus, technology has also proved its significance in the battle. Moreover, Artificial Intelligence has been adopted in many medical applications to diagnose many diseases, including ones that baffle experienced doctors. Therefore, this survey paper explores the proposed methodologies that can aid doctors and researchers in early and inexpensive diagnosis of the disease. Most developing countries have difficulties carrying out tests in the conventional manner, but Machine Learning and Deep Learning offer a significant alternative. In addition, access to different types of medical images has motivated researchers. As a result, a very large number of techniques have been proposed. This paper first details the background knowledge of the conventional methods in the Artificial Intelligence domain. Following that, we gather the commonly used datasets and their use cases to date. We also show the percentage of researchers adopting Machine Learning over Deep Learning. Thus we provide a thorough analysis of this scenario. Lastly, in the research challenges, we elaborate on the problems faced in COVID-19 research, and we address the issues with our understanding to build a bright and healthy environment.
UN-AVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring
Yousef, Waleed A., Traore, Issa, Briguglio, William
The visualization and detection of anomalies (outliers) are of crucial importance to many fields, particularly cybersecurity. Several approaches have been proposed in these fields, yet to the best of our knowledge, none of them has fulfilled both objectives, simultaneously or cooperatively, in one coherent framework. The visualization methods of these approaches were introduced for explaining the output of a detection algorithm, not for data exploration that facilitates a standalone visual detection. This is our point of departure: UN-AVOIDS, an unsupervised and nonparametric approach for both visualization (a human process) and detection (an algorithmic process) of outliers, that assigns invariant anomaly scores (normalized to $[0,1]$), rather than a hard binary decision. The main novelty of UN-AVOIDS is that it transforms data into a new space, introduced in this paper as the neighborhood cumulative density function (NCDF), in which both visualization and detection are carried out. In this space, outliers are remarkably visually distinguishable, and therefore the anomaly scores assigned by the detection algorithm achieved a high area under the ROC curve (AUC). We assessed UN-AVOIDS on both simulated data and two recently published cybersecurity datasets, and compared it to three of the most successful anomaly detection methods: LOF, IF, and FABOD. In terms of AUC, UN-AVOIDS was almost an overall winner. The article concludes by providing a preview of new theoretical and practical avenues for UN-AVOIDS. Among them is designing a visualization-aided anomaly detection (VAAD) tool, a type of software that aids analysts by providing UN-AVOIDS' detection algorithm (running in a back engine) and the NCDF visualization space (rendered to plots), along with other conventional methods of visualization in the original feature space, all of which are linked in one interactive environment.
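To make the idea of normalized anomaly scores evaluated by AUC concrete, here is a minimal sketch. It does not implement the NCDF transform, which the abstract does not specify; it substitutes a plain k-nearest-neighbour distance score, and the synthetic data, the function names and the choice of `k` are all assumptions made for illustration.

```python
import math
import random

def knn_scores(points, k=5):
    """Mean distance to the k nearest neighbours, min-max scaled to [0, 1].
    This is a stand-in scorer, not the NCDF-based score from the paper."""
    raw = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        raw.append(sum(dists[:k]) / k)
    lo, hi = min(raw), max(raw)
    return [(r - lo) / (hi - lo) for r in raw]

def auc(scores, labels):
    """Probability that a random outlier scores above a random inlier (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic data (assumed): a Gaussian cluster plus five isolated points.
rng = random.Random(0)
inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
outliers = [(8, 8), (-8, 8), (8, -8), (-8, -8), (0, 10)]
points = inliers + outliers
labels = [0] * len(inliers) + [1] * len(outliers)

scores = knn_scores(points, k=5)
print(f"AUC = {auc(scores, labels):.3f}")
```

Because the scores are continuous rather than binary, an analyst can rank records or pick a threshold afterwards, which is the property the abstract emphasizes.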
Enhanced Membership Inference Attacks against Machine Learning Models
Ye, Jiayuan, Maddi, Aadyaa, Murakonda, Sasi Kumar, Shokri, Reza
How much does a given trained model leak about each individual data record in its training set? Membership inference attacks are used as an auditing tool to quantify the private information that a model leaks about the individual data points in its training set. Membership inference attacks are influenced by different uncertainties that an attacker has to resolve about the training data, the training algorithm, and the underlying data distribution. Thus the success rates of many attacks in the literature do not precisely capture the information leakage of models about their data, as they also reflect other uncertainties that the attack algorithm has. In this paper, we explain the implicit assumptions and the simplifications made in prior work using the framework of hypothesis testing. We also derive new attack algorithms from the framework that can achieve a high AUC score while also highlighting the different factors that affect their performance. Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models. We provide a thorough empirical evaluation of our attack strategies on various machine learning tasks and benchmark datasets.
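The hypothesis-testing view can be illustrated with a generic loss-threshold test. This is a sketch under stated assumptions, not one of the attack algorithms derived in the paper. The null hypothesis is "the record is not a member", the test statistic is the model's loss on the record, and the threshold is chosen from reference non-member losses to hit a target false-positive rate `alpha`. The synthetic loss distributions and all names are hypothetical.

```python
import random

def threshold_for_fpr(nonmember_losses, alpha):
    """Loss value below which at most a fraction alpha of reference non-members fall."""
    ordered = sorted(nonmember_losses)
    cut = min(int(alpha * len(ordered)), len(ordered) - 1)
    return ordered[cut]

def is_member(loss, threshold):
    """Reject the null 'not a member' when the loss is unusually low."""
    return loss < threshold

def rate(losses, threshold):
    """Fraction of records that the test labels as members."""
    return sum(is_member(l, threshold) for l in losses) / len(losses)

# Synthetic losses (assumed): training members tend to have lower loss.
rng = random.Random(0)
reference = [abs(rng.gauss(1.0, 0.4)) for _ in range(1000)]   # known non-members
members = [abs(rng.gauss(0.2, 0.1)) for _ in range(1000)]
nonmembers = [abs(rng.gauss(1.0, 0.4)) for _ in range(1000)]  # held-out non-members

alpha = 0.05
threshold = threshold_for_fpr(reference, alpha)
tpr = rate(members, threshold)
fpr = rate(nonmembers, threshold)
print(f"threshold={threshold:.3f}  TPR={tpr:.3f}  FPR={fpr:.3f}")
```

Sweeping `alpha` from 0 to 1 traces out a ROC curve, and the area under it is the AUC figure that attack evaluations of this kind typically report.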
DeepGuard: A Framework for Safeguarding Autonomous Driving Systems from Inconsistent Behavior
Hussain, Manzoor, Ali, Nazakat, Hong, Jang-Eui
Abstract: Deep neural network (DNN)-based autonomous driving systems (ADSs) are expected to reduce road accidents and improve safety in the transportation domain, as they remove the factor of human error from driving tasks. A DNN-based ADS may sometimes exhibit erroneous or unexpected behaviours due to unexpected driving conditions, which may cause accidents. Therefore, safety assurance is vital to the ADS. However, a DNN-based ADS is a highly complex system that puts forward a strong demand for robustness, more specifically, the ability to predict unexpected driving conditions to prevent potential inconsistent behaviour. It is not possible to generalize the DNN model's performance to all driving conditions. Therefore, driving conditions that were not considered during the training of the ADS may lead to unpredictable consequences for the safety of autonomous vehicles. This study proposes an autoencoder and time series analysis-based anomaly detection system to prevent the safety-critical inconsistent behaviour of autonomous vehicles at runtime. Our approach, called DeepGuard, consists of two components. The first component, the inconsistent behaviour predictor, is based on an autoencoder and time series analysis to reconstruct the driving scenarios. Based on the reconstruction error (e) and a threshold (θ), it distinguishes normal from unexpected driving scenarios and predicts potential inconsistent behaviour. The second component provides on-the-fly safety guards, that is, it automatically activates healing strategies to prevent inconsistencies in the behaviour. We evaluated the performance of DeepGuard in predicting the injected anomalous driving scenarios using already available open-sourced DNN-based ADSs in the Udacity simulator. Our simulation results show that the best variant of DeepGuard can predict up to 93% of inconsistent behaviour on the CHAUFFEUR ADS, 83% on the DAVE-2 ADS, and 80% on the EPOCH ADS model, outperforming SELFORACLE and DeepRoad. Overall, DeepGuard can prevent up to 89% of all predicted inconsistent behaviours of ADS by executing predefined safety guards.
I. INTRODUCTION
Autonomous vehicles are one of the most promising applications of artificial intelligence. They are expected to bring a technological revolution to the transportation industry in the near future. Autonomous driving systems (ADSs) use sensors such as cameras, radar, LiDAR, and GPS to automatically produce driving parameters such as vehicle velocity, throttle, brakes, steering angles, and directions. Advancements in deep learning have driven progress in autonomous systems, such as autonomous vehicles and unmanned aerial vehicles.
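The DeepGuard abstract describes comparing a reconstruction error e against a threshold θ. A minimal sketch of that decision step follows. The autoencoder itself is omitted and the errors are hard-coded. The rule for choosing θ (mean plus three standard deviations of nominal errors) and the trailing moving average standing in for the time series analysis are both assumptions, since the abstract does not give them.

```python
from statistics import mean, stdev

def reconstruction_error(frame, reconstructed):
    """Mean squared error between an input and its reconstruction."""
    return mean((a - b) ** 2 for a, b in zip(frame, reconstructed))

def fit_threshold(nominal_errors, n_sigma=3.0):
    """theta = mean + n_sigma * std of errors seen on nominal driving (assumed rule)."""
    return mean(nominal_errors) + n_sigma * stdev(nominal_errors)

def smooth(errors, window=3):
    """Trailing moving average over the most recent `window` errors."""
    return [mean(errors[max(0, i - window + 1): i + 1]) for i in range(len(errors))]

def flag_unexpected(errors, theta, window=3):
    """True wherever the smoothed error e exceeds the threshold theta."""
    return [e > theta for e in smooth(errors, window)]

# Hypothetical per-frame errors: nominal driving, then a run with an injected anomaly.
nominal = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.11]
runtime = [0.10, 0.11, 0.12, 0.45, 0.60, 0.70, 0.12, 0.10]

theta = fit_threshold(nominal)
flags = flag_unexpected(runtime, theta)
print(f"theta = {theta:.3f}")
print(flags)
```

In a system like the one described, a raised flag would be the trigger for a safety guard. The smoothing trades a few frames of lag after the anomaly ends for fewer false alarms from single noisy frames.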
Airport Taxi Time Prediction and Alerting: A Convolutional Neural Network Approach
Vargo, Erik, Tien, Alex, Jafari, Arian
Taxi-out time is an indicator of departure efficiency and is often the early signal of large holding and diversion events for airports that have constrained surface space. This is one of the real-time performance metrics that is of great interest to air traffic managers and flight dispatchers. For a busy airport that has limited tarmac space like LaGuardia Airport (LGA), an increasing average taxi-out time under a deteriorating visibility condition could soon lead to surface gridlock that would cause significant delays to both arrivals and departures. Thus, research is needed to develop an early alert or prediction of long taxi-out times to enable early delay mitigation actions. The problem of predicting taxi-out times has received considerable treatment in the aviation literature. Most research exploring the domain of taxi time prediction has focused on predicting taxi-out times for individual aircraft. Inaccurate taxi-out times can lead to a variety of National Airspace System (NAS) inefficiencies, such as a reduction in predictability for downstream Traffic Flow Management (TFM) applications and excess fuel consumption after push back from the gate. By better predicting aircraft-specific taxi-out times, informed updates can be made to the flight schedule to improve predictability and more efficiently use available NAS resources (e.g., capacity). Although our focus is on predicting average taxi-out time, it's worth reviewing the literature on aircraft-specific taxi-out time predictions for historical context.
Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence
Bai, Xiang, Wang, Hanchen, Ma, Liya, Xu, Yongchao, Gan, Jiefeng, Fan, Ziwei, Yang, Fan, Ma, Ke, Yang, Jiehua, Bai, Song, Shu, Chang, Zou, Xinyu, Huang, Renhao, Zhang, Changzheng, Liu, Xiaowu, Tu, Dandan, Xu, Chuou, Zhang, Wenqing, Wang, Xi, Chen, Anguo, Zeng, Yu, Yang, Dehua, Wang, Ming-Wei, Holalkere, Nagaraj, Halin, Neil J., Kamel, Ihab R., Wu, Jia, Peng, Xuehua, Wang, Xiang, Shao, Jianbo, Mongkolwat, Pattanasak, Zhang, Jianjun, Liu, Weiyang, Roberts, Michael, Teng, Zhongzhao, Beer, Lucian, Sanchez, Lorena Escudero, Sala, Evis, Rubin, Daniel, Weller, Adrian, Lasenby, Joan, Zheng, Chuangsheng, Wang, Jianming, Li, Zhen, Schönlieb, Carola-Bibiane, Xia, Tian
One sentence summary: An efficient and effective privacy-preserving AI framework is proposed for CT-based COVID-19 diagnosis, based on 9,573 CT scans of 3,336 patients, from 23 hospitals in China and the UK.
Abstract: Artificial intelligence (AI) provides a promising substitution for streamlining COVID-19 diagnoses. However, concerns surrounding security and trustworthiness impede the collection of large-scale representative medical data, posing a considerable challenge for training a well-generalised model in clinical practice. To address this, we launch the Unified CT-COVID AI Diagnostic Initiative (UCADI), where the AI model can be trained in a distributed manner and executed independently at each host institution under a federated learning (FL) framework without data sharing. Here we show that our FL model outperformed all the local models by a large margin (test sensitivity/specificity in China: 0.973/0.951, in the UK: 0.730/0.942). We further evaluated the model on hold-out data (collected from two other hospitals left out of the FL training) and heterogeneous data (acquired with contrast materials), provided visual explanations for decisions made by the model, and analysed the trade-offs between the model performance and the communication costs in the federated training process. Our study is based on 9,573 chest computed tomography scans (CTs) from 3,336 patients collected from 23 hospitals located in China and the UK. Collectively, our work advanced the prospects of utilising federated learning for privacy-preserving AI in digital health.
MAIN TEXT
Introduction
As the gold standard for identifying COVID-19 carriers, reverse transcription-polymerase chain reaction (RT-PCR) is the primary diagnostic modality to detect viral nucleotide in specimens from cases with suspected infection.
It has been reported that coronavirus carriers present certain radiological features in chest CTs, including ground-glass opacity, interlobular septal thickening, and consolidation, which can be exploited to identify COVID-19 cases.
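The federated training described in the abstract above, in which each institution trains locally and only model parameters are shared, can be sketched with federated averaging. The abstract does not say which aggregation rule UCADI uses, so the size-weighted average below is an assumption. The two "hospitals", their toy data and the linear model are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

def local_update(weights, data, lr=0.1):
    """One gradient step of least squares y ~ w0 + w1 * x on a client's private data."""
    g0 = g1 = 0.0
    for x, y in data:
        err = weights[0] + weights[1] * x - y
        g0 += err
        g1 += err * x
    n = len(data)
    return [weights[0] - lr * g0 / n, weights[1] - lr * g1 / n]

# Two hypothetical hospitals; raw (x, y) records never leave the client.
clients = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]

global_w = [0.0, 0.0]
for _ in range(500):  # communication rounds
    updates = [local_update(global_w, data) for data in clients]
    global_w = federated_average(updates, [len(d) for d in clients])

print([round(w, 3) for w in global_w])
```

Only the two-element parameter vectors cross the network in each round, so the number of rounds is the knob behind the trade-off between model performance and communication cost that the abstract mentions.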
Explaining Machine Learning by Bootstrapping Partial Dependence Functions and Shapley Values
Machine learning and artificial intelligence methods are often referred to as "black boxes" when compared with traditional regression-based approaches. However, both traditional and machine learning methods are concerned with modeling the joint distribution between endogenous (target) and exogenous (input) variables. Where linear models describe the fitted relationship between the target and input variables via the slope of that relationship (coefficient estimates), the same fitted relationship can be described rigorously for any machine learning model by first-differencing the partial dependence functions. Bootstrapping these first-differenced functionals provides standard errors and confidence intervals for the estimated relationships. We show that this approach replicates the point estimates of OLS coefficients and demonstrate how this generalizes to marginal relationships in machine learning and artificial intelligence models. We further discuss the relationship of partial dependence functions to Shapley value decompositions and explore how they can be used to further explain model outputs.
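The claim that first-differencing a partial dependence function recovers an OLS coefficient, and that bootstrapping it yields a standard error, can be checked on a toy problem. The sketch below is an illustration under assumptions. The data are synthetic, the helper names are hypothetical, and the bootstrap refits the model on each resample, which is one plausible reading of the procedure rather than the authors' exact recipe.

```python
import random
from statistics import mean, stdev

def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(X, y):
    """Return (predict, beta) with beta = [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return (lambda x: beta[0] + sum(b * v for b, v in zip(beta[1:], x))), beta

def partial_dependence(predict, X, j, value):
    """Average prediction with feature j forced to `value` in every row."""
    return mean(predict(x[:j] + [value] + x[j + 1:]) for x in X)

def pd_first_difference(predict, X, j, value, step=1.0):
    """Change in partial dependence per unit change of feature j."""
    return (partial_dependence(predict, X, j, value + step)
            - partial_dependence(predict, X, j, value)) / step

rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1.0 + 2.0 * a - 0.5 * b + rng.gauss(0, 0.3) for a, b in X]

predict, beta = fit_ols(X, y)
slope = pd_first_difference(predict, X, j=0, value=0.0)

# Bootstrap: resample rows, refit, recompute the first difference.
boot = []
for _ in range(200):
    idx = [rng.randrange(len(X)) for _ in X]
    Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
    pb, _ = fit_ols(Xb, yb)
    boot.append(pd_first_difference(pb, Xb, j=0, value=0.0))
se = stdev(boot)

print(f"OLS b1 = {beta[1]:.4f}, PD first difference = {slope:.4f}, bootstrap SE = {se:.4f}")
```

Nothing in `partial_dependence` depends on the model being linear, so the same two functions apply to any `predict` callable. For a nonlinear model the first difference would simply vary with `value`.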
Interpretable and Fair Boolean Rule Sets via Column Generation
Lawless, Connor, Dash, Sanjeeb, Gunluk, Oktay, Wei, Dennis
This paper considers the learning of Boolean rules in either disjunctive normal form (DNF, OR-of-ANDs, equivalent to decision rule sets) or conjunctive normal form (CNF, AND-of-ORs) as an interpretable model for classification. An integer program is formulated to optimally trade classification accuracy for rule simplicity. We also consider the fairness setting and extend the formulation to include explicit constraints on two different measures of classification parity: equality of opportunity and equalized odds. Column generation (CG) is used to efficiently search over an exponential number of candidate clauses (conjunctions or disjunctions) without the need for heuristic rule mining. This approach also bounds the gap between the selected rule set and the best possible rule set on the training data. To handle large datasets, we propose an approximate CG algorithm using randomization. Compared to three recently proposed alternatives, the CG algorithm dominates the accuracy-simplicity trade-off in 8 out of 16 datasets. When maximized for accuracy, CG is competitive with rule learners designed for this purpose, sometimes finding significantly simpler solutions that are no less accurate. Compared to other fair and interpretable classifiers, our method is able to find rule sets that meet stricter notions of fairness with a modest trade-off in accuracy.
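To show what a DNF rule set and an equality-of-opportunity constraint look like in practice, here is a small sketch. It only evaluates a fixed rule set; the integer program and column generation search are not shown. The feature names, the toy records, and the complexity measure (clauses plus conditions) are assumptions for illustration.

```python
def predict_dnf(rule_set, x):
    """A DNF rule set fires if ANY clause has ALL of its conditions satisfied."""
    return any(all(x[feat] == val for feat, val in clause.items()) for clause in rule_set)

def complexity(rule_set):
    """Number of clauses plus total number of conditions (assumed simplicity measure)."""
    return len(rule_set) + sum(len(clause) for clause in rule_set)

def true_positive_rate(rule_set, rows, labels):
    """Fraction of truly positive rows that the rule set predicts positive."""
    preds = [predict_dnf(rule_set, r) for r, y in zip(rows, labels) if y == 1]
    return sum(preds) / len(preds)

def equal_opportunity_gap(rule_set, rows, labels, group_key="group"):
    """Absolute difference in true positive rate between group 0 and group 1."""
    rates = []
    for g in (0, 1):
        idx = [i for i, r in enumerate(rows) if r[group_key] == g]
        rates.append(true_positive_rate(
            rule_set, [rows[i] for i in idx], [labels[i] for i in idx]))
    return abs(rates[0] - rates[1])

# (employed AND has_degree) OR owns_home -- hypothetical features
rules = [{"employed": 1, "has_degree": 1}, {"owns_home": 1}]

rows = [
    {"employed": 1, "has_degree": 1, "owns_home": 0, "group": 0},
    {"employed": 0, "has_degree": 0, "owns_home": 1, "group": 0},
    {"employed": 1, "has_degree": 0, "owns_home": 0, "group": 0},
    {"employed": 1, "has_degree": 1, "owns_home": 0, "group": 1},
    {"employed": 1, "has_degree": 0, "owns_home": 0, "group": 1},
    {"employed": 0, "has_degree": 0, "owns_home": 0, "group": 1},
]
labels = [1, 1, 0, 1, 1, 0]

print("complexity:", complexity(rules))
print("equal opportunity gap:", equal_opportunity_gap(rules, rows, labels))
```

A fairness-constrained learner of the kind described would only accept rule sets whose gap stays below a chosen tolerance, while trading accuracy against the complexity count.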
Machine Learning and Ensemble Approach Onto Predicting Heart Disease
The four essential chambers of one's heart that lie in the thoracic cavity are crucial for one's survival, yet ironically prove to be the most vulnerable. Cardiovascular disease (CVD), also commonly referred to as heart disease, has steadily grown to become the leading cause of death amongst humans over the past few decades. Taking this concerning statistic into consideration, it is evident that patients suffering from CVDs need a quick and correct diagnosis in order to facilitate early treatment and lessen the chances of fatality. This paper attempts to utilize the data provided to train classification models such as Logistic Regression, K Nearest Neighbors, Support Vector Machine, Decision Tree, Gaussian Naive Bayes, Random Forest, and Multi-Layer Perceptron (Artificial Neural Network), and eventually uses a soft voting ensemble technique in order to attain as many correct diagnoses as possible.
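Soft voting, as mentioned in the abstract above, averages the class probabilities predicted by each model and picks the class with the highest average, instead of counting one vote per model. The sketch below contrasts it with hard voting. The three probability vectors are made up and do not come from the paper's classifiers.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities across models; return (label, averaged probs)."""
    weights = weights or [1.0] * len(prob_lists)
    total = sum(weights)
    n_classes = len(prob_lists[0])
    avg = [
        sum(w * p[c] for w, p in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg

def hard_vote(prob_lists):
    """Majority vote over each model's most probable class."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in prob_lists]
    return max(set(votes), key=votes.count)

# Hypothetical outputs [P(no disease), P(disease)] from three classifiers.
probs = [[0.60, 0.40], [0.55, 0.45], [0.10, 0.90]]

label, avg = soft_vote(probs)
print("soft vote:", label, [round(a, 3) for a in avg])
print("hard vote:", hard_vote(probs))
```

Here two lukewarm models are outvoted by one confident model under soft voting, while hard voting sides with the majority. The approach is only sensible when the individual probabilities are reasonably well calibrated.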
Point detection through multi-instance deep heatmap regression for sutures in endoscopy
Sharan, Lalith, Romano, Gabriele, Brand, Julian, Kelm, Halvar, Karck, Matthias, De Simone, Raffaele, Engelhardt, Sandy
Purpose: Mitral valve repair is a complex minimally invasive surgery of the heart valve. In this context, suture detection from endoscopic images is a highly relevant task that provides quantitative information to analyse suturing patterns, assess prosthetic configurations and produce augmented reality visualisations. Facial or anatomical landmark detection tasks typically contain a fixed number of landmarks, and use regression or fixed heatmap-based approaches to localize the landmarks. However, in endoscopy, there is a varying number of sutures in every image, and the sutures may occur at any location in the annulus, as they are not semantically unique. Method: In this work, we formulate the suture detection task as a multi-instance deep heatmap regression problem, to identify entry and exit points of sutures. We extend our previous work, and introduce the novel use of a 2D Gaussian layer followed by a differentiable 2D spatial Soft-Argmax layer to function as a local non-maximum suppression. Results: We present extensive experiments with multiple heatmap distribution functions and two variants of the proposed model. In the intra-operative domain, Variant 1 showed a mean F1 improvement of +0.0422 over the baseline. Similarly, in the simulator domain, Variant 1 showed a mean F1 improvement of +0.0865 over the baseline. Conclusion: The proposed model shows an improvement over the baseline in the intra-operative and the simulator domains. The data is made publicly available within the scope of the MICCAI AdaptOR2021 Challenge https://adaptor2021.github.io/, and the code at https://github.com/Cardio-AI/suture-detection-pytorch/. DOI:10.1007/s11548-021-02523-w. The link to the open access article can be found here: https://link.springer.com/article/10.1007%2Fs11548-021-02523-w
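The two building blocks named in the Method, a 2D Gaussian heatmap and a differentiable soft-argmax, can be sketched in plain Python. This is a simplified illustration and not the network layers from the linked repository. The grid size, `sigma` and the sharpness factor `beta` are assumed values, and the soft-argmax is applied to the whole map with a single peak, whereas the paper applies it locally so that several sutures can coexist.

```python
import math

def gaussian_heatmap(height, width, points, sigma=1.0):
    """Sum of 2D Gaussians, one per (row, col) point; any number of points is allowed."""
    return [
        [
            sum(math.exp(-((r - pr) ** 2 + (c - pc) ** 2) / (2 * sigma ** 2))
                for pr, pc in points)
            for c in range(width)
        ]
        for r in range(height)
    ]

def soft_argmax(heatmap, beta=50.0):
    """Softmax-weighted mean of pixel coordinates; differentiable, unlike argmax."""
    peak = max(max(row) for row in heatmap)
    wsum = rsum = csum = 0.0
    for r, row in enumerate(heatmap):
        for c, v in enumerate(row):
            w = math.exp(beta * (v - peak))  # subtracting the peak avoids overflow
            wsum += w
            rsum += w * r
            csum += w * c
    return rsum / wsum, csum / wsum

# One hypothetical suture point at row 3, column 7 on an 11 x 11 grid.
heatmap = gaussian_heatmap(11, 11, [(3, 7)])
row, col = soft_argmax(heatmap)
print(f"recovered point: ({row:.3f}, {col:.3f})")
```

With two or more points in `points`, a global soft-argmax would return a blend of the peaks. That blending is why a local, per-window application is needed for the multi-instance setting the abstract describes.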