Regression
Work In Progress: Safety and Robustness Verification of Autoencoder-Based Regression Models using the NNV Tool
Pal, Neelanjana, Johnson, Taylor T
State-of-the-art, well-trained neural networks (NNs) can easily be attacked by small perturbations in their inputs, leading to significant aberrations in their outputs [14, 23, 33]. These input perturbations are not limited to image-based networks; they also apply to other input types, e.g., time-series data or input signals. Such a lack of robustness poses serious risks to information integrity, privacy, and security, and can be catastrophic in safety-critical applications [11, 29]. Verification of NNs with image inputs is a rapidly growing research area, with recent and ongoing work on safety and robustness checking of feedforward (FFNN), convolutional (CNN), and semantic segmentation networks (SSN), but less has been done in the domain of autoencoder verification. Classification models using autoencoders work much like usual classifiers, but new research is needed to develop verification techniques for regression models. Regression-based autoencoders regenerate the input at the output, so verification techniques can check whether the recreated output stays within an accepted range of the unperturbed input when there is a fault or attack on the input side. In prior work, the authors of [36] introduced a framework for NN verification, the Neural Network Verification (NNV) tool [38], capable of evaluating the robustness of several DNN architectures, e.g., FFNN, CNN, and SSN. Later, a new set-based approach, ImageStar [34, 36], was also incorporated into this tool. In this work-in-progress paper, we explore similar methods in the context of autoencoder verification by experimenting on a sampled dataset and checking whether the output lies within a pre-determined safe threshold around the corresponding unperturbed input values, given a specific type of fault in the input.
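The safety property described in this abstract (the reconstruction stays within a threshold of the unperturbed input for every input in a perturbation set) can be illustrated with a minimal sketch. This is not the NNV tool or its star-set or ImageStar methods. It swaps in plain interval bound propagation, a coarser set-based technique, and assumes a hypothetical autoencoder with one affine plus ReLU encoder layer and one affine decoder layer. All function names and parameters are assumptions for illustration.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]  # (weights, bias); hypothetical layer format


def affine_bounds(lo: Vector, hi: Vector, weights: Matrix, bias: Vector) -> Tuple[Vector, Vector]:
    """Propagate the box [lo, hi] through y = W x + b with interval arithmetic."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        low = high = b
        for w, l, h in zip(row, lo, hi):
            # A positive weight maps lower to lower; a negative weight swaps the ends.
            if w >= 0:
                low += w * l
                high += w * h
            else:
                low += w * h
                high += w * l
        out_lo.append(low)
        out_hi.append(high)
    return out_lo, out_hi


def relu_bounds(lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def reconstruction_is_safe(x: Vector, epsilon: float, threshold: float,
                           encoder: Layer, decoder: Layer) -> bool:
    """Return True if every input within +/- epsilon of x provably reconstructs
    to within +/- threshold of x. False means "not proven", not "unsafe",
    because interval bounds over-approximate the reachable outputs."""
    lo = [v - epsilon for v in x]
    hi = [v + epsilon for v in x]
    lo, hi = relu_bounds(*affine_bounds(lo, hi, *encoder))
    lo, hi = affine_bounds(lo, hi, *decoder)
    return all(xi - threshold <= l and h <= xi + threshold
               for xi, l, h in zip(x, lo, hi))


# Toy autoencoder: the encoder averages two inputs, the decoder copies the code back.
toy_encoder = ([[0.5, 0.5]], [0.0])
toy_decoder = ([[1.0], [1.0]], [0.0, 0.0])
print(reconstruction_is_safe([1.0, 1.0], 0.1, 0.2, toy_encoder, toy_decoder))
```

Because the bounds are conservative, a tighter set representation can prove properties that this box-based check cannot.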
Feature robustness and sex differences in medical imaging: a case study in MRI-based Alzheimer's disease detection
Petersen, Eike, Feragen, Aasa, Zemsch, Maria Luise da Costa, Henriksen, Anders, Christensen, Oskar Eiler Wiese, Ganz, Melanie
Convolutional neural networks have enabled significant improvements in medical image-based diagnosis. It is, however, increasingly clear that these models are susceptible to performance degradation when facing spurious correlations and dataset shift, leading, e.g., to underperformance on underrepresented patient groups. In this paper, we compare two classification schemes on the ADNI MRI dataset: a simple logistic regression model using manually selected volumetric features, and a convolutional neural network trained on 3D MRI data. We assess the robustness of the trained models in the face of varying dataset splits, training set sex composition, and stage of disease. In contrast to earlier work in other imaging modalities, we do not observe a clear pattern of improved model performance for the majority group in the training dataset. Instead, while logistic regression is fully robust to dataset composition, we find that CNN performance is generally improved for both male and female subjects when including more female subjects in the training dataset. We hypothesize that this might be due to inherent differences in the pathology of the two sexes. Moreover, in our analysis, the logistic regression model outperforms the 3D CNN, emphasizing the utility of manual feature specification based on prior knowledge, and the need for more robust automatic feature selection.
Assortment Optimization with Customer Choice Modeling in a Crowdfunding Setting
Crowdfunding, the act of raising funds from the contributions of a large number of people, is among the most popular research topics in economic theory. Because crowdfunding platforms (CFPs) facilitate the process of raising funds by offering several features, we should take their existence and survival in the marketplace into account. In this study, we investigate the significant role of platform features in a customer behavioral choice model. In particular, we propose a multinomial logit model to describe the behavior of customers (backers) in a crowdfunding setting. We then discuss the revenue-sharing model used by these platforms and conclude that solving an assortment optimization problem can be of major importance for maximizing platform revenue. We were able to derive a reasonable amount of data in some cases and applied two well-known machine learning methods, multivariate regression and classification, to predict the best assortments the platform could offer to each arriving customer. We compare the results of these two methods and investigate how well they perform in all cases.
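To make the multinomial logit (MNL) choice model and the assortment problem concrete, here is a minimal sketch. It is not the authors' method, which predicts assortments with regression and classification. It uses the textbook MNL formula with a no-purchase option and a brute-force search over small assortments. The utilities, revenues, and function names are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, Iterable, Tuple


def choice_probabilities(utilities: Dict[str, float], assortment: Iterable[str],
                         outside_utility: float = 0.0) -> Dict[str, float]:
    """MNL probability of picking each offered item. The outside option
    (backing nothing) absorbs the remaining probability mass."""
    offered = list(assortment)
    denom = math.exp(outside_utility) + sum(math.exp(utilities[i]) for i in offered)
    return {i: math.exp(utilities[i]) / denom for i in offered}


def expected_revenue(utilities: Dict[str, float], revenues: Dict[str, float],
                     assortment: Iterable[str]) -> float:
    """Expected platform revenue per arriving customer for one assortment."""
    probs = choice_probabilities(utilities, assortment)
    return sum(revenues[i] * p for i, p in probs.items())


def best_assortment(utilities: Dict[str, float], revenues: Dict[str, float],
                    max_size: int) -> Tuple[str, ...]:
    """Exhaustive search over all assortments of up to max_size items.
    Only practical for a handful of items."""
    items = list(utilities)
    candidates = (c for k in range(1, max_size + 1) for c in combinations(items, k))
    return max(candidates, key=lambda c: expected_revenue(utilities, revenues, c))


# Hypothetical campaigns: equally attractive, but "a" earns the platform more.
utilities = {"a": 0.0, "b": 0.0}
revenues = {"a": 10.0, "b": 1.0}
print(best_assortment(utilities, revenues, max_size=2))
```

In this toy case, offering both items is worse than offering only the high-revenue one, because the low-revenue item draws customers away from it.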
Introduction to Logistic Regression
In this blog, we will discuss the basic concepts of logistic regression and what kinds of problems it can help us solve. Logistic regression is a classification algorithm used to assign observations to a discrete set of classes. Examples of classification problems include email (spam or not spam), online transactions (fraud or not fraud), and tumors (malignant or benign). Logistic regression transforms its output using the logistic sigmoid function to return a probability value. It is a machine learning algorithm used for classification problems; it is a predictive analysis algorithm based on the concept of probability.
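The sigmoid step described above can be sketched in a few lines. This is an illustrative example, not code from the blog: the weights, bias, and the 0.5 decision threshold are assumptions, and no training is shown.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Probability of the positive class: sigmoid of a linear score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(features: Sequence[float], weights: Sequence[float], bias: float,
                  threshold: float = 0.5) -> int:
    """Turn the probability into a discrete class (1 = positive, 0 = negative)."""
    return int(predict_proba(features, weights, bias) >= threshold)


# Hypothetical spam model with two features and hand-picked weights.
weights, bias = [1.5, -0.5], -1.0
print(predict_proba([2.0, 1.0], weights, bias))
print(predict_label([2.0, 1.0], weights, bias))
```

The threshold is a design choice: lowering it flags more observations as positive, which can matter when missing a fraud or a malignant tumor is costlier than a false alarm.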
Prediction of the motion of chest internal points using a recurrent neural network trained with real-time recurrent learning for latency compensation in lung cancer radiotherapy
Pohl, Michel, Uesaka, Mitsuru, Demachi, Kazuyuki, Chhatkuli, Ritu Bhusal
During the radiotherapy treatment of patients with lung cancer, the radiation delivered to healthy tissue around the tumor needs to be minimized, which is difficult because of respiratory motion and the latency of linear accelerator systems. In the proposed study, we first use the Lucas-Kanade pyramidal optical flow algorithm to perform deformable image registration of chest computed tomography scan images of four patients with lung cancer. We then track three internal points close to the lung tumor based on the previously computed deformation field and predict their position with a recurrent neural network (RNN) trained using real-time recurrent learning (RTRL) and gradient clipping. The breathing data is quite regular, sampled at approximately 2.5Hz, and includes artificial drift in the spine direction. The amplitude of the motion of the tracked points ranged from 12.0mm to 22.7mm. Finally, we propose a simple method for recovering and predicting 3D tumor images from the tracked points and the initial tumor image based on a linear correspondence model and Nadaraya-Watson non-linear regression. The root-mean-square error, maximum error, and jitter corresponding to the RNN prediction on the test set were smaller than the same performance measures obtained with linear prediction and least mean squares (LMS). In particular, the maximum prediction error associated with the RNN, equal to 1.51mm, is respectively 16.1% and 5.0% lower than the maximum error associated with linear prediction and LMS. The average prediction time per time step with RTRL is equal to 119ms, which is less than the 400ms marker position sampling time. The tumor position in the predicted images appears visually correct, which is confirmed by the high mean cross-correlation between the original and predicted images, equal to 0.955.
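For readers unfamiliar with Nadaraya-Watson regression, mentioned above, here is a minimal one-dimensional sketch. It is not the authors' implementation, which maps tracked point positions to 3D images. The Gaussian kernel, the scalar inputs and outputs, and the bandwidth value are assumptions for illustration.

```python
import math
from typing import Sequence


def nadaraya_watson(x_query: float, xs: Sequence[float], ys: Sequence[float],
                    bandwidth: float) -> float:
    """Kernel-weighted average of ys, with Gaussian weights centred on x_query.
    Training points near the query dominate the estimate."""
    weights = [math.exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: the query is too far from the data for this bandwidth.
        raise ValueError("query is too far from all training points")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical samples of a smooth signal.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.8, 0.9, 0.1]
print(nadaraya_watson(1.5, xs, ys, bandwidth=0.5))
```

A smaller bandwidth follows the training points more closely, while a larger one smooths the estimate toward the global mean.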
Estimation of Soft Robotic Bladder Compression for Smart Helmets using IR Range Finding and Hall Effect Magnetic Sensing
Pollard, Colin, Aston, Jonathan, Minor, Mark A.
This research focuses on soft robotic bladders that are used to monitor and control the interaction between a user's head and the shell of a Smart Helmet. Compression of these bladders determines impact dissipation; hence, the focus of this paper is the sensing and estimation of bladder compression. An IR rangefinder-based solution is evaluated using regression techniques as well as a Neural Network (NN) to estimate bladder compression. A Hall-Effect (HE) magnetic sensing system is also examined, in which HE sensors embedded in the base of the bladder sense the position of a magnet in the top of the bladder. The paper presents the HE sensor array, the signal processing of HE voltage data, and an NN for predicting bladder compression. The effect of different training data sets on NN performance is studied. Different NN configurations are examined to determine a configuration that provides accurate estimates with as few nodes as possible. Different bladder compression profiles are evaluated to characterize the IR range finding and HE-based techniques in application scenarios.
Logistic Regression in One Picture - DataScienceCentral.com
Logistic regression is regressing data to a line (i.e. [...]). This type of regression is a good choice when modeling binary variables, which happen frequently in real life (e.g. [...]). The logistic regression model is popular, in part, because it gives probabilities between 0 and 1. Let's say you were modeling a risk of credit default: values closer to 0 indicate a tiny risk, while values closer to 1 mean a very high risk. The following image shows an example of how one might tailor a logistic model for credit score based risk.
Primary Supervised Learning Algorithms Used in Machine Learning - KDnuggets
Supervised learning is a subset of machine learning in which a model is trained on labeled input data. As a result, the supervised model is capable of predicting further outcomes (outputs) as accurately as possible. The concept behind supervised learning can be explained with real-life scenarios, such as a teacher teaching a child a new topic for the first time. For simplicity, let's say that the teacher wants to teach the child to identify images of cats and dogs. The teacher starts by repeatedly showing the child images of either a cat or a dog, each time telling the child whether the image shows a dog or a cat.
Coronavirus disease situation analysis and prediction using machine learning: a study on Bangladeshi population
Nayan, Al-Akhir, Kijsirikul, Boonserm, Iwahori, Yuji
During a pandemic, early prognostication of infection rates can reduce deaths by ensuring treatment facilities and proper resource allocation. In recent months, the number of deaths and the infection rate in Bangladesh have increased more markedly than before. The country is struggling to provide moderate medical treatment to many patients. This study compares machine learning models and creates a prediction system to anticipate the infection and death rates for the coming days. A multi-layer perceptron (MLP) model was trained on a dataset covering March 1, 2020, to August 10, 2021. The data was collected from a trusted government website and prepared manually for training purposes. Several test cases determine the model's accuracy and prediction capability. The comparison between models indicates that the MLP model has more reliable prediction capability than the support vector regression (SVR) and linear regression models. The model presents a report about the risky situation and an impending coronavirus disease (COVID-19) attack. According to the prediction produced by the model, Bangladesh may suffer another COVID-19 attack, in which the number of infected cases may be between 929 and 2443 and the number of deaths between 19 and 57.
Long Term Fairness for Minority Groups via Performative Distributionally Robust Optimization
Peet-Pare, Liam, Hegde, Nidhi, Fyshe, Alona
Fairness researchers in machine learning (ML) have coalesced around several fairness criteria that provide formal definitions of what it means for an ML model to be fair. However, these criteria have some serious limitations. We identify four key shortcomings of these formal fairness criteria and aim to help address them by extending performative prediction to include a distributionally robust objective.