Performance Analysis
Target alignment in truncated kernel ridge regression
Amini, Arash A., Baumgartner, Richard, Feng, Dai
Kernel ridge regression (KRR) has recently attracted renewed interest due to its potential for explaining the transient effects, such as double descent, that emerge during neural network training. In this work, we study how the alignment between the target function and the kernel affects the performance of KRR. We focus on the truncated KRR (TKRR), which utilizes an additional parameter that controls the spectral truncation of the kernel matrix. We show that for polynomial alignment, there is an \emph{over-aligned} regime, in which TKRR can achieve a faster rate than what is achievable by full KRR. The rate of TKRR can improve all the way to the parametric rate, while that of full KRR is capped at a sub-optimal value. This shows that target alignment can be better leveraged by utilizing spectral truncation in kernel methods. We also consider the bandlimited alignment setting and show that the regularization surface of TKRR can exhibit transient effects including multiple descent and non-monotonic behavior. Our results show that there is a strong and quantifiable relation between the shape of the \emph{alignment spectrum} and the generalization performance of kernel methods, both in terms of rates and in finite samples.
Apple ML Researchers Develop 'Neo': A Visual Analytics System That Enables Machine Learning Practitioners To Generalize Confusion Matrix Visualization to Hierarchical and Multi-Output Labels
In Machine Learning (ML), model evaluation is the most challenging step. The confusion matrix is one of the globally utilized performance metrics to evaluate the model for classification tasks. It is also a visualization tool that many ML courses and researchers have used. Moreover, it is a table with two dimensions, i.e., actual class label and predicted class label. The actual class label is represented by a row, while a column in the confusion matrix represents the predicted class label.
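The row/column layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the Neo system; the function name `confusion_matrix` and the toy labels are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a nested list m where m[i][j] counts the samples whose
    actual class is labels[i] (row) and predicted class is labels[j] (column)."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical toy data, for illustration only.
actual = ["cat", "cat", "dog", "dog", "dog", "bird"]
predicted = ["cat", "dog", "dog", "dog", "cat", "bird"]
labels = ["bird", "cat", "dog"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>5}", row)
```

Correct predictions land on the diagonal, and each off-diagonal cell shows which actual class was mistaken for which predicted class.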
Local Evaluation of Time Series Anomaly Detection Algorithms
Huet, Alexis, Navarro, Jose Manuel, Rossi, Dario
In recent years, specific evaluation metrics for time series anomaly detection algorithms have been developed to handle the limitations of the classical precision and recall. However, such metrics are heuristically built as an aggregate of multiple desirable aspects, introduce parameters and wipe out the interpretability of the output. In this article, we first highlight the limitations of the classical precision/recall, as well as the main issues of the recent event-based metrics -- for instance, we show that an adversary algorithm can reach high precision and recall on almost any dataset under weak assumptions. To cope with the above problems, we propose a theoretically grounded, robust, parameter-free and interpretable extension to precision/recall metrics, based on the concept of ``affiliation'' between the ground truth and the prediction sets. Our metrics leverage measures of duration between ground truth and predictions, and thus have an intuitive interpretation. By further comparison against random sampling, we obtain a normalized precision/recall, quantifying how much a given set of results is better than a random baseline prediction. By construction, our approach keeps the evaluation local regarding ground truth events, enabling fine-grained visualization and interpretation of algorithmic results. We compare our proposal against various public time series anomaly detection datasets, algorithms and metrics. We further derive theoretical properties of the affiliation metrics that give explicit expectations about their behavior and ensure robustness against adversary strategies.
Adaptive Step Size Learning with Applications to Velocity Aided Inertial Navigation System
Autonomous underwater vehicles (AUV) are commonly used in many underwater applications. Recently, the usage of multi-rotor unmanned autonomous vehicles (UAV) for marine applications has been receiving more attention in the literature. Usually, both platforms employ an inertial navigation system (INS) and aiding sensors for an accurate navigation solution. In AUV navigation, a Doppler velocity log (DVL) is mainly used to aid the INS, while for UAVs, it is common to use global navigation satellite systems (GNSS) receivers. The fusion between the aiding sensor and the INS requires defining a step size parameter in the estimation process. This parameter determines the update frequency of the solution and, ultimately, its accuracy. The choice of the step size poses a tradeoff between computational load and navigation performance. Generally, the update frequency of the aiding sensors is much lower than the INS operating frequency (hundreds of Hertz). Such a high rate is unnecessary for most platforms, specifically for low-dynamics AUVs. In this work, a supervised machine learning based adaptive tuning scheme to select the proper INS step size is proposed. To that end, a velocity error bound is defined, allowing the INS/DVL or the INS/GNSS to operate in sub-optimal working conditions while minimizing the computational load. Results from simulations and a field experiment show the benefits of using the proposed approach. In addition, the proposed framework can be applied to any other fusion scenario between any type of sensor or platform.
Impact of Imputation Strategies on Fairness in Machine Learning
Caton, Simon (School of Computer Science, University College Dublin) | Malisetty, Saiteja (University of Nebraska at Omaha) | Haas, Christian (Department of Strategy and Innovation, Vienna University of Economics and Business (WU))
Research on Fairness and Bias Mitigation in Machine Learning often uses a set of reference datasets for the design and evaluation of novel approaches or definitions. While these datasets are well structured and useful for the comparison of various approaches, they do not reflect that datasets commonly used in real-world applications can have missing values. When such missing values are encountered, the use of imputation strategies is commonplace. However, as imputation strategies potentially alter the distribution of data, they can also affect the performance, and potentially the fairness, of the resulting predictions, a topic not yet well understood in the fairness literature. In this article, we investigate the impact of different imputation strategies on classical performance and fairness in classification settings. We find that the selected imputation strategy, along with other factors including the type of classification algorithm, can significantly affect performance and fairness outcomes. The results of our experiments indicate that the choice of imputation strategy is an important factor when considering fairness in Machine Learning. We also provide some insights and guidance for researchers to help navigate imputation approaches for fairness.
How to Evaluate Survival Analysis Models
Survival analysis encompasses a collection of statistical methods for describing time-to-event data. It originates from clinical studies, where physicians are mostly interested in assessing the effect of a new therapy on survival against a control group, or how certain features represent a risk of an adverse event in time. This post introduces the challenges related to survival analysis (censoring) and explains popular metrics to evaluate survival models, sharing practical Python examples along the way. Let us imagine we are clinical researchers. As we want to assess whether the new treatment has a significant effect in preventing an adverse event (such as death), we monitor the patients of both groups for a certain period of time. Some patients may not experience the event before the observation period ends, so we only know that their event time exceeds the time for which they were observed. This condition goes under the name of right censoring, and it is a common trait of survival analysis studies.
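One widely used metric for survival models under right censoring is Harrell's concordance index (C-index); whether the post covers this particular metric is an assumption here. The sketch below is a plain-Python illustration that ignores ties in observed times; the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (ties in time are ignored).

    times:       observed time per patient (event time or censoring time)
    events:      True if the event was observed, False if right-censored
    risk_scores: model output; a higher score should mean an earlier event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot serve as the earlier member of a pair,
            # because the true event time is unknown.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: the second patient is right-censored at t=4.
times = [2, 4, 6, 8]
events = [True, False, True, True]
risk_scores = [0.9, 0.2, 0.3, 0.5]

c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

A value of 1.0 means the model ranks every comparable pair correctly, while 0.5 corresponds to random ranking.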
Confusing metrics around the Confusion Matrix
"If you can't measure it, you can't possibly improve it". In the field of Machine Learning and Data Science, especially with statistical classification, a "Confusion Matrix" is often used to derive a bunch of metrics that can be examined to either improve the performance of a classifier model or to compare the performance of multiple models. Instead of starting from the mathematical formulae for the metrics, we will try to intuitively derive the formulae based on basic concepts. It is probably called "confusion" because it depicts how confused the classifier was while making its predictions -- some classes were correctly classified and some were not. The most important concept to understand before exploring any metric from the confusion matrix is the true meaning of the "positive" and the "negative" class in the context of the problem given to the classifier. The Positive class is the existence of what we are trying to detect or predict.
An Investigation on Non-Invasive Brain-Computer Interfaces: Emotiv Epoc+ Neuroheadset and Its Effectiveness
Faruk, Md Jobair Hossain, Valero, Maria, Shahriar, Hossain
In this study, we illustrate the progress of BCI research and present scores of unveiled contemporary approaches. First, we explore a natural speech decoding approach, introduced by Facebook Reality Lab and the University of California San Francisco, that is designed to decode human speech directly from the human brain onto a digital screen. Then, we study a recently presented visionary project to control the human brain using a Brain-Machine Interface (BMI) approach. We also investigate the well-known electroencephalography (EEG) based Emotiv Epoc+ Neuroheadset to identify six emotional parameters, including engagement, excitement, focus, stress, relaxation, and interest, using brain signals. We experiment with the neuroheadset on three human subjects and utilize two supervised learning classifiers, Naive Bayes and Linear Regression, to show the accuracy and competency of the Epoc+ device and its associated applications in neurotechnological research. We present experimental studies and the demonstration indicates 69% and 62% improved accuracy for the aforementioned classifiers respectively in reading the performance matrices of the participants. We envision that non-invasive, insertable, and low-cost BCI approaches shall be the focal point not only as an alternative for patients with physical paralysis but also for understanding the brain, which would pave the way for us to access and control memories and the brain in the very near future.
Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals
Bhattacharya, Debarpan, Dutta, Debottam, Sharma, Neeraj Kumar, Chetupalli, Srikanth Raj, Mote, Pravin, Ganapathy, Sriram, C, Chandrakiran, Nori, Sahiti, K, Suhail K, Gonuguntla, Sadhana, Alagesan, Murali
The COVID-19 outbreak resulted in multiple waves of infections that have been associated with different SARS-CoV-2 variants. Studies have reported a differential impact of the variants on the respiratory health of patients. We explore whether acoustic signals, collected from COVID-19 subjects, show computationally distinguishable acoustic patterns suggesting a possibility to predict the underlying virus variant. We analyze the Coswara dataset, which is collected from three subject pools, namely, i) healthy subjects, ii) COVID-19 subjects recorded during the delta variant dominant period, and iii) COVID-19 subjects recorded during the omicron surge. Our findings suggest that multiple sound categories, such as cough, breathing, and speech, indicate significant acoustic feature differences when comparing COVID-19 subjects with omicron and delta variants. The classification areas-under-the-curve are significantly above chance for differentiating subjects infected by omicron from those infected by delta. Using a score fusion from multiple sound categories, we obtained an area-under-the-curve of 89% and 52.4% sensitivity at 95% specificity. Additionally, a hierarchical three-class approach was used to classify the acoustic data into healthy and COVID-19 positive, and further COVID-19 subjects into delta and omicron variants, providing a high level of 3-class classification accuracy. These results suggest new ways for designing sound-based COVID-19 diagnosis approaches.
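A figure such as "sensitivity at 95% specificity" is an operating point on the ROC curve: the threshold is chosen so that at least 95% of negatives are correctly rejected, and sensitivity is read off at that threshold. The sketch below shows one way such a number could be computed; it is not the authors' pipeline, and the function name, scores and labels are hypothetical.

```python
def sensitivity_at_specificity(scores, labels, target_specificity=0.95):
    """Best sensitivity over all thresholds whose specificity meets the target.

    A sample is predicted positive when its score is >= the threshold.
    labels use 1 for the positive class and 0 for the negative class.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    best = 0.0  # a threshold above every score gives specificity 1, sensitivity 0
    for threshold in sorted(set(scores)):
        specificity = sum(s < threshold for s in negatives) / len(negatives)
        if specificity >= target_specificity:
            sensitivity = sum(s >= threshold for s in positives) / len(positives)
            best = max(best, sensitivity)
    return best


# Hypothetical fused classifier scores, for illustration only.
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

print(sensitivity_at_specificity(scores, labels, target_specificity=0.75))
print(sensitivity_at_specificity(scores, labels, target_specificity=1.0))
```

Tightening the specificity target pushes the threshold upward, which can only keep sensitivity the same or lower it.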
VGQ-CNN: Moving Beyond Fixed Cameras and Top-Grasps for Grasp Quality Prediction
Konrad, A., McDonald, J., Villing, R.
We present the Versatile Grasp Quality Convolutional Neural Network (VGQ-CNN), a grasp quality prediction network for 6-DOF grasps. VGQ-CNN can be used when evaluating grasps for objects seen from a wide range of camera poses or mobile robots without the need to retrain the network. By defining the grasp orientation explicitly as an input to the network, VGQ-CNN can evaluate 6-DOF grasp poses, moving beyond the 4-DOF grasps used in most image-based grasp evaluation methods like GQ-CNN. To train VGQ-CNN, we generate the new Versatile Grasp dataset (VG-dset) containing 6-DOF grasps observed from a wide range of camera poses. VGQ-CNN achieves a balanced accuracy of 82.1% on our test-split while generalising to a variety of camera poses. Meanwhile, it achieves competitive performance for overhead cameras and top-grasps with a balanced accuracy of 74.2% compared to GQ-CNN's 76.6%. We also propose a modified network architecture, FAST-VGQ-CNN, that speeds up inference using a shared encoder architecture and can make 128 grasp quality predictions in 12ms on a CPU. Code and data are available at https://aucoroboticsmu.github.io/vgq-cnn/.
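Balanced accuracy, the metric reported above, is commonly defined as the mean of the per-class recalls, which keeps a majority class from dominating the score. The sketch below assumes that standard definition and is not taken from the VGQ-CNN code; the function name and the toy grasp labels are hypothetical.

```python
def balanced_accuracy(actual, predicted):
    """Mean of per-class recall, computed over the classes present in `actual`."""
    recalls = []
    for cls in sorted(set(actual)):
        indices = [i for i, a in enumerate(actual) if a == cls]
        correct = sum(predicted[i] == cls for i in indices)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced toy labels: 1 = successful grasp, 0 = failed grasp.
actual = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

plain_accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
balanced = balanced_accuracy(actual, predicted)
print(plain_accuracy, balanced)
```

On this toy data the plain accuracy looks high because failed grasps dominate, while balanced accuracy exposes that half of the successful grasps were missed.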