Accuracy
Neo: Generalizing Confusion Matrix Visualization to Hierarchical and Multi-Output Labels
Görtler, Jochen, Hohman, Fred, Moritz, Dominik, Wongsuphasawat, Kanit, Ren, Donghao, Nair, Rahul, Kirchner, Marc, Patel, Kayur
Abstract: The confusion matrix, a ubiquitous visualization for helping people evaluate machine learning models, is a tabular layout that compares predicted class labels against actual class labels over all data instances. We conduct formative research with machine learning practitioners at a large technology company and find that conventional confusion matrices do not support more complex data structures found in modern-day applications, such as hierarchical and multi-output labels. To express such variations of confusion matrices, we design an algebra that models confusion matrices as probability distributions. We demonstrate Neo's utility with three case studies that help people better understand model performance and reveal hidden confusions. Machine learning is a complex, iterative design and development practice [4, 24], where the goal is to generate a learned model that generalizes to unseen data inputs. One critical step is model evaluation, testing and inspecting a model's performance on held-out test sets of data with known labels. A ubiquitous visualization used for model evaluation, particularly for classification models, is the confusion matrix: a tabular layout that compares a predicted class label against the actual class label for each class over all data instances. [...] predicted class labels (synonymously, these can be flipped via a matrix transpose). These visualizations are introduced in many machine learning courses and are simultaneously used in practice to show what pairs of classes a model confuses. Succinctly, confusion matrices are [...] Confusion matrices show a visual proxy for accuracy (e.g., entries on the diagonal of the matrix), which alone has been shown to be insufficient for many evaluations [39]. Furthermore, the diagonal of a confusion matrix often contains many more [...]
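As a rough illustration of the conventional (flat, single-label) confusion matrix described above, and not of Neo's algebra, the following sketch counts instances by actual and predicted class. The function name, the toy labels, and the choice of rows for actual classes are assumptions made for this example only.

```python
import numpy as np


def confusion_matrix(actual, predicted, n_classes):
    """Count instances: rows index the actual class, columns the predicted class."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        matrix[a, p] += 1
    return matrix


# Hypothetical toy labels for a 3-class problem.
actual = [0, 0, 1, 1, 2, 2, 2]
predicted = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(actual, predicted, n_classes=3)

# Accuracy is the share of instances on the diagonal (correct predictions).
accuracy = np.trace(cm) / cm.sum()

# Transposing swaps the roles of the two axes (actual vs. predicted).
flipped = cm.T

print(cm)
print(f"accuracy = {accuracy:.3f}")
```

Off-diagonal cells are where the class pairs a model confuses show up; the diagonal alone only summarizes accuracy.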
How to draw an ROC curve for a multi-class dataset?
Say I have a multi-class dataset and would like to draw its associated ROC curve for one of its classes (e.g. …). scikit-learn has a handy implementation that calculates the TPR and FPR, and another function that computes the AUC for you. You can apply these to your data by looping through the classes and treating each class on its own, with all other data counted as negative. The code below was inspired by the scikit-learn page on this topic itself. For this exercise, I will generate some synthetic sample data, and for the predictions I will create a vector drawn from a random uniform distribution.
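The one-vs-rest loop described above could look roughly like the sketch below. It is not the original answer's code: the helper name `one_vs_rest_roc`, the sample size, the three classes, and the random seed are assumptions for illustration. Only `roc_curve` and `auc` from `sklearn.metrics` are taken from scikit-learn.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve


def one_vs_rest_roc(y_true, y_score, classes):
    """Return {class: (fpr, tpr, auc)}, treating each class as positive in turn."""
    y_true = np.asarray(y_true)
    curves = {}
    for i, cls in enumerate(classes):
        # Binary target: 1 for the current class, 0 for all other data.
        is_positive = (y_true == cls).astype(int)
        fpr, tpr, _ = roc_curve(is_positive, y_score[:, i])
        curves[cls] = (fpr, tpr, auc(fpr, tpr))
    return curves


# Synthetic data (assumed sizes): random labels and uniform random scores,
# one score column per class.
rng = np.random.default_rng(0)
classes = [0, 1, 2]
y_true = rng.integers(0, len(classes), size=300)
y_score = rng.uniform(size=(300, len(classes)))

curves = one_vs_rest_roc(y_true, y_score, classes)
for cls, (fpr, tpr, area) in curves.items():
    print(f"class {cls}: AUC = {area:.3f}")
```

Plotting `fpr` against `tpr` for a chosen class with any plotting library gives that class's ROC curve. Because the scores here are random, the curves should stay close to the diagonal.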
Signal to Noise Ratio Loss Function
Ghobadzadeh, Ali, Lashkari, Amir
This work proposes a new loss function targeting classification problems, utilizing a source of information overlooked by cross entropy loss. First, we derive a series of the tightest upper and lower bounds for the probability of a random variable in a given interval. Second, a lower bound is proposed for the probability of a true positive for a parametric classification problem, where the form of probability density function (pdf) of data is given. A closed form for finding the optimal function of unknowns is derived to maximize the probability of true positives. Finally, for the case that the pdf of data is unknown, we apply the proposed boundaries to find the lower bound of the probability of true positives and upper bound of the probability of false positives and optimize them using a loss function which is given by combining the boundaries. We demonstrate that the resultant loss function is a function of the signal to noise ratio both within and across logits. We empirically evaluate our proposals to show their benefit for classification problems.
UC creates recommendations for responsible use of artificial intelligence
The University of California has issued recommendations that chart a path toward the responsible use of artificial intelligence in future UC endeavors. UC's increasing reliance on AI has raised its overall productivity as an institution, according to the UC Office of the President, or UCOP. However, implementing AI also creates the potential for problems. To address this, former UC President Janet Napolitano and current President Michael Drake created the Presidential Working Group on Artificial Intelligence, or the Working Group, in August 2020. According to the Working Group's final report, the group consists of 32 faculty and staff from all 10 UC campuses, along with additional representatives from UC Legal and the Office of Ethics, Compliance and Audit Services, among other groups.
Uncertainty aware anomaly detection to predict errant beam pulses in the SNS accelerator
Blokland, Willem, Ramuhalli, Pradeep, Peters, Charles, Yucesan, Yigit, Zhukov, Alexander, Schram, Malachi, Rajput, Kishansingh, Jeske, Torri
High-power particle accelerators are complex machines with thousands of pieces of equipment that are frequently running at the cutting edge of technology. In order to improve the day-to-day operations and maximize the delivery of the science, new analytical techniques are being explored for anomaly detection, classification, and prognostications. As such, we describe the application of an uncertainty aware Machine Learning method, the Siamese neural network model, to predict upcoming errant beam pulses using the data from a single monitoring device. By predicting the upcoming failure, we can stop the accelerator before damage occurs. We describe the accelerator operation, related Machine Learning research, the prediction performance required to abort beam while maintaining operations, the monitoring device and its data, and the Siamese method and its results. These results show that the researched method can be applied to improve accelerator operations.
Gaussian Graphical Model Selection for Huge Data via Minipatch Learning
Yao, Tianyi, Wang, Minjie, Allen, Genevera I.
Gaussian graphical models are essential unsupervised learning techniques to estimate conditional dependence relationships between sets of nodes. While graphical model selection is a well-studied problem with many popular techniques, there are typically three key practical challenges: i) many existing methods become computationally intractable in huge-data settings with tens of thousands of nodes; ii) the need for separate data-driven tuning hyperparameter selection procedures considerably adds to the computational burden; iii) the statistical accuracy of selected edges often deteriorates as the dimension and/or the complexity of the underlying graph structures increase. We tackle these problems by proposing the Minipatch Graph (MPGraph) estimator. Our approach builds upon insights from the latent variable graphical model problem and utilizes ensembles of thresholded graph estimators fit to tiny, random subsets of both the observations and the nodes, termed minipatches. As estimates are fit on small problems, our approach is computationally fast with integrated stability-based hyperparameter tuning. Additionally, we prove that under certain conditions our MPGraph algorithm achieves finite-sample graph selection consistency. We compare our approach to state-of-the-art computational approaches to Gaussian graphical model selection including the BigQUIC algorithm, and empirically demonstrate that our approach is not only more accurate but also extensively faster for huge graph selection problems.
PROVES: Establishing Image Provenance using Semantic Signatures
Xie, Mingyang, Kulshrestha, Manav, Wang, Shaojie, Yang, Jinghan, Chakrabarti, Ayan, Zhang, Ning, Vorobeychik, Yevgeniy
Modern AI tools, such as generative adversarial networks, have transformed our ability to create and modify visual data with photorealistic results. However, one of the deleterious side-effects of these advances is the emergence of nefarious uses in manipulating information in visual data, such as through the use of deep fakes. We propose a novel architecture for preserving the provenance of semantic information in images to make them less susceptible to deep fake attacks. Our architecture includes semantic signing and verification steps. We apply this architecture to verifying two types of semantic information: individual identities (faces) and whether the photo was taken indoors or outdoors. Verification accounts for a collection of common image transformations, such as translation, scaling, cropping, and small rotations, and rejects adversarial transformations, such as adversarially perturbed or, in the case of face verification, swapped faces. Experiments demonstrate that in the case of provenance of faces in an image, our approach is robust to black-box adversarial transformations (which are rejected) as well as benign transformations (which are accepted), with few false negatives and false positives. Background verification, on the other hand, is susceptible to black-box adversarial examples, but becomes significantly more robust after adversarial training.
Generalized Out-of-Distribution Detection: A Survey
Yang, Jingkang, Zhou, Kaiyang, Li, Yixuan, Liu, Ziwei
Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems. For instance, in autonomous driving, we would like the driving system to issue an alert and hand over the control to humans when it detects unusual scenes or objects that it has never seen before and cannot make a safe decision. This problem first emerged in 2017 and since then has received increasing attention from the research community, leading to a plethora of methods developed, ranging from classification-based to density-based to distance-based ones. Meanwhile, several other problems are closely related to OOD detection in terms of motivation and methodology. These include anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD). Despite having different definitions and problem settings, these problems often confuse readers and practitioners, and as a result, some existing studies misuse terms. In this survey, we first present a generic framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD. Under our framework, these five problems can be seen as special cases or sub-tasks, and are easier to distinguish. Then, we conduct a thorough review of each of the five areas by summarizing their recent technical developments. We conclude this survey with open challenges and potential research directions.
RoQNN: Noise-Aware Training for Robust Quantum Neural Networks
Wang, Hanrui, Gu, Jiaqi, Ding, Yongshan, Li, Zirui, Chong, Frederic T., Pan, David Z., Han, Song
Quantum Neural Network (QNN) is a promising application towards quantum advantage on near-term quantum hardware. However, due to the large quantum noises (errors), the performance of QNN models degrades severely on real quantum devices. For example, the accuracy gap between noise-free simulation and noisy results on IBMQ-Yorktown for MNIST-4 classification is over 60%. Existing noise mitigation methods are general ones without leveraging unique characteristics of QNN and are only applicable to inference; on the other hand, existing QNN work does not consider noise effect. To this end, we present RoQNN, a QNN-specific framework to perform noise-aware optimizations in both training and inference stages to improve robustness. We analytically deduce and experimentally observe that the effect of quantum noise on the QNN measurement outcome is a linear map from the noise-free outcome with a scaling and a shift factor. Motivated by that, we propose post-measurement normalization to mitigate the feature distribution differences between noise-free and noisy scenarios. Furthermore, to improve the robustness against noise, we propose noise injection into the training process by inserting quantum error gates into the QNN according to realistic noise models of quantum hardware. Finally, post-measurement quantization is introduced to quantize the measurement outcomes to discrete values, achieving the denoising effect. Extensive experiments on 8 classification tasks using 6 quantum devices demonstrate that RoQNN improves accuracy by up to 43%, and achieves over 94% 2-class, 80% 4-class, and 34% 10-class MNIST classification accuracy measured on real quantum computers. Quantum Computing (QC) is a new computational paradigm that can be exponentially faster than classical counterparts in various domains such as cryptography (Shor, 1999), database search (Grover, 1996), and chemistry (Kandala et al., 2017; Peruzzo et al., 2014; Cao et al., 2019).
Quantum Machine Learning (QML) aims to leverage QC techniques to solve machine learning tasks and achieve much higher efficiency. [Figure, right panel: Due to the errors, QNN models suffer from severe accuracy drops. Different devices have various error magnitudes, leading to distinct accuracy.]
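The abstract's observation that noise acts on measurement outcomes as a scaling plus a shift suggests why normalization helps. The sketch below is an illustration under stated assumptions, not RoQNN's implementation: it assumes normalization means standardizing each measured feature across a batch, and the scale and shift values are hypothetical, positive, per-feature constants.

```python
import numpy as np


def normalize(outcomes, eps=1e-8):
    """Standardize each measured feature (column) across the batch (assumed form)."""
    mean = outcomes.mean(axis=0, keepdims=True)
    std = outcomes.std(axis=0, keepdims=True)
    return (outcomes - mean) / (std + eps)


rng = np.random.default_rng(0)

# Stand-in for noise-free measurement outcomes: 64 samples, 4 measured features.
clean = rng.uniform(-1.0, 1.0, size=(64, 4))

# Hypothetical noise model: a positive scaling and a shift per feature.
scale = np.array([0.6, 0.4, 0.8, 0.5])
shift = np.array([0.05, -0.10, 0.02, 0.0])
noisy = scale * clean + shift

# Standardization cancels the per-feature scaling and shift.
print(np.allclose(normalize(noisy), normalize(clean), atol=1e-6))
```

Under this idealized model the normalized noisy and noise-free features coincide. Real device noise also varies between samples and runs, which a fixed linear map does not capture.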
Power Transformer Fault Diagnosis with Intrinsic Time-scale Decomposition and XGBoost Classifier
Sami, Shoaib Meraj, Bhuiyan, Mohammed Imamul Hassan
An intrinsic time-scale decomposition (ITD) based method for power transformer fault diagnosis is proposed. Dissolved gas analysis (DGA) parameters are ranked according to their skewness, and then ITD-based feature extraction is performed. An optimal set of proper rotation component (PRC) features is determined by an XGBoost classifier. For classification, an XGBoost classifier is applied to the optimal PRC feature set. The proposed method's classification performance is studied using publicly available DGA data of 376 power transformers. The proposed method achieves more than 95% accuracy and high sensitivity and F1-score, better than conventional methods and some recent machine learning-based fault diagnosis approaches. Moreover, it gives better Cohen's kappa and F1-score than the recently introduced EMD-based hierarchical technique for fault diagnosis in power transformers.