Performance Analysis
AUC-ROC Curve in Machine Learning Clearly Explained - Analytics Vidhya
You've built your machine learning model – so what's next? You need to evaluate it and validate how good (or bad) it is, so you can then decide whether to implement it. That's where the AUC-ROC curve comes in. The name might be a mouthful, but it is just saying that we are calculating the "Area Under the Curve" (AUC) of the "Receiver Operating Characteristic" (ROC). I have been in your shoes.
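As a rough illustration of what "area under the ROC curve" means in practice, the sketch below sweeps a decision threshold over classifier scores, records the resulting (false positive rate, true positive rate) points, and integrates them with the trapezoid rule. The function names `roc_points` and `auc` and the toy data are assumptions made for this example; they are not taken from the article.

```python
def roc_points(labels, scores):
    """Return ROC points (fpr, tpr) from sweeping a threshold over the scores.

    labels: iterable of 0/1 ground-truth labels.
    scores: iterable of classifier scores (higher means "more positive").
    """
    labels = list(labels)
    scores = list(scores)
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Visit examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Integrate the ROC points with the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    toy_labels = [0, 0, 1, 1]            # hypothetical ground truth
    toy_scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical model scores
    print(auc(roc_points(toy_labels, toy_scores)))
```

A classifier that ranks every positive above every negative reaches an area of 1.0, while one that gives every example the same score lands at 0.5.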
Automatic Personality Prediction; an Enhanced Method Using Ensemble Modeling
Ramezani, Majid, Feizi-Derakhshi, Mohammad-Reza, Balafar, Mohammad-Ali, Asgari-Chenaghlu, Meysam, Feizi-Derakhshi, Ali-Reza, Nikzad-Khasmakhi, Narjes, Ranjbar-Khadivi, Mehrdad, Jahanbakhsh-Nagadeh, Zoleikha, Zafarani-Moattar, Elnaz, Rahkar-Farshi, Taymaz
Human personality is significantly reflected in the words a person uses in speech or writing. As a consequence of the spread of information infrastructure (specifically the Internet and social media), human communication has shifted notably away from face-to-face communication. Generally, Automatic Personality Prediction (or Perception) (APP) is the automated forecasting of personality from different types of human generated/exchanged content (like text, speech, image, video, etc.). The major objective of this study is to enhance the accuracy of APP from text. To this end, we suggest five new APP methods: term frequency vector-based, ontology-based, enriched ontology-based, latent semantic analysis (LSA)-based, and deep learning-based (BiLSTM) methods. These methods, as the base ones, are combined to enhance APP accuracy through ensemble modeling (stacking), with a hierarchical attention network (HAN) as the meta-model. The results show that ensemble modeling enhances the accuracy of APP.
Predictive Value Generalization Bounds
Vemuri, Keshav, Srebro, Nathan
In this paper, we study a bi-criterion framework for assessing scoring functions in the context of binary classification. The positive and negative predictive values (ppv and npv, respectively) are conditional probabilities of the true label matching a classifier's predicted label. The usual classification error rate is a linear combination of these probabilities, and therefore, concentration inequalities for the error rate do not yield confidence intervals for the two separate predictive values. We study generalization properties of scoring functions with respect to predictive values by deriving new distribution-free large deviation and uniform convergence bounds. The latter bound is stated in terms of a measure of function class complexity that we call the order coefficient; we relate this combinatorial quantity to the VC-subgraph dimension.
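To make the relationship between the two predictive values and the error rate concrete, here is a minimal sketch that computes ppv, npv, and the error rate from hard 0/1 predictions and checks the identity error = 1 - (q * ppv + (1 - q) * npv), where q is the fraction of examples predicted positive. The function name `predictive_values`, the symbol q, and the toy data are assumptions for this illustration and do not come from the paper.

```python
def predictive_values(labels, predictions):
    """Compute ppv, npv, error rate and positive-prediction rate.

    labels, predictions: equal-length iterables of 0/1 values.
    ppv = P(label = 1 | prediction = 1), npv = P(label = 0 | prediction = 0).
    """
    tp = fp = tn = fn = 0
    for y, p in zip(labels, predictions):
        if p == 1 and y == 1:
            tp += 1
        elif p == 1 and y == 0:
            fp += 1
        elif p == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    n = tp + fp + tn + fn
    if n == 0:
        raise ValueError("no examples given")

    return {
        # A predictive value is undefined when the classifier never
        # makes the corresponding prediction.
        "ppv": tp / (tp + fp) if (tp + fp) else None,
        "npv": tn / (tn + fn) if (tn + fn) else None,
        "error": (fp + fn) / n,
        "q": (tp + fp) / n,  # fraction of examples predicted positive
    }


if __name__ == "__main__":
    toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]       # hypothetical ground truth
    toy_predictions = [1, 1, 0, 1, 0, 0, 0, 0]  # hypothetical classifier output
    stats = predictive_values(toy_labels, toy_predictions)
    q = stats["q"]
    recombined = 1 - (q * stats["ppv"] + (1 - q) * stats["npv"])
    print(stats, recombined)
```

Because the error rate only pins down this weighted combination, the same error rate can correspond to many different (ppv, npv) pairs, which is consistent with the abstract's point that a bound on the error rate alone does not bound each predictive value separately.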
Inferring proximity from Bluetooth Low Energy RSSI with Unscented Kalman Smoothers
Lovett, Tom, Briers, Mark, Charalambides, Marcos, Jersakova, Radka, Lomax, James, Holmes, Chris
The Covid-19 pandemic has resulted in a variety of approaches for managing infection outbreaks in international populations. One example is mobile phone applications, which attempt to alert infected individuals and their contacts by automatically inferring two key components of infection risk: the proximity to an individual who may be infected, and the duration of proximity. The former component, proximity, relies on Bluetooth Low Energy (BLE) Received Signal Strength Indicator (RSSI) as a distance sensor, and this has been shown to be problematic, not least because of unpredictable variations caused by different device types, device location on-body, device orientation, the local environment and the general noise associated with radio frequency propagation. In this paper, we present an approach that infers posterior probabilities over distance given sequences of RSSI values. Using a single-dimensional Unscented Kalman Smoother (UKS) for non-linear state space modelling, we outline several Gaussian process observation transforms, including a generative model that directly captures sources of variation, and a discriminative model that learns a suitable observation function from training data using both distance and infection risk as optimisation objective functions. Our results show that good risk prediction can be achieved in $\mathcal{O}(n)$ time on real-world data sets, with the UKS outperforming more traditional classification methods learned from the same training data.
Federated Learning of User Authentication Models
Hosseini, Hossein, Yun, Sungrack, Park, Hyunsin, Louizos, Christos, Soriaga, Joseph, Welling, Max
Machine learning-based User Authentication (UA) models have been widely deployed in smart devices. UA models are trained to map input data of different users to highly separable embedding vectors, which are then used to accept or reject new inputs at test time. Training UA models requires having direct access to the raw inputs and embedding vectors of users, both of which are privacy-sensitive information. In this paper, we propose Federated User Authentication (FedUA), a framework for privacy-preserving training of UA models. FedUA adopts the federated learning framework to enable a group of users to jointly train a model without sharing the raw inputs. It also allows users to generate their embeddings as random binary vectors, so that, unlike the existing approach in which the server constructs the spread-out embeddings, the embedding vectors are kept private as well. We show that our method is privacy-preserving, scales with the number of users, and allows new users to be added to training without changing the output layer. Our experimental results on the VoxCeleb dataset for speaker verification show that our method reliably rejects data of unseen users at very high true positive rates.
The Data Science ABCs: A Whirlwind Tour of the Field
The Area Under Curve metric represents the probability that, in binary classification, a classifier will assign a higher confidence of being positive to a randomly chosen positive example than to a randomly chosen negative example. It is found on a ROC (Receiver Operating Characteristic) curve, which plots the true positive rate against the false positive rate. Batch Normalization is a layer commonly used in state-of-the-art neural networks. It takes inputs from the previous layer and normalizes them by removing the mean and rescaling by the standard deviation.
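The normalization step described for Batch Normalization can be sketched in a few lines: for each feature, subtract the mean over the batch and divide by the standard deviation over the batch. The function name `batch_normalize` and the small constant `eps` (added for numerical stability) are assumptions of this example; layers in real frameworks typically also apply a learnable scale and shift and track running statistics for inference, which this sketch omits.

```python
import math


def batch_normalize(batch, eps=1e-5):
    """Normalize each feature of a batch to roughly zero mean and unit variance.

    batch: list of rows, each row a list of floats of the same length.
    eps:   small constant (an assumption here) that avoids division by zero.
    """
    n = len(batch)
    dims = len(batch[0])

    # Per-feature mean over the batch.
    means = [sum(row[j] for row in batch) / n for j in range(dims)]

    # Per-feature (biased) variance over the batch.
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dims)
    ]

    # Remove the mean and rescale by the standard deviation.
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dims)]
        for row in batch
    ]


if __name__ == "__main__":
    toy_batch = [[1.0, 10.0], [3.0, 30.0]]  # hypothetical layer inputs
    print(batch_normalize(toy_batch))
```

Note that the two features in the toy batch differ in scale by a factor of ten, yet both end up close to -1 and +1 after normalization.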
Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings
Test set accuracies for entity semantic type (STY) and semantic group (SG) classification are reported in Table 3. In accordance with the visualizations of semantic clusters (Figures 1 and 2), the KGE and NE methods perform significantly better than the corpus-based method (Cui2Vec). Notably, TransE and RotatE attain near-perfect accuracy for the broader semantic group classification (4 classes). ComplEx, DistMult, and SimplE perform slightly worse, Snomed2Vec slightly below them, and Cui2Vec falls behind by a significant margin. We see a greater discrepancy in relative performance by model type in semantic type classification (32 classes), in which more fine-grained semantic information is required.
Delta Schema Network in Model-based Reinforcement Learning
Gorodetskiy, Andrey, Shlychkova, Alexandra, Panov, Aleksandr I.
This work is devoted to an unresolved problem of Artificial General Intelligence: the inefficiency of transfer learning. One of the mechanisms used to address this problem in reinforcement learning is the model-based approach. In this paper we extend the schema networks method, which makes it possible to extract the logical relationships between objects and actions from environment data. We present algorithms for training a Delta Schema Network (DSN), predicting future states of the environment, and planning actions that will lead to positive reward. DSN shows strong transfer learning performance on the classic Atari game environment.
Are Ensemble Classifiers Powerful Enough for the Detection and Diagnosis of Intermediate-Severity Faults?
Jin, Baihong, Tan, Yingshui, Chen, Yuxin, Poolla, Kameshwar, Vincentelli, Alberto Sangiovanni
Intermediate-Severity (IS) faults present milder symptoms than severe faults, and are more difficult to detect and diagnose due to their close resemblance to normal operating conditions. The lack of IS fault examples in the training data can pose severe risks to Fault Detection and Diagnosis (FDD) methods that are built upon Machine Learning (ML) techniques, because these faults can be easily mistaken for normal operating conditions. Ensemble models are widely applied in ML and are considered promising methods for detecting out-of-distribution (OOD) data. We identify common pitfalls in these models through extensive experiments with several popular ensemble models on two real-world datasets. Then, we discuss how to design more effective ensemble models for detecting and diagnosing IS faults.
Transparency Tools for Fairness in AI (Luskin)
Chen, Mingliang, Shahverdi, Aria, Anderson, Sarah, Park, Se Yong, Zhang, Justin, Dachman-Soled, Dana, Lauter, Kristin, Wu, Min
We propose new tools for policy-makers to use when assessing and correcting fairness and bias in AI algorithms. The three tools are:
- A new definition of fairness called "controlled fairness" with respect to choices of protected features and filters. The definition provides a simple test of fairness of an algorithm with respect to a dataset. This notion of fairness is suitable in cases where fairness is prioritized over accuracy, such as in cases where there is no "ground truth" data, only data labeled with past decisions (which may have been biased).
- Algorithms for retraining a given classifier to achieve "controlled fairness" with respect to a choice of features and filters. Two algorithms are presented, implemented and tested. These algorithms require training two different models in two stages. We experiment with combinations of various types of models for the first and second stage and report on which combinations perform best in terms of fairness and accuracy.
- Algorithms for adjusting model parameters to achieve a notion of fairness called "classification parity". This notion of fairness is suitable in cases where accuracy is prioritized. Two algorithms are presented, one which assumes that protected features are accessible to the model during testing, and one which assumes protected features are not accessible during testing.
We evaluate our tools on three different publicly available datasets. We find that the tools are useful for understanding various dimensions of bias, and that in practice the algorithms are effective in starkly reducing a given observed bias when tested on new data.