Accuracy
Federated Learning of User Authentication Models
Hosseini, Hossein, Yun, Sungrack, Park, Hyunsin, Louizos, Christos, Soriaga, Joseph, Welling, Max
Machine learning-based User Authentication (UA) models have been widely deployed in smart devices. UA models are trained to map input data of different users to highly separable embedding vectors, which are then used to accept or reject new inputs at test time. Training UA models requires direct access to the raw inputs and embedding vectors of users, both of which are privacy-sensitive. In this paper, we propose Federated User Authentication (FedUA), a framework for privacy-preserving training of UA models. FedUA adopts the federated learning framework to enable a group of users to jointly train a model without sharing their raw inputs. It also allows users to generate their embeddings as random binary vectors, so that, unlike the existing approach in which the server constructs spread-out embeddings, the embedding vectors are kept private as well. We show that our method is privacy-preserving, scales with the number of users, and allows new users to be added to training without changing the output layer. Our experimental results on the VoxCeleb dataset for speaker verification show that our method reliably rejects data of unseen users at very high true positive rates.
The Data Science ABCs: A Whirlwind Tour of the Field
The Area Under Curve (AUC) metric represents the probability that, in binary classification, a classifier will assign a higher score to a randomly chosen positive example than to a randomly chosen negative example. It is computed from a ROC (Receiver Operating Characteristic) curve, which plots the true positive rate against the false positive rate. Batch Normalization is a layer commonly used in state-of-the-art neural networks. It takes inputs from the previous layer and normalizes them by subtracting the mean and dividing by the standard deviation.
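As a rough illustration of the two definitions above, the following Python sketch computes AUC directly from its pairwise-probability interpretation and normalizes a batch of values. The function names pairwise_auc and batch_normalize, the tie handling (ties count as half), and the eps constant are assumptions made for this example, not part of the original glossary. Batch normalization layers in practice usually also apply a learned scale and shift, which are omitted here.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score. Ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


def batch_normalize(values, eps=1e-5):
    """Subtract the batch mean and divide by the batch standard deviation.
    eps (an assumed small constant) guards against division by zero."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = (var + eps) ** 0.5
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    # Hypothetical classifier scores and binary labels.
    print(pairwise_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
    print(batch_normalize([1.0, 2.0, 3.0]))
```

The pairwise loop is quadratic in the number of examples, so it is only suitable for small illustrations; library implementations typically compute the same quantity from sorted scores.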
Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings
Test set accuracies for entity semantic type (STY) and semantic group (SG) classification are reported in Table 3. In accordance with the visualizations of semantic clusters (Figures 1 and 2), the KGE and NE methods perform significantly better than the corpus-based method (Cui2Vec). Notably, TransE and RotatE attain near-perfect accuracy for the broader semantic group classification (4 classes). ComplEx, DistMult, and SimplE perform slightly worse, Snomed2Vec is slightly below them, and Cui2Vec falls behind by a significant margin. We see a greater discrepancy in relative performance by model type in semantic type classification (32 classes), in which more fine-grained semantic information is required.
Delta Schema Network in Model-based Reinforcement Learning
Gorodetskiy, Andrey, Shlychkova, Alexandra, Panov, Aleksandr I.
This work is devoted to an unresolved problem of Artificial General Intelligence: the inefficiency of transfer learning. One of the mechanisms used to address this problem in reinforcement learning is the model-based approach. In this paper we extend the schema networks method, which makes it possible to extract logical relationships between objects and actions from environment data. We present algorithms for training a Delta Schema Network (DSN), predicting future states of the environment, and planning actions that will lead to positive reward. DSN shows strong transfer learning performance on a classic Atari game environment.
Are Ensemble Classifiers Powerful Enough for the Detection and Diagnosis of Intermediate-Severity Faults?
Jin, Baihong, Tan, Yingshui, Chen, Yuxin, Poolla, Kameshwar, Vincentelli, Alberto Sangiovanni
Intermediate-Severity (IS) faults present milder symptoms than severe faults and are more difficult to detect and diagnose because they closely resemble normal operating conditions. The lack of IS fault examples in the training data can pose severe risks to Fault Detection and Diagnosis (FDD) methods that are built upon Machine Learning (ML) techniques, because these faults can easily be mistaken for normal operating conditions. Ensemble models are widely applied in ML and are considered promising methods for detecting out-of-distribution (OOD) data. We identify common pitfalls in these models through extensive experiments with several popular ensemble models on two real-world datasets. Then, we discuss how to design more effective ensemble models for detecting and diagnosing IS faults.
Transparency Tools for Fairness in AI (Luskin)
Chen, Mingliang, Shahverdi, Aria, Anderson, Sarah, Park, Se Yong, Zhang, Justin, Dachman-Soled, Dana, Lauter, Kristin, Wu, Min
We propose new tools for policy-makers to use when assessing and correcting fairness and bias in AI algorithms. The three tools are:
- A new definition of fairness called "controlled fairness" with respect to choices of protected features and filters. The definition provides a simple test of fairness of an algorithm with respect to a dataset. This notion of fairness is suitable in cases where fairness is prioritized over accuracy, such as in cases where there is no "ground truth" data, only data labeled with past decisions (which may have been biased).
- Algorithms for retraining a given classifier to achieve "controlled fairness" with respect to a choice of features and filters. Two algorithms are presented, implemented and tested. These algorithms require training two different models in two stages. We experiment with combinations of various types of models for the first and second stage and report on which combinations perform best in terms of fairness and accuracy.
- Algorithms for adjusting model parameters to achieve a notion of fairness called "classification parity". This notion of fairness is suitable in cases where accuracy is prioritized. Two algorithms are presented, one which assumes that protected features are accessible to the model during testing, and one which assumes protected features are not accessible during testing.
We evaluate our tools on three different publicly available datasets. We find that the tools are useful for understanding various dimensions of bias, and that in practice the algorithms are effective in starkly reducing a given observed bias when tested on new data.
Predicting the Accuracy of a Few-Shot Classifier
Bontonou, Myriam, Béthune, Louis, Gripon, Vincent
In the context of few-shot learning, one cannot measure the generalization ability of a trained classifier using validation sets, due to the small number of labeled samples. In this paper, we are interested in finding alternatives to answer the question: is my classifier generalizing well to previously unseen data? We first analyze the reasons for the variability of generalization performances. We then investigate the case of using transfer-based solutions, and consider three settings: i) supervised where we only have access to a few labeled samples, ii) semi-supervised where we have access to both a few labeled samples and a set of unlabeled samples and iii) unsupervised where we only have access to unlabeled samples. For each setting, we propose reasonable measures that we empirically demonstrate to be correlated with the generalization ability of considered classifiers. We also show that these simple measures can be used to predict generalization up to a certain confidence. We conduct our experiments on standard few-shot vision datasets.
Supervised machine learning techniques for data matching based on similarity metrics
Verschuuren, Pim, Palazzo, Serena, Powell, Tom, Sutton, Steve, Pilgrim, Alfred, Giannelli, Michele Faucci
Businesses, governmental bodies and NGOs have an ever-increasing amount of data at their disposal from which they try to extract valuable information. Often, this needs to be done not only accurately but also within a short time frame. Clean and consistent data is therefore crucial. Data matching is the field that tries to identify instances in data that refer to the same real-world entity. In this study, machine learning techniques are combined with string similarity functions and applied to data matching. A dataset of invoices from a variety of businesses and organizations was preprocessed with a grouping scheme to reduce pair dimensionality, and a set of similarity functions was used to quantify similarity between invoice pairs. The resulting invoice pair dataset was then used to train and validate a neural network and a boosted decision tree. The performance was compared with a solution from FISCAL Technologies as a benchmark against currently available deduplication solutions. Both the neural network and the boosted decision tree showed equal or better performance.
Learner's World
Continuing my previous posts on performance measures for classifiers, here I explain a single-score measure, namely the 'F-score'. In my previous posts, I discussed four fundamental numbers, namely true positive, true negative, false positive and false negative, and eight basic ratios, namely sensitivity (or recall or true positive rate) & specificity (or true negative rate), false positive rate (or type-I error) & false negative rate (or type-II error), positive predictive value (or precision) & negative predictive value, and false discovery rate (or q-value) & false omission rate. I also discussed the accuracy paradox, the relationships between the basic ratios, and their trade-offs in evaluating the performance of a classifier, with examples. I'll be using the same confusion matrix for reference. Precision & Recall: First, let's briefly revisit 'Precision (PPV) & Recall (sensitivity)'.
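To make the relationship between precision, recall and the F-score concrete, here is a minimal Python sketch using the standard definitions. The confusion matrix from the earlier posts is not reproduced here, so the counts in the usage example (tp=8, fp=2, fn=8) are hypothetical, as are the function name precision_recall_f and the choice to return 0.0 when a denominator is zero.

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F-beta) from confusion-matrix counts.

    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    F-beta    = (1 + beta^2) * P * R / (beta^2 * P + R)
    Undefined ratios (zero denominators) are reported as 0.0 here,
    which is a convention assumed for this sketch.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


if __name__ == "__main__":
    # Hypothetical counts, not the confusion matrix from the earlier posts.
    p, r, f1 = precision_recall_f(tp=8, fp=2, fn=8)
    print(p, r, f1)
```

With beta=1 the score is the harmonic mean of precision and recall; a beta above 1 weights recall more heavily, and a beta below 1 weights precision more heavily.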
Determining Sequence of Image Processing Technique (IPT) to Detect Adversarial Attacks
Gupta, Kishor Datta, Akhtar, Zahid, Dasgupta, Dipankar
Securing machine learning models against adversarial examples is challenging, as new methods for generating adversarial attacks are continually being developed. In this work, we propose an evolutionary approach to automatically determine an Image Processing Techniques Sequence (IPTS) for detecting malicious inputs. We first used a diverse set of attack methods, including adaptive attack methods (on our defense), to generate adversarial samples from the clean dataset. A detection framework based on a genetic algorithm (GA) is developed to find the optimal IPTS, where optimality is estimated by different fitness measures such as Euclidean distance, entropy loss, average histogram, local binary pattern and loss functions. The "image difference" between the original and processed images is used to extract features, which are then fed to a classification scheme to determine whether the input sample is adversarial or clean. This paper describes our methodology and reports experiments on multiple datasets tested with several adversarial attacks. For each attack type and dataset, the approach generates a unique IPTS. A set of IPTS is selected dynamically at test time and works as a filter for the adversarial attack. Our empirical experiments show promising results, indicating that the approach can efficiently be used as processing for any AI model.