Accuracy
Random Actions vs Random Policies: Bootstrapping Model-Based Direct Policy Search
Hanna, Elias, Coninx, Alex, Doncieux, Stéphane
This paper studies the impact of the initial data gathering method on the subsequent learning of a dynamics model. Dynamics models approximate the true transition function of a given task, in order to perform policy search directly on the model rather than on the costly real system. This study aims to determine how to bootstrap a model as efficiently as possible, by comparing initialization methods employed in two different policy search frameworks in the literature. The study focuses on the model performance under the episode-based framework of Evolutionary methods using probabilistic ensembles. Experimental results show that various task-dependant factors can be detrimental to each method, suggesting to explore hybrid approaches.
Prediction of drug effectiveness in rheumatoid arthritis patients based on machine learning algorithms
Chen, Shengjia, Gupta, Nikunj, Galbraith, Woodward B., Shah, Valay, Cirrone, Jacopo
Rheumatoid arthritis (RA) is an autoimmune condition caused when patients' immune system mistakenly targets their own tissue. Machine learning (ML) has the potential to identify patterns in patient electronic health records (EHR) to forecast the best clinical treatment to improve patient outcomes. This study introduced a Drug Response Prediction (DRP) framework with two main goals: 1) design a data processing pipeline to extract information from tabular clinical data, and then preprocess it for functional use, and 2) predict RA patient's responses to drugs and evaluate classification models' performance. We propose a novel two-stage ML framework based on European Alliance of Associations for Rheumatology (EULAR) criteria cutoffs to model drug effectiveness. Our model Stacked-Ensemble DRP was developed and cross-validated using data from 425 RA patients. The evaluation used a subset of 124 patients (30%) from the same data source. In the evaluation of the test set, two-stage DRP leads to improved classification accuracy over other end-to-end classification models for binary classification. Our proposed method provides a complete pipeline to predict disease activity scores and identify the group that does not respond well to anti-TNF treatments, thus showing promise in supporting clinical decisions based on EHR information.
A GA-like Dynamic Probability Method With Mutual Information for Feature Selection
Wang, Gaoshuai, Lauri, Fabrice, Hassani, Amir Hajjam El
Feature selection plays a vital role in promoting the classifier's performance. However, current methods ineffectively distinguish the complex interaction in the selected features. To further remove these hidden negative interactions, we propose a GA-like dynamic probability (GADP) method with mutual information which has a two-layer structure. The first layer applies the mutual information method to obtain a primary feature subset. The GA-like dynamic probability algorithm, as the second layer, mines more supportive features based on the former candidate features. Essentially, the GA-like method is one of the population-based algorithms so its work mechanism is similar to the GA. Different from the popular works which frequently focus on improving GA's operators for enhancing the search ability and lowering the converge time, we boldly abandon GA's operators and employ the dynamic probability that relies on the performance of each chromosome to determine feature selection in the new generation. The dynamic probability mechanism significantly reduces the parameter number in GA that making it easy to use. As each gene's probability is independent, the chromosome variety in GADP is more notable than in traditional GA, which ensures GADP has a wider search space and selects relevant features more effectively and accurately. To verify our method's superiority, we evaluate our method under multiple conditions on 15 datasets. The results demonstrate the outperformance of the proposed method. Generally, it has the best accuracy. Further, we also compare the proposed model to the popular heuristic methods like POS, FPA, and WOA. Our model still owns advantages over them.
Are Representations Built from the Ground Up? An Empirical Examination of Local Composition in Language Models
Compositionality, the phenomenon where the meaning of a phrase can be derived from its constituent parts, is a hallmark of human language. At the same time, many phrases are non-compositional, carrying a meaning beyond that of each part in isolation. Representing both of these types of phrases is critical for language understanding, but it is an open question whether modern language models (LMs) learn to do so; in this work we examine this question. We first formulate a problem of predicting the LM-internal representations of longer phrases given those of their constituents. We find that the representation of a parent phrase can be predicted with some accuracy given an affine transformation of its children. While we would expect the predictive accuracy to correlate with human judgments of semantic compositionality, we find this is largely not the case, indicating that LMs may not accurately distinguish between compositional and non-compositional phrases. We perform a variety of analyses, shedding light on when different varieties of LMs do and do not generate compositional representations, and discuss implications for future modeling work.
Choose Your Lenses: Flaws in Gender Bias Evaluation
Orgad, Hadas, Belinkov, Yonatan
Considerable efforts to measure and mitigate gender bias in recent years have led to the introduction of an abundance of tasks, datasets, and metrics used in this vein. In this position paper, we assess the current paradigm of gender bias evaluation and identify several flaws in it. First, we highlight the importance of extrinsic bias metrics that measure how a model's performance on some task is affected by gender, as opposed to intrinsic evaluations of model representations, which are less strongly connected to specific harms to people interacting with systems. We find that only a few extrinsic metrics are measured in most studies, although more can be measured. Second, we find that datasets and metrics are often coupled, and discuss how their coupling hinders the ability to obtain reliable conclusions, and how one may decouple them. We then investigate how the choice of the dataset and its composition, as well as the choice of the metric, affect bias measurement, finding significant variations across each of them. Finally, we propose several guidelines for more reliable gender bias evaluation.
Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks
Guan, Jiyang, Liang, Jian, He, Ran
An off-the-shelf model as a commercial service could be stolen by model stealing attacks, posing great threats to the rights of the model owner. Model fingerprinting aims to verify whether a suspect model is stolen from the victim model, which gains more and more attention nowadays. Previous methods always leverage the transferable adversarial examples as the model fingerprint, which is sensitive to adversarial defense or transfer learning scenarios. To address this issue, we consider the pairwise relationship between samples instead and propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC). Specifically, we present SAC-w that selects wrongly classified normal samples as model inputs and calculates the mean correlation among their model outputs. To reduce the training time, we further develop SAC-m that selects CutMix Augmented samples as model inputs, without the need for training the surrogate models or generating adversarial examples. Extensive results validate that SAC successfully defends against various model stealing attacks, even including adversarial training or transfer learning, and detects the stolen models with the best performance in terms of AUC across different datasets and model architectures. The codes are available at https://github.com/guanjiyang/SAC.
A Methodology for the Prediction of Drug Target Interaction using CDK Descriptors
Liyaqat, Tanya, Ahmad, Tanvir, Saxena, Chandni
Detecting probable Drug Target Interaction (DTI) is a critical task in drug discovery. Conventional DTI studies are expensive, labor-intensive, and take a lot of time, hence there are significant reasons to construct useful computational techniques that may successfully anticipate possible DTIs. Although certain methods have been developed for this cause, numerous interactions are yet to be discovered, and prediction accuracy is still low. To meet these challenges, we propose a DTI prediction model built on molecular structure of drugs and sequence of target proteins. In the proposed model, we use Simplified Molecular-Input Line-Entry System (SMILES) to create CDK descriptors, Molecular AC-Cess System (MACCS) fingerprints, Electrotopological state (Estate) fingerprints and amino-acid sequences of targets to get Pseudo Amino Acid Composition (PseAAC). We target to evaluate performance of DTI prediction models using CDK descriptors. For comparison, we use benchmark data and evaluate models' performance on two widely used fingerprints, MACCS fingerprints and Estate fingerprints. The evaluation of performances shows that CDK descriptors are superior at predicting DTIs. The proposed method also outperforms other previously published techniques significantly.
Improving Data Quality with Training Dynamics of Gradient Boosting Decision Trees
Ponti, Moacir Antonelli, Oliveira, Lucas de Angelis, Román, Juan Martín, Argerich, Luis
Real world datasets contain incorrectly labeled instances that hamper the performance of the model and, in particular, the ability to generalize out of distribution. Also, each example might have different contribution towards learning. This motivates studies to better understanding of the role of data instances with respect to their contribution in good metrics in models. In this paper we propose a method based on metrics computed from training dynamics of Gradient Boosting Decision Trees (GBDTs) to assess the behavior of each training example. We focus on datasets containing mostly tabular or structured data, for which the use of Decision Trees ensembles are still the state-of-the-art in terms of performance. We show results on detecting noisy labels in order to either remove them, improving models' metrics in synthetic and real datasets, as well as a productive dataset. Our methods achieved the best results overall when compared with confident learning and heuristics.
Inferring Implicit Relations in Complex Questions with Language Models
Katz, Uri, Geva, Mor, Berant, Jonathan
A prominent challenge for modern language understanding systems is the ability to answer implicit reasoning questions, where the required reasoning steps for answering the question are not mentioned in the text explicitly. In this work, we investigate why current models struggle with implicit reasoning question answering (QA) tasks, by decoupling inference of reasoning steps from their execution. We define a new task of implicit relation inference and construct a benchmark, IMPLICITRELATIONS, where given a question, a model should output a list of concept-relation pairs, where the relations describe the implicit reasoning steps required for answering the question. Using IMPLICITRELATIONS, we evaluate models from the GPT-3 family and find that, while these models struggle on the implicit reasoning QA task, they often succeed at inferring implicit relations. This suggests that the challenge in implicit reasoning questions does not stem from the need to plan a reasoning strategy alone, but to do it while also retrieving and reasoning over relevant information.
Representative Teacher Keys for Knowledge Distillation Model Compression Based on Attention Mechanism for Image Classification
Yang, Jun-Teng, Kao, Sheng-Che, Huang, Scott C. -H.
With the improvement of AI chips (e.g., GPU, TPU, and NPU) and the fast development of the Internet of Things (IoT), some robust deep neural networks (DNNs) are usually composed of millions or even hundreds of millions of parameters. Such a large model may not be suitable for directly deploying on low computation and low capacity units (e.g., edge devices). Knowledge distillation (KD) has recently been recognized as a powerful model compression method to decrease the model parameters effectively. The central concept of KD is to extract useful information from the feature maps of a large model (i.e., teacher model) as a reference to successfully train a small model (i.e., student model) in which the model size is much smaller than the teacher one. Although many KD methods have been proposed to utilize the information from the feature maps of intermediate layers in the teacher model, most did not consider the similarity of feature maps between the teacher model and the student model. As a result, it may make the student model learn useless information. Inspired by the attention mechanism, we propose a novel KD method called representative teacher key (RTK) that not only considers the similarity of feature maps but also filters out the useless information to improve the performance of the target student model. In the experiments, we validate our proposed method with several backbone networks (e.g., ResNet and WideResNet) and datasets (e.g., CIFAR10, CIFAR100, SVHN, and CINIC10). The results show that our proposed RTK can effectively improve the classification accuracy of the state-of-the-art attention-based KD method.