Accuracy
Comparative Sentiment Analysis of App Reviews
Ranjan, Sakshi, Mishra, Subhankar
Google app market captures the school of thought of users via ratings and text reviews. The critique's viewpoint regarding an app is proportional to their satisfaction level. Consequently, this helps other users to gain insights before downloading or purchasing the apps. The potential information from the reviews can't be extracted manually, due to its exponential growth. Sentiment analysis, by machine learning algorithms employing NLP, is used to explicitly uncover and interpret the emotions. This study aims to perform the sentiment classification of the app reviews and identify the university students' behavior towards the app market. We applied machine learning algorithms using the TF-IDF text representation scheme and the performance was evaluated on the ensemble learning method. Our model was trained on Google reviews and tested on students' reviews. SVM recorded the maximum accuracy(93.37\%), F-score(0.88) on tri-gram + TF-IDF scheme. Bagging enhanced the performance of LR and NB with accuracy of 87.80\% and 85.5\% respectively.
Constrained regret minimization for multi-criterion multi-armed bandits
Kagrecha, Anmol, Nair, Jayakrishnan, Jagannathan, Krishna
We consider a stochastic multi-armed bandit setting and study the problem of regret minimization over a given time horizon, subject to a risk constraint. Each arm is associated with an unknown cost/loss distribution. The learning agent is characterized by a risk-appetite that she is willing to tolerate, which we model using a pre-specified upper bound on the Conditional Value at Risk (CVaR). An optimal arm is one that minimizes the expected loss, among those arms that satisfy the CVaR constraint. The agent is interested in minimizing the number of pulls of suboptimal arms, including the ones that are 'too risky.' For this problem, we propose a Risk-Constrained Lower Confidence Bound (RC-LCB) algorithm, that guarantees logarithmic regret, i.e., the average number of plays of all non-optimal arms is at most logarithmic in the horizon. The algorithm also outputs a boolean flag that correctly identifies with high probability, whether the given instance was feasible/infeasible with respect to the risk constraint. We prove lower bounds on the performance of any risk-constrained regret minimization algorithm and establish a fundamental trade-off between regret minimization and feasibility identification. The proposed algorithm and analyses can be readily generalized to solve constrained multi-criterion optimization problems in the bandits setting.
Towards Early Diagnosis of Epilepsy from EEG Data
Lu, Diyuan, Bauer, Sebastian, Neubert, Valentin, Costard, Lara Sophie, Rosenow, Felix, Triesch, Jochen
Epilepsy is one of the most common neurological disorders, affecting about 1% of the population at all ages. Detecting the development of epilepsy, i.e., epileptogenesis (EPG), before any seizures occur could allow for early interventions and potentially more effective treatments. Here, we investigate if modern machine learning (ML) techniques can detect EPG from intra-cranial electroencephalography (EEG) recordings prior to the occurrence of any seizures. For this we use a rodent model of epilepsy where EPG is triggered by electrical stimulation of the brain. We propose a ML framework for EPG identification, which combines a deep convolutional neural network (CNN) with a prediction aggregation method to obtain the final classification decision. Specifically, the neural network is trained to distinguish five second segments of EEG recordings taken from either the pre-stimulation period or the post-stimulation period. Due to the gradual development of epilepsy, there is enormous overlap of the EEG patterns before and after the stimulation. Hence, a prediction aggregation process is introduced, which pools predictions over a longer period. By aggregating predictions over one hour, our approach achieves an area under the curve (AUC) of 0.99 on the EPG detection task. This demonstrates the feasibility of EPG prediction from EEG recordings.
Deep Neural Networks for the Sequential Probability Ratio Test on Non-i.i.d. Data Series
Ebihara, Akinori F., Miyagawa, Taiki, Sakurai, Kazuyuki, Imaoka, Hitoshi
Classifying sequential data as early as and as accurately as possible is a challenging yet critical problem, especially when a sampling cost is high. One algorithm that achieves this goal is the sequential probability ratio test (SPRT), which is known as Bayes-optimal: it can keep the expected number of data samples as small as possible, given the desired error upper-bound. The SPRT has recently been found to be the best model that explains the activities of the neurons in the primate parietal cortex that are thought to mediate our complex decision-making processes. However, the original SPRT makes two critical assumptions that limit its application in real-world scenarios: (i) samples are independently and identically distributed, and (ii) the likelihood of the data being derived from each class can be calculated precisely. Here, we propose the SPRT-TANDEM, a deep neural network-based SPRT algorithm that overcomes the above two obstacles. The SPRT-TANDEM estimates the log-likelihood ratio of two alternative hypotheses by leveraging a novel Loss function for Log-Likelihood Ratio estimation (LLLR), while allowing for correlations up to $N (\in \mathbb{N})$ preceding samples. In tests on one original and two public video databases, Nosaic MNIST, UCF101, and SiW, the SPRT-TANDEM achieves statistically significantly better classification accuracy than other baseline classifiers, with a smaller number of data samples. The code and Nosaic MNIST are publicly available at https://github.com/TaikiMiyagawa/SPRT-TANDEM.
Fine-Tuning DARTS for Image Classification
Tanveer, Muhammad Suhaib, Khan, Muhammad Umar Karim, Kyung, Chong-Min
Neural Architecture Search (NAS) has gained attraction due to superior classification performance. Differential Architecture Search (DARTS) is a computationally light method. To limit computational resources DARTS makes numerous approximations. These approximations result in inferior performance. We propose to fine-tune DARTS using fixed operations as they are independent of these approximations. Our method offers a good trade-off between the number of parameters and classification accuracy. Our approach improves the top-1 accuracy on Fashion-MNIST, CompCars, and MIO-TCD datasets by 0.56%, 0.50%, and 0.39%, respectively compared to the state-of-the-art approaches. Our approach performs better than DARTS, improving the accuracy by 0.28%, 1.64%, 0.34%, 4.5%, and 3.27% compared to DARTS, on CIFAR-10, CIFAR-100, Fashion-MNIST, CompCars, and MIO-TCD datasets, respectively.
MCRapper: Monte-Carlo Rademacher Averages for Poset Families and Approximate Pattern Mining
Pellegrina, Leonardo, Cousins, Cyrus, Vandin, Fabio, Riondato, Matteo
We present MCRapper, an algorithm for efficient computation of Monte-Carlo Empirical Rademacher Averages (MCERA) for families of functions exhibiting poset (e.g., lattice) structure, such as those that arise in many pattern mining tasks. The MCERA allows us to compute upper bounds to the maximum deviation of sample means from their expectations, thus it can be used to find both statistically-significant functions (i.e., patterns) when the available data is seen as a sample from an unknown distribution, and approximations of collections of high-expectation functions (e.g., frequent patterns) when the available data is a small sample from a large dataset. This feature is a strong improvement over previously proposed solutions that could only achieve one of the two. MCRapper uses upper bounds to the discrepancy of the functions to efficiently explore and prune the search space, a technique borrowed from pattern mining itself. To show the practical use of MCRapper, we employ it to develop an algorithm TFP-R for the task of True Frequent Pattern (TFP) mining. TFP-R gives guarantees on the probability of including any false positives (precision) and exhibits higher statistical power (recall) than existing methods offering the same guarantees. We evaluate MCRapper and TFP-R and show that they outperform the state-of-the-art for their respective tasks.
A Bayesian incorporated linear non-Gaussian acyclic model for multiple directed graph estimation to study brain emotion circuit development in adolescence
Zhang, Aiying, Zhang, Gemeng, Cai, Biao, Wilson, Tony W., Stephen, Julia M., Calhoun, Vince D., Wang, Yu-Ping
Emotion perception is essential to affective and cognitive development which involves distributed brain circuits. The ability of emotion identification begins in infancy and continues to develop throughout childhood and adolescence. Understanding the development of brain's emotion circuitry may help us explain the emotional changes observed during adolescence. Our previous study delineated the trajectory of brain functional connectivity (FC) from late childhood to early adulthood during emotion identification tasks. In this work, we endeavour to deepen our understanding from association to causation. We proposed a Bayesian incorporated linear non-Gaussian acyclic model (BiLiNGAM), which incorporated our previous association model into the prior estimation pipeline. In particular, it can jointly estimate multiple directed acyclic graphs (DAGs) for multiple age groups at different developmental stages. Simulation results indicated more stable and accurate performance over various settings, especially when the sample size was small (high-dimensional cases). We then applied to the analysis of real data from the Philadelphia Neurodevelopmental Cohort (PNC). This included 855 individuals aged 8-22 years who were divided into five different adolescent stages. Our network analysis revealed the development of emotion-related intra- and inter- modular connectivity and pinpointed several emotion-related hubs. We further categorized the hubs into two types: in-hubs and out-hubs, as the center of receiving and distributing information. Several unique developmental hub structures and group-specific patterns were also discovered. Our findings help provide a causal understanding of emotion development in the human brain.
A One-Pass Private Sketch for Most Machine Learning Tasks
Coleman, Benjamin, Shrivastava, Anshumali
Differential privacy (DP) is a compelling privacy definition that explains the privacy-utility tradeoff via formal, provable guarantees. Inspired by recent progress toward general-purpose data release algorithms, we propose a private sketch, or small summary of the dataset, that supports a multitude of machine learning tasks including regression, classification, density estimation, near-neighbor search, and more. Our sketch consists of randomized contingency tables that are indexed with locality-sensitive hashing and constructed with an efficient one-pass algorithm. We prove competitive error bounds for DP kernel density estimation. Existing methods for DP kernel density estimation scale poorly, often exponentially slower with an increase in dimensions. In contrast, our sketch can quickly run on large, high-dimensional datasets in a single pass. Exhaustive experiments show that our generic sketch delivers a similar privacy-utility tradeoff when compared to existing DP methods at a fraction of the computation cost. We expect that our sketch will enable differential privacy in distributed, large-scale machine learning settings.
Post-Hoc Methods for Debiasing Neural Networks
Savani, Yash, White, Colin, Govindarajulu, Naveen Sundar
As deep learning models become tasked with more and more decisions that impact human lives, such as hiring, criminal recidivism, and loan repayment, bias is becoming a growing concern. This has led to dozens of definitions of fairness and numerous algorithmic techniques to improve the fairness of neural networks. Most debiasing algorithms require retraining a neural network from scratch, however, this is not feasible in many applications, especially when the model takes days to train or when the full training dataset is no longer available. In this work, we present a study on post-hoc methods for debiasing neural networks. First we study the nature of the problem, showing that the difficulty of post-hoc debiasing is highly dependent on the initial conditions of the original model. Then we define three new fine-tuning techniques: random perturbation, layer-wise optimization, and adversarial fine-tuning. All three techniques work for any group fairness constraint. We give a comparison with six algorithms - three popular post-processing debiasing algorithms and our three proposed methods - across three datasets and three popular bias measures. We show that no post-hoc debiasing technique dominates all others, and we identify settings in which each algorithm performs the best. Our code is available at https://github.com/realityengines/post_hoc_debiasing.
A Survey of Machine Learning Methods and Challenges for Windows Malware Classification
Raff, Edward, Nicholas, Charles
Malware classification is a difficult problem, to which machine learning methods have been applied for decades. Yet progress has often been slow, in part due to a number of unique difficulties with the task that occur through all stages of the developing a machine learning system: data collection, labeling, feature creation and selection, model selection, and evaluation. In this survey we will review a number of the current methods and challenges related to malware classification, including data collection, feature extraction, and model construction, and evaluation. Our discussion will include thoughts on the constraints that must be considered for machine learning based solutions in this domain, and yet to be tackled problems for which machine learning could also provide a solution. This survey aims to be useful both to cybersecurity practitioners who wish to learn more about how machine learning can be applied to the malware problem, and to give data scientists the necessary background into the challenges in this uniquely complicated space.