 Accuracy


Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection

arXiv.org Machine Learning

In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology and framework for efficient and effective real-time malware detection, leveraging the best of conventional machine learning (ML) and deep learning (DL) algorithms. In PROPEDEUTICA, all software processes in the system start execution subjected to a conventional ML detector for fast classification. If a piece of software receives a borderline classification, it is subjected to further analysis by more computationally expensive but more accurate DL methods, using our newly proposed DL algorithm DEEPMALWARE. We also introduce delays to the execution of software subjected to deep learning analysis as a way to "buy time" for DL analysis and to rate-limit the impact of possible malware in the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and 877 commonly used benign software samples from various categories for the Windows OS. Our results show that the false positive rate for conventional ML methods can reach 20%, and for modern DL methods it is usually below 6%. However, the classification time for DL can be 100X longer than for conventional ML methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional ML method) to 90.25%, and reduced the detection time by 54.86%. The percentage of software subjected to DL analysis was approximately 40% on average. In addition, the application of delays in software subjected to ML reduced the detection time by approximately 10%. Finally, we found and discussed a discrepancy between the detection accuracy offline (analysis after all traces are collected) and on-the-fly (analysis in tandem with trace collection). Our insights show that conventional ML and modern DL-based malware detectors in isolation cannot meet the needs of efficient and effective malware detection: high accuracy, low false positive rate, and short classification time.
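The two-stage idea in this abstract (a fast detector first, a slower detector only for borderline scores) can be sketched roughly as follows. This is a minimal illustration, not the PROPEDEUTICA implementation: the function name `cascade_classify`, the 0.3/0.7 borderline band, the 0.5 decision cutoff, and the toy scoring functions are all assumptions made for the example, and the execution delays described in the abstract are not modeled.

```python
from typing import Callable, Sequence, Tuple

Scorer = Callable[[Sequence[float]], float]  # returns an estimated P(malware) in [0, 1]


def cascade_classify(
    features: Sequence[float],
    fast_model: Scorer,
    slow_model: Scorer,
    lower: float = 0.3,  # hypothetical lower edge of the borderline band
    upper: float = 0.7,  # hypothetical upper edge of the borderline band
) -> Tuple[bool, str]:
    """Return (is_malware, stage_that_decided)."""
    p_fast = fast_model(features)
    if p_fast < lower:
        return False, "fast"  # confidently benign, no second stage needed
    if p_fast > upper:
        return True, "fast"  # confidently malicious, no second stage needed
    # Borderline score: defer to the slower, presumably more accurate model.
    p_slow = slow_model(features)
    return p_slow >= 0.5, "slow"


# Toy stand-ins for the two detectors (purely illustrative).
def toy_fast(features: Sequence[float]) -> float:
    return sum(features) / len(features)


def toy_slow(features: Sequence[float]) -> float:
    return max(features)


if __name__ == "__main__":
    for sample in ([0.1, 0.1], [0.9, 0.9], [0.2, 0.8]):
        print(sample, cascade_classify(sample, toy_fast, toy_slow))
```

Widening the band sends more samples to the slow stage, which trades classification time for accuracy; the abstract reports that roughly 40% of software reached the DL stage in the authors' setup.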


Where Classification Fails, Interpretation Rises

arXiv.org Machine Learning

An intriguing property of deep neural networks is their inherent vulnerability to adversarial inputs, which significantly hinders their application in security-critical domains. Most existing detection methods attempt to use carefully engineered patterns to distinguish adversarial inputs from their genuine counterparts; however, these can often be circumvented by adaptive adversaries. In this work, we take a completely different route by leveraging the definition of adversarial inputs: while deceptive to deep neural networks, they are barely discernible to human vision. Building upon recent advances in interpretable models, we construct a new detection framework that contrasts an input's interpretation against its classification. We validate the efficacy of this framework through extensive experiments using benchmark datasets and attacks. We believe that this work opens a new direction for designing adversarial input detection methods.


Distributional Equivalence and Structure Learning for Bow-free Acyclic Path Diagrams

arXiv.org Machine Learning

We consider the problem of structure learning for bow-free acyclic path diagrams (BAPs). BAPs can be viewed as a generalization of linear Gaussian DAG models that allow for certain hidden variables. We present a first method for this problem using a greedy score-based search algorithm. We also prove some necessary and some sufficient conditions for distributional equivalence of BAPs which are used in an algorithmic approach to compute (nearly) equivalent model structures. This allows us to infer lower bounds of causal effects. We also present applications to real and simulated datasets using our publicly available R-package.


Subject Selection on a Riemannian Manifold for Unsupervised Cross-subject Seizure Detection

arXiv.org Machine Learning

Inter-subject variability between individuals poses a challenge in inter-subject brain signal analysis problems. A new algorithm for subject-selection based on clustering covariance matrices on a Riemannian manifold is proposed. After unsupervised selection of the subsets of relevant subjects, data in a cluster is mapped to a tangent space at the mean point of covariance matrices in that cluster and an SVM classifier on labeled data from relevant subjects is trained. Experiment on an EEG seizure database shows that the proposed method increases the accuracy over state-of-the-art from 86.83% to 89.84% and specificity from 87.38% to 89.64% while reducing the false positive rate/hour from 0.8/hour to 0.77/hour.


Predicting Adolescent Suicide Attempts with Neural Networks

arXiv.org Machine Learning

Though suicide is a major public health problem in the US, machine learning methods are not commonly used to predict an individual's risk of attempting/committing suicide. In the present work, starting with an anonymized collection of electronic health records for 522,056 unique, California-resident adolescents, we develop neural network models to predict suicide attempts. We frame the problem as a binary classification problem in which we use a patient's data from 2006-2009 to predict either the presence (1) or absence (0) of a suicide attempt in 2010. After addressing issues such as severely imbalanced classes and the variable length of a patient's history, we build neural networks with depths varying from two to eight hidden layers. For test set observations where we have at least five ED/hospital visits' worth of data on a patient, our depth-4 model achieves a sensitivity of 0.703, specificity of 0.980, and AUC of 0.958.
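For readers less familiar with the metrics quoted above, sensitivity is the fraction of true positives that the model catches and specificity is the fraction of true negatives it correctly clears. A small sketch with made-up labels (not data from the paper) and a hypothetical helper name:

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


if __name__ == "__main__":
    # Illustrative labels only: 3 positives, 5 negatives.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    print(sensitivity_specificity(y_true, y_pred))
```

With severely imbalanced classes, as in this study, accuracy alone can look high even when sensitivity is poor, which is one reason both numbers are reported.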


Implementing Machine Learning Using Python and Scikit-learn

#artificialintelligence

For machine learning, you can also use these libraries to build learning models. However, doing so requires that you have a strong appreciation of the mathematical foundation for the various machine learning algorithms.


[D] Weighing softmax predictions based on the validation set confusion matrix, does it make sense? • r/MachineLearning

@machinelearnbot

Suppose I have a classification neural net for which I compute the confusion matrix on the validation set after my network has converged. What ways are there of using this matrix to reliably increase the accuracy on unseen data? I know of setting a per-class minimum confidence threshold. But would it make sense to reweight the softmax predictions, knowing that some class A is often misclassified as B by the network, etc.?
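One possible reading of the question, sketched below, is to turn the validation confusion matrix into estimates of P(true class | predicted class) and use them to redistribute the softmax mass. This is a heuristic illustration, not an established recipe: the helper names and the toy numbers are assumptions, and whether it actually helps on unseen data would need to be checked on a held-out set separate from the one used to build the matrix.

```python
from typing import List, Sequence


def column_normalize(confusion: Sequence[Sequence[float]]) -> List[List[float]]:
    """confusion[i][j] counts validation samples of true class i predicted as j.

    Returns m with m[i][j] estimating P(true = i | predicted = j).
    A class that was never predicted falls back to the identity column.
    """
    n = len(confusion)
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    return [
        [
            confusion[i][j] / col_sums[j] if col_sums[j] else float(i == j)
            for j in range(n)
        ]
        for i in range(n)
    ]


def adjust_softmax(
    probs: Sequence[float], p_true_given_pred: Sequence[Sequence[float]]
) -> List[float]:
    """Redistribute softmax mass: q[i] = sum_j P(true = i | pred = j) * probs[j]."""
    n = len(probs)
    adjusted = [
        sum(p_true_given_pred[i][j] * probs[j] for j in range(n)) for i in range(n)
    ]
    total = sum(adjusted)  # renormalize to guard against rounding drift
    return [a / total for a in adjusted]


if __name__ == "__main__":
    # Toy validation confusion matrix: class 0 ("A") is often predicted as class 1 ("B").
    confusion = [[6, 4], [0, 10]]
    m = column_normalize(confusion)
    print(adjust_softmax([0.4, 0.6], m))  # class 0 now receives the larger share
```

In the toy case the raw softmax favors B, but because 4 of the 14 validation samples predicted as B were really A, part of B's mass moves back to A and the decision flips.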


An Improved Naive Bayes Classifier-based Noise Detection Technique for Classifying User Phone Call Behavior

arXiv.org Machine Learning

The presence of noisy instances in mobile phone data is a fundamental issue for classifying user phone call behavior (i.e., accept, reject, missed and outgoing), with many potential negative consequences. The classification accuracy may decrease and the complexity of the classifiers may increase due to the number of redundant training samples. To detect such noisy instances in a training dataset, researchers use the naive Bayes classifier (NBC), as it identifies misclassified instances by taking into account the independence assumption and conditional probabilities of the attributes. However, some of these misclassified instances might indicate usage behavior patterns of individual mobile phone users. Existing naive Bayes classifier based noise detection techniques have not considered this issue and, thus, are lacking in classification accuracy. In this paper, we propose an improved noise detection technique based on the naive Bayes classifier for effectively classifying users' phone call behaviors. In order to improve the classification accuracy, we effectively identify noisy instances in the training dataset by analyzing the behavioral patterns of individuals. We dynamically determine a noise threshold according to an individual's unique behavioral patterns by using both the naive Bayes classifier and the Laplace estimator. We use this noise threshold to identify noisy instances. To measure the effectiveness of our technique in classifying user phone call behavior, we employ the most popular classification algorithm (e.g., decision tree). Experimental results on the real phone call log dataset show that our proposed technique more accurately identifies the noisy instances in the training datasets, which leads to better classification accuracy.


Learning Certifiably Optimal Rule Lists for Categorical Data

arXiv.org Machine Learning

We present the design and implementation of a custom discrete optimization technique for building rule lists over a categorical feature space. Our algorithm produces rule lists with optimal training performance, according to the regularized empirical risk, with a certificate of optimality. By leveraging algorithmic bounds, efficient data structures, and computational reuse, we achieve several orders of magnitude speedup in time and a massive reduction of memory consumption. We demonstrate that our approach produces optimal rule lists on practical problems in seconds. Our results indicate that it is possible to construct optimal sparse rule lists that are approximately as accurate as the COMPAS proprietary risk prediction tool on data from Broward County, Florida, but that are completely interpretable. This framework is a novel alternative to CART and other decision tree methods for interpretable modeling.
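To make "rule list" and "regularized empirical risk" concrete, the sketch below evaluates an ordered list of if-then rules and scores it as misclassification rate plus a penalty per rule. The exact objective form (error + lam * number of rules), the value of `lam`, and the toy features are assumptions for illustration; the search procedure and optimality certificate described in the abstract are not shown.

```python
from typing import Dict, List, Sequence, Tuple

Rule = Tuple[Dict[str, str], int]  # (conditions that must all match, predicted label)


def predict(rule_list: Sequence[Rule], default: int, x: Dict[str, str]) -> int:
    """Return the label of the first rule whose conditions all hold, else the default."""
    for conditions, label in rule_list:
        if all(x.get(key) == value for key, value in conditions.items()):
            return label
    return default


def objective(
    rule_list: Sequence[Rule],
    default: int,
    X: Sequence[Dict[str, str]],
    y: Sequence[int],
    lam: float = 0.01,  # hypothetical per-rule penalty
) -> float:
    """Misclassification rate plus lam times the number of rules (assumed form)."""
    errors = sum(predict(rule_list, default, x) != t for x, t in zip(X, y))
    return errors / len(y) + lam * len(rule_list)


if __name__ == "__main__":
    # Toy categorical data, invented for the example.
    X: List[Dict[str, str]] = [
        {"color": "red", "size": "big"},
        {"color": "red", "size": "small"},
        {"color": "blue", "size": "big"},
        {"color": "blue", "size": "small"},
    ]
    y = [1, 0, 1, 0]
    one_rule: List[Rule] = [({"size": "big"}, 1)]
    print(objective(one_rule, 0, X, y))  # fits the data with a single rule
    print(objective([], 0, X, y))  # empty list: always predicts the default
```

The penalty term is what favors sparse lists: an extra rule is only worth adding if it reduces the error rate by more than `lam`.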


Beyond Parity: Fairness Objectives for Collaborative Filtering

arXiv.org Artificial Intelligence

We study fairness in collaborative-filtering recommender systems, which are sensitive to discrimination that exists in historical data. Biased data can lead collaborative-filtering methods to make unfair predictions for users from minority groups. We identify the insufficiency of existing fairness metrics and propose four new metrics that address different forms of unfairness. These fairness metrics can be optimized by adding fairness terms to the learning objective. Experiments on synthetic and real data show that our new metrics can better measure fairness than the baseline, and that the fairness objectives effectively help reduce unfairness.