 Performance Analysis


Online Multivariate Anomaly Detection and Localization for High-dimensional Settings

arXiv.org Machine Learning

This paper considers the real-time detection of anomalies in high-dimensional systems. The goal is to detect anomalies quickly and accurately so that appropriate countermeasures can be taken in time, before the system is harmed. We propose a sequential and multivariate anomaly detection method that scales well to high-dimensional datasets. The proposed method follows a nonparametric (i.e., data-driven) and semi-supervised approach (i.e., it trains only on nominal data). Thus, it is applicable to a wide range of applications and data types. Thanks to its multivariate nature, it can quickly and accurately detect challenging anomalies, such as changes in the correlation structure and stealthy low-rate cyberattacks. Its asymptotic optimality and computational complexity are comprehensively analyzed. In conjunction with the detection method, an effective technique for localizing the anomalous data dimensions is also proposed. We further extend the proposed detection and localization methods to a supervised setup where an additional anomaly dataset is available, and combine the proposed semi-supervised and supervised algorithms to obtain an online learning algorithm under the semi-supervised framework. The practical use of the proposed algorithms is demonstrated in DDoS attack mitigation, and their performance is evaluated using a real IoT-botnet dataset and simulations.


Learning from Context: Exploiting and Interpreting File Path Information for Better Malware Detection

arXiv.org Artificial Intelligence

Machine learning (ML) used for static portable executable (PE) malware detection typically employs per-file numerical feature vector representations as input with one or more target labels during training. However, there is much orthogonal information that can be gleaned from the context in which the file was seen. In this paper, we propose utilizing a static source of contextual information -- the path of the PE file -- as an auxiliary input to the classifier. While file paths are not malicious or benign in and of themselves, they do provide valuable context for a malicious/benign determination. Unlike dynamic contextual information, file paths are available with little overhead and can seamlessly be integrated into a multi-view static ML detector, yielding higher detection rates at very high throughput with minimal infrastructural changes. Here we propose a multi-view neural network, which takes feature vectors from PE file content as well as corresponding file paths as inputs and outputs a detection score. To ensure realistic evaluation, we use a dataset of approximately 10 million samples -- files and file paths from user endpoints of an actual security vendor network. We then conduct an interpretability analysis via LIME modeling to ensure that our classifier has learned a sensible representation and to see which parts of the file path contributed most to changes in the classifier's score. We find that our model learns useful aspects of the file path for classification, while also learning artifacts from customers testing the vendor's product, e.g., by downloading a directory of malware samples each named as their hash. We prune these artifacts from our test dataset and demonstrate reductions in false negative rate of 32.3% at a $10^{-3}$ false positive rate (FPR) and 33.1% at $10^{-4}$ FPR, over a single-input model of similar topology that uses PE file content only.
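
The evaluation metric named above, false negative rate at a fixed false positive rate, can be illustrated with a minimal sketch. The function name `fnr_at_fpr`, the tie-handling rule, and the toy scores are assumptions made for illustration; they are not taken from the paper.

```python
def fnr_at_fpr(scores, labels, target_fpr):
    """False negative rate at the lowest threshold whose FPR stays <= target_fpr.

    scores: detection scores, where higher means "more likely malicious".
    labels: 1 for malicious, 0 for benign.
    A sample is flagged when its score is strictly above the threshold.
    """
    benign = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    malicious = [s for s, y in zip(scores, labels) if y == 1]
    if not benign or not malicious:
        raise ValueError("need at least one benign and one malicious sample")

    # How many benign samples may be flagged without exceeding the target FPR.
    allowed_fp = int(target_fpr * len(benign))
    if allowed_fp >= len(benign):
        threshold = float("-inf")  # every sample may be flagged
    else:
        # Scores strictly above this value include at most allowed_fp benign ones.
        threshold = benign[allowed_fp]

    missed = sum(1 for s in malicious if s <= threshold)
    return missed / len(malicious)


# Toy data (hypothetical): four malicious and four benign samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
print(fnr_at_fpr(scores, labels, 0.0))
print(fnr_at_fpr(scores, labels, 0.25))
```

At very low target rates such as $10^{-3}$ or $10^{-4}$, the benign set must contain many thousands of samples before the threshold is meaningful, which is one reason a large evaluation set matters for this kind of comparison.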


KitcheNette: Predicting and Recommending Food Ingredient Pairings using Siamese Neural Networks

arXiv.org Machine Learning

As a vast number of ingredients exist in the culinary world, there are countless food ingredient pairings, but only a small number of pairings have been adopted by chefs and studied by food researchers. In this work, we propose KitcheNette, a model that predicts food ingredient pairing scores and recommends optimal ingredient pairings. KitcheNette employs Siamese neural networks and is trained on our annotated dataset containing 300K scores of pairings generated from numerous ingredients in food recipes. As the results demonstrate, our model not only outperforms other baseline models but can also recommend complementary food pairings and discover novel ingredient pairings.


Vector Field Neural Networks

arXiv.org Machine Learning

This work begins by establishing a mathematical formalization between different geometrical interpretations of Neural Networks, providing a first contribution. From this starting point, a new interpretation is explored, using the idea of implicit vector fields moving data as particles in a flow. A new architecture, Vector Fields Neural Networks (VFNN), is proposed based on this interpretation, with the vector field becoming explicit. A specific implementation of the VFNN using Euler's method to solve ordinary differential equations (ODEs) and Gaussian vector fields is tested. The first experiments present visual results highlighting the important features of the new architecture and providing another contribution: a geometrically interpretable regularization of model parameters. Then, the new architecture is evaluated for different hyperparameters and inputs, with the objective of evaluating their influence on model performance, computational time, and complexity. The VFNN model is compared against the known basic models Naive Bayes, Feed Forward Neural Networks, and Support Vector Machines (SVM), showing comparable or better results for different datasets. Finally, the conclusion provides many new questions and ideas for improvement of the model that can be used to increase model performance.
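
The idea of moving data points along an explicit Gaussian vector field with Euler's method can be sketched as follows. The parameterization (a sum of Gaussian bumps with hypothetical `centers`, `directions`, and width `gamma`) and the function names are assumptions for illustration; the abstract does not specify the exact form used in the paper, and in a trained model these parameters would be learned.

```python
import math


def gaussian_vector_field(x, centers, directions, gamma=1.0):
    """Assumed field: V(x) = sum_i directions[i] * exp(-gamma * ||x - centers[i]||^2)."""
    out = [0.0] * len(x)
    for mu, v in zip(centers, directions):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        weight = math.exp(-gamma * sq_dist)
        for d in range(len(x)):
            out[d] += weight * v[d]
    return out


def euler_flow(x, centers, directions, step=0.1, n_steps=10, gamma=1.0):
    """Integrate dx/dt = V(x) with explicit Euler steps: x <- x + step * V(x)."""
    x = list(x)
    for _ in range(n_steps):
        v = gaussian_vector_field(x, centers, directions, gamma)
        x = [xi + step * vi for xi, vi in zip(x, v)]
    return x


# One Gaussian bump at the origin pushing points along the first axis.
centers = [[0.0, 0.0]]
directions = [[1.0, 0.0]]
print(euler_flow([0.0, 0.0], centers, directions, step=0.5, n_steps=1))
print(euler_flow([0.0, 1.0], centers, directions, step=0.1, n_steps=20))
```

Points near a center are displaced strongly in that center's direction, while distant points barely move, so the field can reshape the data before a simple classifier is applied to the transformed points.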


Fairness in Machine Learning with Tractable Models

arXiv.org Machine Learning

Machine Learning techniques have become pervasive across a range of different applications, and are now widely used in areas as disparate as recidivism prediction, consumer credit-risk analysis and insurance pricing. The prevalence of machine learning techniques has raised concerns about the potential for learned algorithms to become biased against certain groups. Many definitions of fairness have been proposed in the literature, but the fundamental task of reasoning about probabilistic events is a challenging one, owing to the intractability of inference. The focus of this paper is taking steps towards the application of tractable models to fairness. Tractable probabilistic models have emerged that guarantee that conditional marginals can be computed in time linear in the size of the model. In particular, we show that sum product networks (SPNs) enable an effective technique for determining the statistical relationships between protected attributes and other training variables. If a subset of these training variables is found by the SPN to be independent of the protected attribute, then they can be considered 'safe' variables, from which we can train a classification model without concern that the resulting classifier will result in disparate outcomes for different demographic groups. Our initial experiments on the 'German Credit' data set indicate that this processing technique significantly reduces disparate treatment of male and female credit applicants, with a small reduction in classification accuracy compared to state of the art. We also motivate the concept of "fairness through percentile equivalence", a new definition predicated on the notion that individuals at the same percentile of their respective distributions should be treated equivalently, which prevents unfair penalisation of those individuals who lie at the extremities of their respective distributions.
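
The notion behind "fairness through percentile equivalence" can be illustrated with a small sketch that locates an individual within their own group's distribution. The helper name `percentile_in_group` and the toy values are hypothetical, and the empirical CDF is only one possible way to define a percentile; the paper's procedure may differ.

```python
from bisect import bisect_right


def percentile_in_group(value, group_values):
    """Fraction of the group's values that are <= value (empirical CDF)."""
    ordered = sorted(group_values)
    return bisect_right(ordered, value) / len(ordered)


# Two hypothetical groups whose raw values lie on very different scales.
group_a = [10, 20, 30, 40]
group_b = [100, 200, 300, 400]

# Both individuals sit at the same position within their own group.
print(percentile_in_group(30, group_a))
print(percentile_in_group(300, group_b))
```

Under this reading, the two individuals would be treated equivalently because they occupy the same percentile of their respective distributions, even though their raw values differ by an order of magnitude.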


Accuracy Improvement of Neural Network Training using Particle Swarm Optimization and its Stability Analysis for Classification

arXiv.org Machine Learning

Supervised classification is one of the most active emerging research areas today. In this context, Artificial Neural Network (ANN) techniques have been widely employed and attract growing interest from researchers. ANN training aims to find the proper setting of parameters such as weights ($\textbf{W}$) and biases ($b$) to properly classify the given data samples. The training process is formulated as an error minimization problem with many local optima in the search landscape. In this paper, an enhanced Particle Swarm Optimization is proposed to minimize the error function for classifying real-life data sets. A stability analysis is performed to establish the efficiency of the proposed method for improving classification accuracy. Performance measures such as the confusion matrix, $F$-measure, and convergence graph indicate a significant improvement in classification accuracy.
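
For orientation, a standard global-best Particle Swarm Optimization loop is sketched below. This is the textbook algorithm, not the enhanced variant the paper proposes, whose details the abstract does not give. The function names, the coefficient values, and the `sphere` stand-in loss are assumptions; in ANN training the loss would be the classification error as a function of the flattened weights and biases.

```python
import random


def pso_minimize(loss, dim, n_particles=20, n_iters=200, bounds=(-5.0, 5.0),
                 inertia=0.729, c_cog=1.49445, c_soc=1.49445, seed=0):
    """Minimize `loss` over R^dim with standard global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position; the swarm shares one best.
    pbest = [p[:] for p in pos]
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus pulls toward the personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cog * r1 * (pbest[i][d] - pos[i][d])
                             + c_soc * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(w):
    """Stand-in error function with a single minimum at the origin."""
    return sum(x * x for x in w)


best, best_val = pso_minimize(sphere, dim=2)
print(best, best_val)
```

Because the update uses only loss evaluations and no gradients, the same loop applies to error surfaces with many local optima, although the plain version shown here offers no guarantee of escaping them.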


Discovering Suspicious Patterns Using a Graph Based Approach

AAAI Conferences

Recently, there has been much attention on tools and techniques for visualizing and acquiring new knowledge and insights. In the VAST 2018 competition, one of the challenges is to discover the fraudulent group of employees at Kasios, a furniture manufacturing company. In this work, we use a graph-based approach that analyzes the data for suspicious employee activities at Kasios. Graph-based approaches enable one to handle rich contextual data and provide a deeper understanding of data due to the ability to discover patterns in databases that are not easily found using traditional query or statistical tools. We focus on graph-based knowledge discovery in structural data to mine for interesting patterns and anomalies. Our approach first reports the normative patterns in the data, and then discovers any anomalous patterns associated with the previously discovered patterns. For visualizing the suspicious patterns, we also use the enterprise graph database Neo4j, whose Neo4j Browser provides a way to visualize graph structures.


SMART: Semantic Malware Attribute Relevance Tagging

arXiv.org Machine Learning

With the rapid proliferation and increased sophistication of malicious software (malware), detection methods no longer rely only on manually generated signatures but have also incorporated more general approaches like Machine Learning (ML) detection. Although powerful for conviction of malicious artifacts, these methods do not produce any further information about the type of malware that has been detected. In this work, we address the information gap between ML and signature-based detection methods by introducing an ML-based tagging model that generates human interpretable semantic descriptions of malicious software (e.g. file-infector, coin-miner), and argue that for less prevalent malware campaigns these provide potentially more useful and flexible information than malware family names. For this, we first introduce a method for deriving high-level descriptions of malware files from an ensemble of vendor family names. Then we formalize the problem of malware description as a tagging problem and propose a joint embedding deep neural network architecture that can learn to characterize portable executable (PE) files based on static analysis, thus not requiring a dynamic trace to identify behaviors at deployment time. We empirically demonstrate that when evaluated against tags extracted from an ensemble of anti-virus detection names, the proposed tagging model correctly identifies more than 93.7% of eleven possible tag descriptions for a given sample, at a deployable false positive rate (FPR) of 1% per tag. Furthermore, we show that when evaluating this model against ground truth tags derived from the results of dynamic analysis, it correctly predicts 93.5% of the labels for a given sample. These results suggest that an ML tagging model can be effectively deployed alongside a detection model for malware description.


Detecting Slow HTTP POST DoS Attacks Using Netflow Features

AAAI Conferences

Network security is a constant challenge, with new attacks and vulnerabilities being frequently introduced. Application layer Denial of Service (DoS) attacks are a rising attack variant, which inflicts network stress and service interruptions. The implementation of detection and mitigation techniques for such attacks has been a priority for some time, but more sophisticated attack permutations are constantly being introduced, often making prior prevention techniques ineffective. In this work, we focus specifically on the detection of Slow HTTP POST DoS attacks. We execute several Slow HTTP POST attack configurations within a live network environment to represent a real-world attack scenario, with varying levels of severity. For our methodology, we utilize features of network flow (Netflow) traffic to detect these attack configurations. Netflow has proven to be a more scalable solution compared to full packet capture when performing data collection, allowing for near real-time network monitoring. Eight machine learners were implemented to determine which learner would achieve optimal performance metrics when detecting Slow HTTP POST attacks. As our data is very large, we also evaluate the use of data sampling techniques to increase attack detection performance. Overall, our results show a high detection rate for Slow HTTP POST attacks, with relatively low false alarm rates.


Detecting the Onset of a Network Layer DoS Attack with a Graph-Based Approach

AAAI Conferences

A denial-of-service (DoS) attack is a malicious act with the goal of interrupting access to a computer network. A DoS attack can cause the computers on the network to squander their resources serving illegitimate requests, which disrupts the network’s services to legitimate users. With a sophisticated DoS attack, it becomes difficult to distinguish malicious requests from legitimate requests. Since a network layer DoS attack can cause interruptions to a network while causing collateral damage, it is vital to understand the measures to mitigate such attacks. Generally, approaches that implement distribution charts based on statistical analysis or honeypots have been applied to detect a DoS attack. However, such detection usually comes too late, as the damage is already done. We hypothesize in this work that a graph-based approach can provide the capability to identify a DoS attack at its inception. A graph-based approach also allows us not only to focus on anomalies within an entity (like a computer) but also to analyze the anomalies that exist in an entity’s relationship with other entities, thus providing a rich source of contextual analysis. We demonstrate our proposed approach using a publicly available dataset.