Collaborating Authors


How to build robust anomaly detectors with machine learning


Modern software applications are often comprised of distributed microservices. Consider typical Software as a Service (SaaS) applications, which are accessed through web interfaces and run on the cloud. In part due to their physically distributed nature, managing and monitoring performance in these complex systems is becoming increasingly difficult. When issues such as performance degradations arise, it can be challenging to identify and debug the root causes. At Ericsson's Global AI Accelerator, we're exploring data-science based monitoring solutions that can learn to identify and categorize anomalous system behavior, and thereby improve incident resolution times.

Copula-based anomaly scoring and localization for large-scale, high-dimensional continuous data Machine Learning

The anomaly detection method presented by this paper has a special feature: it does not only indicate whether an observation is anomalous or not but also tells what exactly makes an anomalous observation unusual. Hence, it provides support to localize the reason of the anomaly. The proposed approach is model-based; it relies on the multivariate probability distribution associated with the observations. Since the rare events are present in the tails of the probability distributions, we use copula functions, that are able to model the fat-tailed distributions well. The presented procedure scales well; it can cope with a large number of high-dimensional samples. Furthermore, our procedure can cope with missing values, too, which occur frequently in high-dimensional data sets. In the second part of the paper, we demonstrate the usability of the method through a case study, where we analyze a large data set consisting of the performance counters of a real mobile telecommunication network. Since such networks are complex systems, the signs of sub-optimal operation can remain hidden for a potentially long time. With the proposed procedure, many such hidden issues can be isolated and indicated to the network operator.

Anomaly Detection in Large Scale Networks with Latent Space Models Machine Learning

We develop a real-time anomaly detection algorithm for directed activity on large, sparse networks. We model the propensity for future activity using a dynamic logistic model with interaction terms for sender- and receiver-specific latent factors in addition to sender- and receiver-specific popularity scores; deviations from this underlying model constitute potential anomalies. Latent nodal attributes are estimated via a variational Bayesian approach and may change over time, representing natural shifts in network activity. Estimation is augmented with a case-control approximation to take advantage of the sparsity of the network and reduces computational complexity from $O(N^2)$ to $O(E)$, where $N$ is the number of nodes and $E$ is the number of observed edges. We run our algorithm on network event records collected from an enterprise network of over 25,000 computers and are able to identify a red team attack with half the detection rate required of the model without latent interaction terms.

Diversifying Database Activity Monitoring with Bandits Artificial Intelligence

Database activity monitoring (DAM) systems are commonly used by organizations to protect the organizational data, knowledge and intellectual properties. In order to protect organizations database DAM systems have two main roles, monitoring (documenting activity) and alerting to anomalous activity. Due to high-velocity streams and operating costs, such systems are restricted to examining only a sample of the activity. Current solutions use policies, manually crafted by experts, to decide which transactions to monitor and log. This limits the diversity of the data collected. Bandit algorithms, which use reward functions as the basis for optimization while adding diversity to the recommended set, have gained increased attention in recommendation systems for improving diversity. In this work, we redefine the data sampling problem as a special case of the multi-armed bandit (MAB) problem and present a novel algorithm, which combines expert knowledge with random exploration. We analyze the effect of diversity on coverage and downstream event detection tasks using a simulated dataset. In doing so, we find that adding diversity to the sampling using the bandit-based approach works well for this task and maximizing population coverage without decreasing the quality in terms of issuing alerts about events.

Anomaly Detection with Joint Representation Learning of Content and Connection Machine Learning

Social media sites are becoming a key factor in politics. These platforms are easy to manipulate for the purpose of distorting information space to confuse and distract voters. Past works to identify disruptive patterns are mostly focused on analyzing the content of tweets. In this study, we jointly embed the information from both user posted content as well as a user's follower network, to detect groups of densely connected users in an unsupervised fashion. We then investigate these dense sub-blocks of users to flag anomalous behavior. In our experiments, we study the tweets related to the upcoming 2019 Canadian Elections, and observe a set of densely-connected users engaging in local politics in different provinces, and exhibiting troll-like behavior.

Visual Analytics of Anomalous User Behaviors: A Survey Machine Learning

The increasing accessibility of data provides substantial opportunities for understanding user behaviors. Unearthing anomalies in user behaviors is of particular importance as it helps signal harmful incidents such as network intrusions, terrorist activities, and financial frauds. Many visual analytics methods have been proposed to help understand user behavior-related data in various application domains. In this work, we survey the state of art in visual analytics of anomalous user behaviors and classify them into four categories including social interaction, travel, network communication, and transaction. We further examine the research works in each category in terms of data types, anomaly detection techniques, and visualization techniques, and interaction methods. Finally, we discuss the findings and potential research directions.

Detecting the Onset of a Network Layer DoS Attack with a Graph-Based Approach

AAAI Conferences

A denial-of-service (DoS) attack is a malicious act with the goal of interrupting the access to a computer network. The result of DoS attack can cause the computers on the network to squander their resources to serve illegitimate requests that result in a disruption of the network's services to legitimate users. With a sophisticated DoS attack, it becomes difficult to distinguish malicious requests from legitimate requests. Since a network layer DoS attack can cause interruptions to a network while causing collateral damage, it is vital to understand the measures to mitigate against such attacks. Generally, approaches that implement distribution charts based on statistical analysis or honeypots have been applied to detect a DoS attack. However, this is usually too late, as the damage is already done. We hypothesize in this work that a graph-based approach can provide the capability to identify a DoS attack at its inception. A graph-based approach will also allow us to not only focus on anomalies within an entity (like a computer) but also allow us to analyze the anomalies that exist in an entity's relationship with other entities, thus providing a rich source of contextual analysis. We demonstrate our proposed approach using a publicly-available dataset.

A Probabilistic Framework to Node-level Anomaly Detection in Communication Networks Machine Learning

Abstract--In this paper we consider the task of detecting abnormal communication volume occurring at node-level in communication networks. The signal of the communication activity is modeled by means of a clique stream: each occurring communication event is instantaneous and activates an undirected subgraph spanning over a set of equally participating nodes. We present a probabilistic framework to model and assess the communication volume observed at any single node. Specifically, we employ nonparametric regression to learn the probability that a node takes part in a certain event knowing the set of other nodes that are involved. On the top of that, we present a concentration inequality around the estimated volume of events in which a node could participate, which in turn allows us to build an efficient and interpretable anomaly scoring function. Finally, the superior performance of the proposed approach is empirically demonstrated in real-world sensor network data, as well as using synthetic communication activity that is in accordance with that latter setting. I. INTRODUCTION Monitoring the activity in communication networks has become a popular area of research and particular attention has been paid to detection tasks such as spotting events or anomalies. Aneffective way to represent the communication activity is via a dynamic graph where the entities are considered to be nodes, and each communication event (or more simply event) to be represented by a set of connecting edges that appear at a specific time interval.

Cyber Anomaly Detection Using Graph-node Role-dynamics Machine Learning

Intrusion detection systems (IDSs) generate valuable knowledge about network security, but an abundance of false alarms and a lack of methods to capture the interdependence among alerts hampers their utility for network defense. Here, we explore a graph-based approach for fusing alerts generated by multiple IDSs (e.g., Snort, OSSEC, and Bro). Our approach generates a weighted graph of alert fields (not network topology) that makes explicit the connections between multiple alerts, IDS systems, and other cyber artifacts. We use this multi-modal graph to identify anomalous changes in the alert patterns of a network. To detect the anomalies, we apply the role-dynamics approach, which has successfully identified anomalies in social media, email, and IP communication graphs. In the cyber domain, each node (alert field) in the fused IDS alert graph is assigned a probability distribution across a small set of roles based on that node's features. A cyber attack should trigger IDS alerts and cause changes in the node features, but rather than track every feature for every alert-field node individually, roles provide a succinct, integrated summary of those feature changes. We measure changes in each node's probabilistic role assignment over time, and identify anomalies as deviations from expected roles. We test our approach using simulations including three weeks of normal background traffic, as well as cyber attacks that occur near the end of the simulations. This paper presents a novel approach to multi-modal data fusion and a novel application of role dynamics within the cyber-security domain. Our results show a drastic decrease in the false-positive rate when considering our anomaly indicator instead of the IDS alerts themselves, thereby reducing alarm fatigue and providing a promising avenue for threat intelligence in network defense.

Machine learning: Disrupting the cyber security industry


Despite the emergence of apps like Slack and Yammer for internal employee communication, email is still the dominant form of external employee communication for enterprises. "In a similar way that computers, servers and devices communicate with one another through data packets transmitted via TCP/IP, employees communicate with one another through natural language and documents shared via email," says Bishop. Why are account takeovers on the rise? And how can businesses prevent this method of attack? Asaf Cidon, from Barracuda Networks, helps Information Age answer these questions.