Anomaly Detection for Network Connection Logs

Mehta, Swapneel; Kothuri, Prasanth; Garcia, Daniel Lanza

arXiv.org Machine Learning 

We leverage a streaming architecture based on ELK, Spark, and Hadoop to collect, store, and analyse database connection logs in near real-time. The proposed system investigates outliers using unsupervised learning, applying widely adopted clustering and classification algorithms to log data and highlighting the subtle variances in each model through visualisation of outliers. Arriving at a novel solution for evaluating untagged, unfiltered connection logs, we propose an approach that can be extrapolated to a generalised system for analysing connection logs across a large infrastructure comprising thousands of individual nodes and generating hundreds of log lines per second.

I. INTRODUCTION

Anomaly detection is a classic problem across diverse use cases, ranging from scientific observations to financial transactions. We define an anomaly as a single observation, or a set of observations, that fails to conform to a group of properties exhibited by larger collections of such observations.
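To make the definition of an anomaly concrete, the following minimal sketch flags observations that deviate strongly from the bulk of a collection. It is not the method used in this work: it substitutes a simple median absolute deviation (MAD) rule for the clustering models, and the function name `find_anomalies`, the threshold of 3.5, and the sample counts are all assumptions chosen purely for illustration.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    Illustrative only: a MAD-based rule, not the clustering approach
    described in the text.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # The typical values are identical, so any deviation is flagged.
        return [i for i, d in enumerate(abs_dev) if d > 0]
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical connections-per-minute counts for a single database node.
counts = [12, 15, 11, 14, 13, 240, 12, 16]
print(find_anomalies(counts))  # prints [5]
```

Here the burst of 240 connections fails to conform to the range exhibited by the other observations, so only its index is reported.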
