Applications of Signature Methods to Market Anomaly Detection

Akyildirim, Erdinc, Gambara, Matteo, Teichmann, Josef, Zhou, Syang

arXiv.org Machine Learning 

While these instances are called outliers (anomalies), the normal instances are called inliers. Anomaly detection is a fundamental research problem that has been investigated across diverse research fields and application areas. It can be carried out manually by searching through whole data clouds to diagnose the problem, but this is clearly a long and labour-intensive process. Anomaly detection often arises in the context of uncertainty, i.e. the absence, in principle or not, of knowledge about the data-generating process. Hence, over time, a plethora of anomaly detection techniques, ranging from simple statistical techniques to complex machine learning algorithms, has been developed for specific application areas such as fraud detection in financial transactions (West and Bhattacharya (2016)), fault detection in production (Miljković (2011)), and intrusion detection in computer networks (Sabahi and Movaghar (2008)). Some well-known statistical methods, such as the z-score, the Tukey method (interquartile range), or Gaussian mixture models, can be useful for the initial screening of outliers. Although these statistical or econometric anomaly detection methods are well rooted in the literature, dating back to Edgeworth (1887) (we refer the reader to Chandola et al. (2009) for an extensive review), many of them have failed to provide sufficient performance and accuracy in the last decade. This is mainly due to the big data now collected from sources such as financial transactions, health records, and surveillance logs. Nowadays, high-volume, high-velocity, and high-variety data sets demand cost-effective, novel data analytics to support decision-making and to infer useful insights.
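To make the initial-screening step concrete, the following is a minimal Python sketch of two of the statistical methods named above, the z-score and the Tukey (interquartile range) rule. It is an illustration only and not the method developed in this paper: the function names, the thresholds (3 standard deviations, fences at 1.5 times the IQR) and the toy `returns` array are all assumptions chosen for the example.

```python
import numpy as np

def zscore_outliers(x, threshold=3.0):
    """Flag points whose absolute z-score exceeds `threshold` (assumed default: 3)."""
    x = np.asarray(x, dtype=float)
    std = x.std()  # population standard deviation (ddof=0)
    if std == 0:
        # Constant series: no point deviates from the mean.
        return np.zeros(x.shape, dtype=bool)
    return np.abs(x - x.mean()) / std > threshold

def tukey_outliers(x, k=1.5):
    """Flag points outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR] (assumed k: 1.5)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

# Hypothetical toy data: nine ordinary observations and one extreme value.
returns = np.array([0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, 8.0])

print("z-score flags:", np.flatnonzero(zscore_outliers(returns)))
print("Tukey flags:  ", np.flatnonzero(tukey_outliers(returns)))
```

On this toy sample the Tukey rule flags the extreme value, whereas the z-score rule with the default threshold does not: the extreme point inflates both the mean and the standard deviation, so its own z-score stays just below 3. This kind of masking is one reason such rules serve only as a first screen.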