Subset Multivariate Collective And Point Anomaly Detection

Fisch, Alexander T M, Eckley, Idris A, Fearnhead, Paul

arXiv.org Machine Learning 

In recent years, there has been a growing interest in identifying anomalous structure within multivariate data streams. We consider the problem of detecting collective anomalies, corresponding to intervals where one or more of the data streams behaves anomalously. We first develop a test for a single collective anomaly that has power to simultaneously detect anomalies that are either rare, that is affecting few data streams, or common. We then show how to detect multiple anomalies in a way that is computationally efficient but avoids the approximations inherent in binary segmentation-like approaches. This approach, which we call MVCAPA, is shown to consistently estimate the number and location of the collective anomalies, a property that has not previously been shown for competing methods. MVCAPA can be made robust to point anomalies and can allow for the anomalies to be imperfectly aligned. We show the practical usefulness of allowing for imperfect alignments through a resulting increase in power to detect regions of copy number variation.