Variance Norms for Kernelized Anomaly Detection
Thomas Cass, Lukas Gonon, Nikita Zozoulenko
We present a unified theory for Mahalanobis-type anomaly detection on Banach spaces, using ideas from Cameron-Martin theory applied to non-Gaussian measures. This approach leads to a basis-free, data-driven notion of anomaly distance through the so-called variance norm of a probability measure, which can be consistently estimated using empirical measures. Our framework generalizes the classical $\mathbb{R}^d$, functional $(L^2[0,1])^d$, and kernelized settings, including the general case of a non-injective covariance operator. We prove that the variance norm depends solely on the inner product in a given Hilbert space, and hence that the kernelized Mahalanobis distance can naturally be recovered by working on reproducing kernel Hilbert spaces. Using the variance norm, we introduce the notion of a kernelized nearest-neighbour Mahalanobis distance for semi-supervised anomaly detection. In an empirical study on 12 real-world datasets, we demonstrate that the kernelized nearest-neighbour Mahalanobis distance outperforms the traditional kernelized Mahalanobis distance for multivariate time series anomaly detection, using state-of-the-art time series kernels such as the signature, global alignment, and Volterra reservoir kernels. Moreover, we provide an initial theoretical justification of nearest-neighbour Mahalanobis distances by developing concentration inequalities in the finite-dimensional Gaussian case.
Jul-16-2024
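To make the two distances named in the abstract concrete, here is a minimal sketch restricted to the classical $\mathbb{R}^d$ setting. It is not the authors' implementation. It assumes that in finite dimensions the variance norm of a vector $v$ reduces to $\sqrt{v^\top \Sigma^{+} v}$ when $v$ lies in the range of the empirical covariance $\Sigma$ and is infinite otherwise, and that the nearest-neighbour Mahalanobis distance is the smallest variance norm of $x - x_i$ over the corpus points $x_i$. The function names `variance_norm`, `mahalanobis`, and `nn_mahalanobis` and the tolerance parameters are hypothetical.

```python
import numpy as np


def variance_norm(v, cov, rtol=1e-10):
    """Assumed finite-dimensional variance norm: sqrt(v^T cov^+ v).

    Returns inf when v has a component outside the range of cov, which is
    how a non-injective (singular) covariance is handled in this sketch.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Eigen-directions with (numerically) positive variance.
    keep = eigvals > rtol * max(eigvals.max(), 0.0)
    coords = eigvecs.T @ v
    # Any mass in a zero-variance direction makes the norm infinite.
    if np.linalg.norm(coords[~keep]) > 1e-8 * max(1.0, np.linalg.norm(v)):
        return np.inf
    return float(np.sqrt(np.sum(coords[keep] ** 2 / eigvals[keep])))


def empirical_cov(corpus):
    """Biased empirical covariance of the rows of corpus (shape n x d)."""
    return np.atleast_2d(np.cov(corpus, rowvar=False, bias=True))


def mahalanobis(x, corpus):
    """Classical Mahalanobis distance: variance norm of x minus the corpus mean."""
    return variance_norm(x - corpus.mean(axis=0), empirical_cov(corpus))


def nn_mahalanobis(x, corpus):
    """Assumed nearest-neighbour variant: min over corpus points of ||x - x_i||."""
    cov = empirical_cov(corpus)
    return min(variance_norm(x - xi, cov) for xi in corpus)


if __name__ == "__main__":
    corpus = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
    x = np.array([4.0, 1.0])
    print("Mahalanobis:", mahalanobis(x, corpus))
    print("Nearest-neighbour Mahalanobis:", nn_mahalanobis(x, corpus))
```

In the kernelized setting described in the abstract, the same quantities would be computed from Gram matrices in a reproducing kernel Hilbert space rather than from raw coordinates. In practice some regularization of the small eigenvalues is a common alternative to returning an infinite distance, though that choice is also an assumption here.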