AITopics | variance norm

Variance Norms for Kernelized Anomaly Detection

Cass, Thomas, Gonon, Lukas, Zozoulenko, Nikita

arXiv.org Machine Learning, Jul-16-2024

We present a unified theory for Mahalanobis-type anomaly detection on Banach spaces, using ideas from Cameron-Martin theory applied to non-Gaussian measures. This approach leads to a basis-free, data-driven notion of anomaly distance through the so-called variance norm of a probability measure, which can be consistently estimated using empirical measures. Our framework generalizes the classical $\mathbb{R}^d$, functional $(L^2[0,1])^d$, and kernelized settings, including the general case of a non-injective covariance operator. We prove that the variance norm depends solely on the inner product in a given Hilbert space, and hence that the kernelized Mahalanobis distance can naturally be recovered by working on reproducing kernel Hilbert spaces. Using the variance norm, we introduce the notion of a kernelized nearest-neighbour Mahalanobis distance for semi-supervised anomaly detection. In an empirical study on 12 real-world datasets, we demonstrate that the kernelized nearest-neighbour Mahalanobis distance outperforms the traditional kernelized Mahalanobis distance for multivariate time series anomaly detection, using state-of-the-art time series kernels such as the signature, global alignment, and Volterra reservoir kernels. Moreover, we provide an initial theoretical justification of nearest-neighbour Mahalanobis distances by developing concentration inequalities in the finite-dimensional Gaussian case.
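
To make the two distances in this abstract concrete, here is a minimal finite-dimensional sketch in Python with NumPy. It is not the authors' implementation: the function names are hypothetical, the covariance is the plain empirical covariance of the corpus, and a pseudo-inverse stands in for the inverse as a simple way to tolerate a singular (non-injective) covariance. Note the assumption that a pseudo-inverse silently ignores any component of a point lying outside the range of the covariance, which the paper's variance norm may treat differently. The kernelized setting would replace the raw vectors with kernel features.

```python
import numpy as np


def _quadratic_form(diffs, cov_pinv):
    # Row-wise d^T P d for every row d of `diffs`.
    return np.einsum("ij,jk,ik->i", diffs, cov_pinv, diffs)


def mahalanobis_distance(x, corpus):
    """Classical Mahalanobis distance from x to the corpus mean.

    Assumes corpus has shape (n, d) with d >= 2.
    """
    mean = corpus.mean(axis=0)
    cov_pinv = np.linalg.pinv(np.cov(corpus, rowvar=False))
    sq = _quadratic_form((x - mean)[None, :], cov_pinv)[0]
    return float(np.sqrt(max(sq, 0.0)))


def nn_mahalanobis_distance(x, corpus):
    """Nearest-neighbour variant: distance from x to its closest corpus
    point, measured in the same covariance-induced norm."""
    cov_pinv = np.linalg.pinv(np.cov(corpus, rowvar=False))
    sq = _quadratic_form(x[None, :] - corpus, cov_pinv)
    return float(np.sqrt(max(sq.min(), 0.0)))


rng = np.random.default_rng(0)
corpus = rng.normal(size=(200, 3))      # synthetic "normal" observations
far_point = np.array([10.0, 10.0, 10.0])

print(mahalanobis_distance(far_point, corpus))
print(nn_mahalanobis_distance(far_point, corpus))
print(nn_mahalanobis_distance(corpus[0], corpus))
```

The classical distance compares a point with the corpus mean only, while the nearest-neighbour variant compares it with every corpus point, so a point that belongs to the corpus scores zero under the second distance regardless of how far it sits from the mean.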

kernel, mahalanobis distance, variance norm, (14 more...)

arXiv:2407.11873

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > New York (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(4 more...)

Genre: Research Report (0.81)

Industry:

Information Technology (0.67)
Education (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)


Anomaly detection on streamed data

Cochrane, Thomas, Foster, Peter, Lyons, Terry, Arribas, Imanol Perez

arXiv.org Machine Learning, Jun-5-2020

We introduce a powerful but simple methodology for identifying anomalous observations against a corpus of 'normal' observations. All data are observed through a vector-valued feature map. Our approach depends on the choice of corpus and of that feature map, but it is invariant to affine transformations of the map and has no other external dependencies, such as choices of metric; we call it conformance. Applying this method to signatures of time series and other types of streamed data, we provide an effective methodology of broad applicability for identifying anomalous complex multimodal sequential data. We demonstrate the applicability and effectiveness of our method by evaluating it against multiple data sets. Based on quantifying performance using the receiver operating characteristic (ROC) area under the curve (AUC), our method yields an AUC score of 98.9% for the PenDigits data set; in a subsequent experiment involving marine vessel traffic data, our approach yields an AUC score of 89.1%. Based on a comparison involving univariate time series from the UEA & UCR time series repository, with performance quantified using balanced accuracy and assuming an optimal operating point, our approach outperforms a state-of-the-art shapelet method for 19 out of 28 data sets.
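
The affine-invariance claim and the ROC AUC evaluation mentioned in this abstract can be checked numerically on toy data. The Python sketch below is illustrative only and rests on stated assumptions: `conformance` is a hypothetical name for a nearest-neighbour distance in the norm induced by the empirical corpus covariance, the features are synthetic Gaussian vectors rather than signatures, and the matrix `A` and shift `b` are arbitrary choices of an invertible affine map. It does not reproduce any figure reported in the paper.

```python
import numpy as np


def conformance(queries, corpus):
    """Distance from each query row to its nearest corpus row, measured in
    the norm induced by the empirical corpus covariance (assumed form)."""
    cov_pinv = np.linalg.pinv(np.cov(corpus, rowvar=False))
    diffs = queries[:, None, :] - corpus[None, :, :]          # (q, n, d)
    sq = np.einsum("qnd,de,qne->qn", diffs, cov_pinv, diffs)  # (q, n)
    return np.sqrt(np.clip(sq.min(axis=1), 0.0, None))


def roc_auc(scores_normal, scores_anomalous):
    """AUC as the probability that a random anomaly scores higher than a
    random normal observation, with ties counted as one half."""
    a = np.asarray(scores_anomalous)[:, None]
    n = np.asarray(scores_normal)[None, :]
    return float(np.mean((a > n) + 0.5 * (a == n)))


rng = np.random.default_rng(0)
corpus = rng.normal(size=(100, 2))               # reference "normal" corpus
normal = rng.normal(size=(50, 2))                # held-out normal points
anomalous = rng.normal(loc=6.0, size=(50, 2))    # shifted, hence anomalous

# An arbitrary invertible affine map applied to every feature vector.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
b = np.array([5.0, -1.0])

scores = conformance(anomalous, corpus)
scores_affine = conformance(anomalous @ A.T + b, corpus @ A.T + b)
invariant = bool(np.allclose(scores, scores_affine))

auc = roc_auc(conformance(normal, corpus), scores)
print(invariant, auc)
```

The invariance holds here because the map rescales the covariance and the differences between points in a way that cancels inside the quadratic form; this argument assumes the covariance has full rank.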

corpus, data mining, machine learning, (15 more...)

arXiv:2006.03487

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Virginia (0.04)

Genre: Research Report (0.50)

Industry:

Government (0.68)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.74)
