In our previous post, we explained what time series data is and described how the Anodot time series anomaly detection system spots anomalies in it. We also discussed the importance of choosing a model for a metric's normal behavior that includes any and all seasonal patterns in the metric, and the specific algorithm Anodot uses to find seasonal patterns. At the end of that post we said it's possible to get a sense of the bigger picture from a lot of individual anomalies. Conciseness is a requirement of any large-scale anomaly detection system, because monitoring millions of metrics is guaranteed to generate a flood of reported anomalies even if there are zero false positives. Achieving conciseness in this context is analogous to distilling many individual symptoms into a single diagnosis, in much the same way that a mechanic might diagnose a car problem by observing the pitch, volume, and duration of all the sounds it makes, in addition to watching all the dials and indicator lights on the dashboard.

But before going into the details of a multivariate time series, I'll give you a quick refresher on what a univariate time series is. Let's look at them one by one to understand the difference. A univariate time series, as the name suggests, is a series with a single time-dependent variable. For example, consider a sample dataset that consists of hourly temperature values for the past 2 years. Here, temperature is the dependent variable (it depends on time).
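As a minimal sketch of what such a univariate series looks like in code, the snippet below builds a list of hourly (timestamp, temperature) pairs. The values are synthetic (a made-up daily sine cycle around 15 °C), and the function name `make_hourly_temperatures` is hypothetical; it only illustrates the shape of the data, not any real dataset.

```python
import math
from datetime import datetime, timedelta


def make_hourly_temperatures(start, hours):
    """Return a list of (timestamp, temperature) pairs, one per hour.

    The temperatures are synthetic and purely illustrative.
    """
    series = []
    for h in range(hours):
        ts = start + timedelta(hours=h)
        # Hypothetical daily cycle: 15 degrees plus a 5-degree sine swing.
        temp = 15.0 + 5.0 * math.sin(2 * math.pi * ts.hour / 24.0)
        series.append((ts, round(temp, 2)))
    return series


# Two days of hourly readings: one variable (temperature) indexed by time.
series = make_hourly_temperatures(datetime(2020, 1, 1), 48)
for ts, temp in series[:3]:
    print(ts.isoformat(), temp)
```

Each row holds exactly one observed variable, which is what makes the series univariate; a multivariate series would carry several values per timestamp.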

For testing two vector random variables for independence, we propose using a univariate test to check whether the distance of one vector from an arbitrary center point is independent of the distance of the other vector from another arbitrary center point. We prove that, under minimal assumptions, a consistent univariate independence test on the distances is enough to guarantee that the power to detect dependence between the random vectors increases to one with sample size. If the univariate test is distribution-free, the multivariate test will also be distribution-free. If we consider multiple center points and aggregate the center-specific univariate tests, the power may be further improved, and the resulting multivariate test may be distribution-free for specific aggregation methods (if the univariate test is distribution-free). We show that certain multivariate tests recently proposed in the literature can be viewed as instances of this general approach.
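The reduction described above can be sketched as follows: map each vector to its distance from a chosen center point, then run a univariate independence test on the two resulting distance samples. This sketch makes several assumptions that are not from the abstract: the function names are hypothetical, the center points are fixed by hand, and the univariate test is a permutation test on the absolute Spearman rank correlation. That statistic is only a simple placeholder; it is not consistent against all alternatives, so the power guarantee stated above would require swapping in a consistent univariate test.

```python
import math
import random


def distances_from_center(points, center):
    """Euclidean distance of every point from a single center point."""
    return [math.dist(p, center) for p in points]


def ranks(values):
    """Zero-based ranks of the values (ties are not handled specially)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


def pearson(a, b):
    """Pearson correlation; applied to ranks it gives Spearman's rho."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def center_distance_pvalue(xs, ys, center_x, center_y, n_perm=500, seed=0):
    """Permutation p-value for independence of the two distance samples.

    Placeholder univariate statistic: |Spearman rho| of the distances.
    """
    rng = random.Random(seed)
    dx = ranks(distances_from_center(xs, center_x))
    dy = ranks(distances_from_center(ys, center_y))
    observed = abs(pearson(dx, dy))
    hits = 0
    for _ in range(n_perm):
        shuffled = dy[:]
        rng.shuffle(shuffled)
        if abs(pearson(dx, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic example: Y is X plus small noise, so the vectors are dependent.
rng = random.Random(1)
xs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(100)]
ys = [(a + rng.gauss(0, 0.05), b + rng.gauss(0, 0.05)) for a, b in xs]
p_dependent = center_distance_pvalue(xs, ys, (0.0, 0.0), (0.0, 0.0))
print(p_dependent)
```

Aggregating over several center points, as the text suggests, would mean repeating the call with different `center_x` and `center_y` values and combining the resulting statistics or p-values; the combination rule is left out here because the abstract does not specify one.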

Yildiz, Olcay Taner, Alpaydin, Ethem

Statistical tests that compare classification algorithms are univariate and use a single performance measure, e.g., misclassification error, $F$ measure, AUC, and so on. In multivariate tests, comparison is done using multiple measures simultaneously. For example, error is the sum of false positives and false negatives and a univariate test on error cannot make a distinction between these two sources, but a 2-variate test can. Similarly, instead of combining precision and recall in $F$ measure, we can have a 2-variate test on (precision, recall). We use Hotelling's multivariate $T^2$ test for comparing two algorithms, and when we have three or more algorithms we use the multivariate analysis of variance (MANOVA) followed by pairwise post hoc tests. In our experiments, we see that multivariate tests have higher power than univariate tests, that is, they can detect differences that univariate tests cannot. We also discuss how multivariate analysis allows us to automatically extract performance measures that best distinguish the behavior of multiple algorithms.
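To make the 2-variate comparison concrete, here is a minimal sketch of a Hotelling $T^2$ statistic computed on paired differences between two algorithms. The paired, per-fold design is an assumption (the abstract does not state it), the function name `hotelling_t2_paired_2d` is hypothetical, and the difference values are invented for illustration. Under the usual normality assumption, the scaled statistic $F = \frac{n-p}{p(n-1)} T^2$ is compared against an $F(p, n-p)$ distribution with $p = 2$.

```python
def hotelling_t2_paired_2d(diffs):
    """One-sample Hotelling T^2 on 2-variate paired differences.

    diffs: list of (d1, d2) pairs, e.g. per-fold differences in
    (false positives, false negatives) between two algorithms.
    Returns (t2, f_stat), where f_stat is referred to F(2, n - 2).
    """
    n = len(diffs)
    p = 2
    m1 = sum(d[0] for d in diffs) / n
    m2 = sum(d[1] for d in diffs) / n
    # Unbiased sample covariance matrix entries.
    s11 = sum((d[0] - m1) ** 2 for d in diffs) / (n - 1)
    s22 = sum((d[1] - m2) ** 2 for d in diffs) / (n - 1)
    s12 = sum((d[0] - m1) * (d[1] - m2) for d in diffs) / (n - 1)
    det = s11 * s22 - s12 ** 2
    # Quadratic form m^T S^{-1} m using the closed-form 2x2 inverse.
    quad = (s22 * m1 * m1 - 2 * s12 * m1 * m2 + s11 * m2 * m2) / det
    t2 = n * quad
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat


# Hypothetical per-fold differences (algorithm A minus algorithm B).
diffs = [(2.0, 1.0), (1.0, 2.0), (0.0, 1.0), (1.0, 0.0)]
t2, f_stat = hotelling_t2_paired_2d(diffs)
print(t2, f_stat)
```

A univariate test on total error would only see the sum of the two components, whereas this statistic uses the mean vector and the covariance between them, which is how a 2-variate test can separate the two sources of error.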

Tompkins, Anthony, Ramos, Fabio

Periodicity is often studied in time series modelling with autoregressive methods but is less popular in the kernel literature, particularly for higher-dimensional problems such as in textures, crystallography, and quantum mechanics. Large datasets often make modelling periodicity untenable for otherwise powerful non-parametric methods like Gaussian Processes (GPs), which typically incur an $\mathcal{O}(N^3)$ computational burden and, consequently, are unable to scale to larger datasets. To this end we introduce a method termed \emph{Index Set Fourier Series Features} to tractably exploit multivariate Fourier series and efficiently decompose periodic kernels on higher-dimensional data into a series of basis functions. We show that our approximation produces significantly lower predictive error than alternative approaches such as those based on random Fourier features and achieves better generalisation on regression problems with periodic data.
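The general idea of decomposing a periodic kernel into basis functions can be illustrated in one dimension. The sketch below is not the Index Set Fourier Series Features construction from the abstract; it is a generic truncated harmonic feature map with hypothetical, hand-picked non-negative weights $w_k$. It relies only on the identity $\cos(kx)\cos(kx') + \sin(kx)\sin(kx') = \cos(k(x - x'))$, so the inner product of the features equals the periodic kernel $\sum_k w_k \cos(k(x - x'))$.

```python
import math


def harmonic_features(x, weights):
    """Explicit feature map: sqrt(w_k) * [cos(kx), sin(kx)] for each k."""
    feats = []
    for k, w in enumerate(weights):
        scale = math.sqrt(w)
        feats.append(scale * math.cos(k * x))
        feats.append(scale * math.sin(k * x))
    return feats


def kernel_from_features(x, y, weights):
    """Kernel value as the inner product of the two feature vectors."""
    fx = harmonic_features(x, weights)
    fy = harmonic_features(y, weights)
    return sum(a * b for a, b in zip(fx, fy))


def kernel_closed_form(x, y, weights):
    """Same kernel written directly as a truncated cosine series."""
    return sum(w * math.cos(k * (x - y)) for k, w in enumerate(weights))


# Hypothetical decaying weights; any non-negative values give a valid kernel.
weights = [1.0, 0.5, 0.25, 0.125]
print(kernel_from_features(0.3, 1.7, weights))
print(kernel_closed_form(0.3, 1.7, weights))
```

With an explicit feature map of fixed size, a linear model on the features costs time that grows linearly in the number of data points, which is the kind of saving over the $\mathcal{O}(N^3)$ cost of exact GP inference that motivates feature-based approximations.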