Optimal Copula Transport for Clustering Multivariate Time Series
Gautier Marti, Frank Nielsen, Philippe Donnat
Hellebore Capital Management † Ecole Polytechnique

ABSTRACT

This paper presents a new methodology for clustering multivariate time series that leverages optimal transport between copulas. Copulas are used to encode both (i) the intra-dependence of a multivariate time series, and (ii) the inter-dependence between two time series. Optimal copula transport then allows us to define two distances between multivariate time series: (i) one for measuring intra-dependence dissimilarity, and (ii) another for measuring inter-dependence dissimilarity, based on a new multivariate dependence coefficient which is robust to noise, deterministic, and which can target specified dependencies.

Index Terms-- Clustering; Multivariate Time Series; Optimal Transport; Earth Mover's Distance; Empirical Copula; Dependence Coefficient

1. INTRODUCTION

Clustering is the task of grouping a set of objects in such a way that objects in the same group, also called a cluster, are more similar to each other than to those in different groups. This primitive of unsupervised machine learning is known to be hard to formalize and hard to solve.
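As a rough illustration of the optimal copula transport idea summarized in the abstract, the sketch below builds empirical copulas by rank-transforming each margin and compares two of them with the Earth Mover's Distance. It is not the authors' implementation. The function names `empirical_copula` and `copula_emd`, the synthetic data, and the choice to solve the transport problem as an assignment problem are all assumptions made for illustration. The assignment shortcut is exact only when both samples have the same number of points and uniform weights.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def empirical_copula(sample):
    """Map an (n, d) sample to pseudo-observations in (0, 1]^d.

    Each column is replaced by its ranks divided by n. Ties are broken
    arbitrarily by argsort, so continuous data is assumed here.
    """
    sample = np.asarray(sample, dtype=float)
    n = sample.shape[0]
    ranks = sample.argsort(axis=0).argsort(axis=0) + 1
    return ranks / n


def copula_emd(u, v):
    """Earth Mover's Distance between two equal-size point clouds.

    With n points of weight 1/n on each side, an optimal transport plan
    can be taken to be a permutation, so the problem reduces to a
    linear assignment on the pairwise Euclidean cost matrix.
    """
    if u.shape != v.shape:
        raise ValueError("this sketch assumes equal-size samples")
    cost = np.linalg.norm(u[:, None, :] - v[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()


# Hypothetical data: two positively dependent pairs and one negatively
# dependent pair, each with n observations.
rng = np.random.default_rng(0)
n = 200
x1, x2, x3 = rng.normal(size=(3, n))
a = np.column_stack([x1, x1 + 0.1 * rng.normal(size=n)])
b = np.column_stack([x2, x2 + 0.1 * rng.normal(size=n)])
c = np.column_stack([x3, -x3 + 0.1 * rng.normal(size=n)])

d_same = copula_emd(empirical_copula(a), empirical_copula(b))
d_diff = copula_emd(empirical_copula(a), empirical_copula(c))
print(f"similar dependence:  {d_same:.3f}")
print(f"opposite dependence: {d_diff:.3f}")
```

Because the rank transform discards the margins, applying a strictly increasing function to any column leaves the empirical copula unchanged, so the distance reflects the dependence structure only. For samples of unequal size or for binned copulas, a general optimal transport solver would be needed instead of the assignment shortcut.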
Jan-11-2016