Beyond Point Matching: Evaluating Multiscale Dubuc Distance for Time Series Similarity

Ahmadzadeh, Azim; Khazaei, Mahsa; Rohlfing, Elaina

arXiv.org Artificial Intelligence 

Abstract: Time series are high-dimensional and complex data objects, making their efficient search and indexing a longstanding challenge in data mining. Building on a recently introduced similarity measure, the Multiscale Dubuc Distance (MDD), this paper investigates its strengths and limitations relative to the widely used Dynamic Time Warping (DTW). MDD is novel in two key ways: it evaluates time series similarity across multiple temporal scales, and it avoids point-to-point alignment. We demonstrate that in many scenarios where MDD outperforms DTW, the gains are substantial, and we provide a detailed analysis of the specific performance gaps it addresses. To test our hypotheses, we use simulations in addition to the 95 datasets from the UCR archive. Finally, we apply both methods to a challenging real-world classification task and show that MDD yields a significant improvement over DTW, underscoring its practical utility.

Time series, or more generally, ordered high-dimensional data types, have become increasingly prevalent with the rise of powerful computational tools and machine learning techniques. In this study, we adopt the term time series as an umbrella label for all such sequential data. A central challenge in analyzing time series lies in defining and measuring similarity. Similarity is inherently subjective, shaped by the specific goals and nuances of a given application. The existing literature has produced a rich landscape of similarity measures, each tailored to specific assumptions and use cases.
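To make concrete what "point-to-point alignment" means for the DTW baseline, the following is a minimal sketch of the textbook dynamic-programming formulation of DTW. It is not the implementation used in this paper: the function name `dtw_distance`, the absolute-difference local cost, and the absence of any warping-window constraint are all assumptions made for illustration. MDD itself is not sketched here, since its definition is not given in this section.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (illustrative sketch).

    Assumptions: absolute difference as the local cost, no warping window,
    and the unnormalized cumulative cost as the returned distance.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Each point must be matched to at least one point of the other
            # series; the three predecessors correspond to the allowed steps.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again with the same b point
                cost[i][j - 1],      # b[j-1] matched again with the same a point
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]
```

Under these assumptions, a call such as `dtw_distance([0, 1, 1, 2], [0, 1, 2])` aligns the repeated value in the first series with a single point in the second, which is exactly the pointwise matching that MDD is described as avoiding.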