Jain, Brijnesh
Making the Dynamic Time Warping Distance Warping-Invariant
Jain, Brijnesh
Computing proximities is a core task for many important time-series mining tasks such as similarity search, nearest-neighbor classification, and clustering. The choice of proximity measure affects the performance of the corresponding proximity-based method. Consequently, a plethora of task-specific proximity measures have been devised [1, 3]. One research direction in time series mining is devoted to constructing data-specific distance functions that can capture some of the data's intrinsic structure. For example, the dynamic time warping (dtw) distance has been devised to account for local variations in speed [39, 31]. Further examples include distances and techniques that account for other variations such as variation in speed, amplitude, offset, endpoints, occlusion, complexity, and mixtures thereof [5]. In some - partly influential - work, the different distances are ascribed as invariant under the respective variation. For example, the dtw-distance is considered as warping-invariant (invariant under variation in speed) [5, 10, 11, 18, 27, 35], the CIDdistance [5] as complexity-invariant, and the ฯ-dtw distance [35] as endpoint-invariant.
Comparing Temporal Graphs Using Dynamic Time Warping
Froese, Vincent, Jain, Brijnesh, Niedermeier, Rolf, Renken, Malte
The connections within many real-world networks change over time. Thus, there has been a recent boom in studying temporal graphs. Recognizing patterns in temporal graphs requires a similarity measure to compare different temporal graphs. To this end, we initiate the study of dynamic time warping (an established concept for mining time series data) on temporal graphs. We propose the dynamic temporal graph warping distance (dtgw) to determine the (dis-)similarity of two temporal graphs. Our novel measure is flexible and can be applied in various application domains. We show that computing the dtgw-distance is a challenging (NP-hard) optimization problem and identify some polynomial-time solvable special cases. Moreover, we develop a quadratic programming formulation and an efficient heuristic. Preliminary experiments indicate that the heuristic performs very well and that our concept yields meaningful results on real-world instances.
Revisiting Inaccuracies of Time Series Averaging under Dynamic Time Warping
Jain, Brijnesh
This article revisits an analysis on inaccuracies of time series averaging under dynamic time warping conducted by \cite{Niennattrakul2007}. The authors presented a correctness-criterion and introduced drift-outs of averages from clusters. They claimed that averages are inaccurate if they are incorrect or drift-outs. Furthermore, they conjectured that such inaccuracies are caused by the lack of triangle inequality. We show that a rectified version of the correctness-criterion is unsatisfiable and that the concept of drift-out is geometrically and operationally inconclusive. Satisfying the triangle inequality is insufficient to achieve correctness and unnecessary to overcome the drift-out phenomenon. We place the concept of drift-out on a principled basis and show that sample means as global minimizers of a Fr\'echet function never drift out. The adjusted drift-out is a way to test to which extent an approximation is coherent. Empirical results show that solutions obtained by the state-of-the-art methods SSG and DBA are incoherent approximations of a sample mean in over a third of all trials.
Asymmetric Learning Vector Quantization for Efficient Nearest Neighbor Classification in Dynamic Time Warping Spaces
Jain, Brijnesh, Schultz, David
The nearest neighbor method together with the dynamic time warping (DTW) distance is one of the most popular approaches in time series classification. This method suffers from high storage and computation requirements for large training sets. As a solution to both drawbacks, this article extends learning vector quantization (LVQ) from Euclidean spaces to DTW spaces. The proposed LVQ scheme uses asymmetric weighted averaging as update rule. Empirical results exhibited superior performance of asymmetric generalized LVQ (GLVQ) over other state-of-the-art prototype generation methods for nearest neighbor classification.
Asymptotic Behavior of Mean Partitions in Consensus Clustering
Jain, Brijnesh
Although consistency is a minimum requirement of any estimator, little is known about consistency of the mean partition approach in consensus clustering. This contribution studies the asymptotic behavior of mean partitions. We show that under normal assumptions, the mean partition approach is consistent and asymptotic normal. To derive both results, we represent partitions as points of some geometric space, called orbit space. Then we draw on results from the theory of Fr\'echet means and stochastic programming. The asymptotic properties hold for continuous extensions of standard cluster criteria (indices). The results justify consensus clustering using finite but sufficiently large sample sizes. Furthermore, the orbit space framework provides a mathematical foundation for studying further statistical, geometrical, and analytical properties of sets of partitions.
A Necessary and Sufficient Condition for Graph Matching Being Equivalent to the Maximum Weight Clique Problem
Jain, Brijnesh, Obermayer, Klaus
This paper formulates a necessary and sufficient condition for a generic graph matching problem to be equivalent to the maximum vertex and edge weight clique problem in a derived association graph. The consequences of this results are threefold: first, the condition is general enough to cover a broad range of practical graph matching problems; second, a proof to establish equivalence between graph matching and clique search reduces to showing that a given graph matching problem satisfies the proposed condition; and third, the result sets the scene for generic continuous solutions for a broad range of graph matching problems. To illustrate the mathematical framework, we apply it to a number of graph matching problems, including the problem of determining the graph edit distance.