T-Rep: Representation Learning for Time Series using Time-Embeddings

Archibald Fraikin, Adrien Bennetot, Stéphanie Allassonnière

arXiv.org Artificial Intelligence 

Let it Care, PariSanté Campus, Paris, France
archibald.fraikin@inria.fr

Abstract

Multivariate time series present challenges to standard machine learning techniques, as they are often unlabeled, high-dimensional, noisy, and contain missing data. To address this, we propose T-Rep, a self-supervised method to learn time series representations at a timestep granularity. T-Rep learns vector embeddings of time alongside its feature extractor, to extract temporal features such as trend, periodicity, or distribution shifts from the signal. These time-embeddings are leveraged in pretext tasks to incorporate smooth and fine-grained temporal dependencies in the representations, and to reinforce robustness to missing data. We evaluate T-Rep on downstream classification, forecasting, and anomaly detection tasks. It is compared to existing self-supervised algorithms for time series, which it outperforms on all three tasks. We test T-Rep in missing-data regimes, where it proves more resilient than its counterparts. Finally, we provide latent space visualisation experiments, highlighting the interpretability of the learned representations.

Introduction

Multivariate time series have become ubiquitous in domains such as medicine, climate science, and finance. Unfortunately, they are high-dimensional and complex objects with little labeled data (Yang & Wu, 2006), as labeling is an expensive and time-consuming process. Leveraging unlabeled data to build unsupervised representations of multivariate time series has thus become a challenge of great interest, as these embeddings can significantly improve performance in tasks like forecasting, classification, or anomaly detection (Deldari et al., 2021; Su et al., 2019). This has motivated the development of self-supervised learning (SSL) models for time series, first focusing on constructing instance-level representations for classification and clustering (Tonekaboni et al., 2021; Franceschi et al., 2019; Wu et al., 2018).
More fine-grained representations were then developed to model time series at the timestep-level (Yue et al., 2022), which is key in domains such as healthcare or sensor systems. With fine-grained embeddings, one can capture subtle changes, periodic patterns, and irregularities that are essential for anomaly detection (Keogh et al., 2006) as well as understanding and forecasting disease progression.
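To make the core idea concrete, the sketch below illustrates what "learning a vector embedding of time alongside a feature extractor" can look like in practice. This is a minimal, hypothetical illustration, not the authors' implementation: the module sizes, the normalisation of timesteps, and the tiny MLP used as the time-embedding function are all assumptions, and the weights are random rather than trained jointly with an encoder as in T-Rep.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, d = 16, 8  # hypothetical hidden width and time-embedding dimension

# Parameters of the time-embedding module; in T-Rep these would be trained
# jointly with the feature extractor. Here they are random for illustration.
W1, b1 = rng.normal(size=(1, hidden)), np.zeros(hidden)
W2, b2 = rng.normal(size=(hidden, d)), np.zeros(d)

def time_embedding(t):
    """Map timesteps t (shape [T]) to time-embeddings (shape [T, d])."""
    h = np.tanh(t[:, None] @ W1 + b1)  # small MLP applied to the scalar timestep
    return h @ W2 + b2

T, C = 100, 3                        # series length, number of channels
x = rng.normal(size=(T, C))          # a toy multivariate time series
tau = time_embedding(np.arange(T) / T)  # normalised timesteps -> embeddings

# A timestep-level representation can then condition on both the signal and
# the time-embedding, e.g. by concatenation before feeding an encoder.
z_input = np.concatenate([x, tau], axis=1)
print(z_input.shape)  # (100, 11)
```

Because each timestep carries its own embedding, the downstream representation is defined at timestep granularity, which is what enables the fine-grained uses (anomaly detection, disease-progression forecasting) mentioned above.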