Positional Encoding in Transformer-Based Time Series Models: A Survey
Irani, Habib; Metsis, Vangelis
arXiv.org Artificial Intelligence
Recent advancements in transformer-based models have greatly improved time series analysis, providing robust solutions for tasks such as forecasting, anomaly detection, and classification. A crucial element of these models is positional encoding, which allows transformers to capture the intrinsic sequential nature of time series data. This survey systematically examines existing techniques for positional encoding in transformer-based time series models. We investigate a variety of methods, including fixed, learnable, relative, and hybrid approaches, and evaluate their effectiveness in different time series classification tasks. Furthermore, we outline key challenges and suggest potential research directions to enhance positional encoding strategies. By delivering a comprehensive overview and quantitative benchmarking, this survey intends to assist researchers and practitioners in selecting and designing effective positional encoding methods for transformer-based time series models. The source code for the methods and experiments discussed in this survey is available on

As a result, machine learning-based approaches, particularly Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs), have gained popularity due to their ability to model complex temporal dynamics [4, 24]. RNNs, including their more advanced variants like Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), excel at modeling sequential data by maintaining a hidden state that captures information from previous time steps. These architectures offer several advantages for time series analysis: they naturally handle irregular time intervals and missing data points through their sequential processing, excel at capturing local temporal patterns through their recurrent connections, and exhibit a beneficial "recency bias" where recent time steps are weighted more heavily than distant ones, a characteristic particularly valuable in applications like financial forecasting and weather prediction. However, RNNs suffer from inherent limitations such as vanishing and exploding gradients, making it difficult to learn dependencies over long time horizons [22].
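To make the "fixed" family of positional encodings named in the abstract concrete, the following is a minimal sketch. It assumes the standard sinusoidal formulation from the original Transformer architecture, which may differ from the variants benchmarked in the survey. The function names `sinusoidal_positional_encoding` and `add_positions` are illustrative and are not taken from the survey's source code.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model, base=10000.0):
    """Return a seq_len x d_model table of fixed sinusoidal encodings.

    Assumption: the standard formulation, where even feature indices
    use sine and odd feature indices use cosine of the same angle.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even in this sketch")
    table = []
    for pos in range(seq_len):
        row = []
        # i walks over the even feature indices 0, 2, 4, ...
        for i in range(0, d_model, 2):
            angle = pos / (base ** (i / d_model))
            row.append(math.sin(angle))  # feature index i
            row.append(math.cos(angle))  # feature index i + 1
        table.append(row)
    return table


def add_positions(embedded, encoding):
    """Add the encoding element-wise to already-embedded time steps."""
    return [
        [value + pe for value, pe in zip(step, pe_row)]
        for step, pe_row in zip(embedded, encoding)
    ]


# Hypothetical usage: 4 time steps, each already projected to 6 features.
series = [[0.0] * 6 for _ in range(4)]
encoded = add_positions(series, sinusoidal_positional_encoding(4, 6))
```

Because the table depends only on the position index and the feature dimension, it needs no training and can be computed for any sequence length. A learnable encoding would instead replace the table with trainable parameters.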
Feb-17-2025