Deep learning performs remarkably well on many time series analysis tasks recently. The superior performance of deep neural networks relies heavily on a large number of training data to avoid overfitting. However, the labeled data of many real-world time series applications may be limited such as classification in medical time series and anomaly detection in AIOps. As an effective way to enhance the size and quality of the training data, data augmentation is crucial to the successful application of deep learning models on time series data. In this paper, we systematically review different data augmentation methods for time series. We propose a taxonomy for the reviewed methods, and then provide a structured review for these methods by highlighting their strengths and limitations. We also empirically compare different data augmentation methods for different tasks including time series anomaly detection, classification and forecasting. Finally, we discuss and highlight future research directions, including data augmentation in time-frequency domain, augmentation combination, and data augmentation and weighting for imbalanced class.
Decomposing complex time series into trend, seasonality, and remainder components is an important task to facilitate time series anomaly detection and forecasting. Although numerous methods have been proposed, there are still many time series characteristics exhibiting in real-world data which are not addressed properly, including 1) ability to handle seasonality fluctuation and shift, and abrupt change in trend and reminder; 2) robustness on data with anomalies; 3) applicability on time series with long seasonality period. In the paper, we propose a novel and generic time series decomposition algorithm to address these challenges. Specifically, we extract the trend component robustly by solving a regression problem using the least absolute deviations loss with sparse regularization. Based on the extracted trend, we apply the the non-local seasonal filtering to extract the seasonality component. This process is repeated until accurate decomposition is obtained. Experiments on different synthetic and real-world time series datasets demonstrate that our method outperforms existing solutions.
Anomaly detection methods abound and are used extensively in streaming settings in a wide variety of domains. But a strength can also be a weakness; given the vast number of methods, how can one select the best method for their application? Unfortunately, there is no one best way for all domains. Existing literature is focused on creating new anomaly detection methods or creating large frameworks for experimenting with multiple methods at the same time. As the literature continues to grow, extensive evaluation of every available anomaly detection method is not feasible. To reduce this evaluation burden, in this paper we present a framework to intelligently choose the optimal anomaly detection methods based on the characteristics the time series displays. We provide a comprehensive experimental validation of multiple anomaly detection methods over different time series characteristics to form guidelines. Applying our framework can save time and effort by surfacing the most promising anomaly detection methods instead of experimenting extensively with a rapidly expanding library of anomaly detection methods.
At eBay, there are thousands of product health metrics for different domain teams to monitor. We built a two-phase alerting system to notify users with actionable alerts based on anomaly detection and alert retrieval. In the first phase, we developed an efficient anomaly detection algorithm, called Moving Metric Detector (MMD), to identify potential alerts among metrics with distribution agnostic criteria. In the second alert retrieval phase, we built additional logic with feedbacks to select valid actionable alerts with point-wise ranking model and business rules. Compared with other trend and seasonality decomposition methods, our decomposer is faster and better to detect anomalies in unsupervised cases. Our two-phase approach dramatically improves alert precision and avoids alert spamming in eBay production.
Since with massive data growth, the need for autonomous and generic anomaly detection system is increased. However, developing one stand-alone generic anomaly detection system that is accurate and fast is still a challenge. In this paper, we propose conventional time-series analysis approaches, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and Seasonal Trend decomposition using Loess (STL), to detect complex and various anomalies. Usually, SARIMA and STL are used only for stationary and periodic time-series, but by combining, we show they can detect anomalies with high accuracy for data that is even noisy and non-periodic. We compared the algorithm to Long Short Term Memory (LSTM), a deep-learning-based algorithm used for anomaly detection system. We used a total of seven real-world datasets and four artificial datasets with different time-series properties to verify the performance of the proposed algorithm.