
RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series

Decomposing complex time series into trend, seasonality, and remainder components is an important task that facilitates time series anomaly detection and forecasting. Although numerous methods have been proposed, many characteristics of real-world time series are still not addressed properly, including 1) the ability to handle seasonality fluctuation and shift, and abrupt change in trend and remainder; 2) robustness on data with anomalies; 3) applicability to time series with long seasonality periods. In this paper, we propose a novel and generic time series decomposition algorithm to address these challenges. Specifically, we extract the trend component robustly by solving a regression problem using the least absolute deviations loss with sparse regularization. Based on the extracted trend, we apply non-local seasonal filtering to extract the seasonality component. This process is repeated until an accurate decomposition is obtained. Experiments on different synthetic and real-world time series datasets demonstrate that our method outperforms existing solutions.
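
As a rough illustration of the trend-extraction idea in this abstract, the sketch below fits a trend by minimizing a least absolute deviations loss plus L1 penalties, written as a linear program. It is not the RobustSTL algorithm itself: the abstract does not specify the regularizer, so the penalties on first and second differences of the trend, the function name lad_trend, and the weights lam1 and lam2 are all assumptions made for this example. It relies on NumPy and SciPy's linprog.

```python
import numpy as np
from scipy.optimize import linprog


def lad_trend(y, lam1=1.0, lam2=5.0):
    """Estimate a trend tau by solving (hypothetical formulation):

        min  sum|y - tau| + lam1 * sum|D1 tau| + lam2 * sum|D2 tau|

    where D1 and D2 are first- and second-difference operators.
    Each absolute value is replaced by an auxiliary variable and two
    inequalities, which turns the problem into a linear program.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    m1, m2 = n - 1, n - 2
    I = np.eye(n)
    D1 = np.diff(I, n=1, axis=0)  # (n-1, n): tau[i+1] - tau[i]
    D2 = np.diff(I, n=2, axis=0)  # (n-2, n): second differences
    Z = np.zeros

    # Variable vector: [tau (n), u (n), v (m1), w (m2)]
    # u bounds |y - tau|, v bounds |D1 tau|, w bounds |D2 tau|.
    cost = np.concatenate(
        [np.zeros(n), np.ones(n), lam1 * np.ones(m1), lam2 * np.ones(m2)]
    )
    A_ub = np.block([
        [-I, -I, Z((n, m1)), Z((n, m2))],             # y - tau <= u
        [I, -I, Z((n, m1)), Z((n, m2))],              # tau - y <= u
        [D1, Z((m1, n)), -np.eye(m1), Z((m1, m2))],   # D1 tau <= v
        [-D1, Z((m1, n)), -np.eye(m1), Z((m1, m2))],  # -D1 tau <= v
        [D2, Z((m2, n)), Z((m2, m1)), -np.eye(m2)],   # D2 tau <= w
        [-D2, Z((m2, n)), Z((m2, m1)), -np.eye(m2)],  # -D2 tau <= w
    ])
    b_ub = np.concatenate([-y, y, np.zeros(2 * m1 + 2 * m2)])
    bounds = [(None, None)] * n + [(0, None)] * (n + m1 + m2)

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n]


# Synthetic data: a linear trend with a single large anomaly.
t = np.arange(40)
true_trend = 0.1 * t
y = true_trend.copy()
y[10] += 20.0  # injected outlier

trend = lad_trend(y)
```

Because the absolute-value loss grows only linearly with the residual, the fitted trend should stay near the underlying line at the anomalous point rather than being pulled toward it, which is the robustness property the abstract refers to. A squared-error loss would typically be affected more strongly.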

Forecasting with Multiple Seasonality

A growing number of modern applications involve forecasting time series data that exhibit both short-time dynamics and long-time seasonality. Specifically, forecasting time series with multiple seasonality is a difficult task that has received comparatively little discussion. In this paper, we propose a two-stage method for time series with multiple seasonality, which does not require pre-determined seasonality periods. In the first stage, we generalize the classical seasonal autoregressive moving average (ARMA) model to the multiple seasonality regime. In the second stage, we utilize an appropriate criterion for lag order selection. Simulation and empirical studies show the excellent predictive performance of our method, especially compared to the recently popular 'Facebook Prophet' model for time series.

Don't let missing values ruin your analysis output, Deal with them!


Missing values, or the values used to replace them, can lead to huge errors in your analysis output, whether it is a machine learning model, KPIs, or a report. Often analysts deal with missing values as if there were only one type of them. That is not the case: there are three types of missing values, and there are ways of dealing with each one of them. Missing at random (MAR): The presence of a null value in a variable is not random but rather dependent on a known or unknown characteristic of the record.

Feature-based time series analysis


I used this example in my talk at useR!2019 in Toulouse, and it is also the basis of a vignette in the package, and a recent blog post by Mitchell O'Hara-Wild. The data set contains domestic tourist visitor nights in Australia, disaggregated by State, Region and Purpose. An example of a feature would be the autocorrelation function at lag 1 -- it is a numerical summary capturing some aspect of the time series. Autocorrelations at other lags are also features, as are the autocorrelations of the first differenced series, or the seasonally differenced series, etc. Values close to 1 indicate a highly seasonal time series, while values close to 0 indicate a time series with little seasonality.
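
To make the idea of autocorrelation-based features concrete, here is a minimal sketch in plain Python. It is not the package discussed in the post: the function names, the feature names in the returned dictionary, and the synthetic monthly series are all made up for illustration. It computes the lag-1 autocorrelation of a series, of its first differences, and of its seasonal differences.

```python
import math
import random


def acf_at_lag(x, lag):
    """Sample autocorrelation of x at the given lag (lag >= 1)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)
    if denom == 0.0:
        return float("nan")  # constant series: autocorrelation undefined
    num = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return num / denom


def difference(x, lag=1):
    """Return x[t] - x[t - lag] for every t where both values exist."""
    return [x[i] - x[i - lag] for i in range(lag, len(x))]


def acf_features(x, season):
    """Collect a few autocorrelation-based features (names are illustrative)."""
    return {
        "acf1": acf_at_lag(x, 1),
        "diff1_acf1": acf_at_lag(difference(x, 1), 1),
        "season_diff_acf1": acf_at_lag(difference(x, season), 1),
    }


# Synthetic monthly series: linear trend + yearly cycle + noise (assumed data).
rng = random.Random(0)
series = [
    0.05 * t + math.sin(2 * math.pi * t / 12) + rng.gauss(0, 0.3)
    for t in range(120)
]

features = acf_features(series, season=12)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Each feature reduces a whole series to a single number, so a collection of many series can be compared by computing the same features for every one of them and then plotting or modeling the resulting table.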

Time series modeling with Facebook Prophet


When trying to understand time series, there's so much to think about. Is it affected by seasonality? What kind of model should I use, and how well will it perform? All these questions can make time series modeling kind of intimidating, but it doesn't have to be that bad. While working on a project for my data science bootcamp recently, I tried Facebook Prophet, an open-source package for time series modeling developed by … y'know, Facebook.