I'm beyond excited to introduce modeltime, a new time series forecasting package designed to speed up model evaluation, selection, and forecasting. Follow the updated modeltime article to get started with modeltime. If you like what you see, I have an Advanced Time Series Course where you will become the time-series expert for your organization by learning modeltime and timetk. This article is part of a series of software announcements on the Modeltime Forecasting Ecosystem. Register to stay in the know on new cutting-edge R software like modeltime.
Go to R-bloggers for R news and tutorials contributed by hundreds of R bloggers. This is the third of a series of 6 articles about time series forecasting with panel data and ensemble stacking with R. Through these articles I will be putting into practice what I have learned from the Business Science University training course 2 DS4B 203-R: High-Performance Time Series Forecasting", delivered by Matt Dancho. If you are looking to gain a high level of expertise in time series with R I strongly recommend this course. The objective of this article is to show how do we fit machine learning models for time series with modeltime. Modeltime is used to integrate time series models ino the tydimodels ecosystem. You will understand the notion of forecasting workflows e.g., how to fit a model by adding its specification and corresponding preprocessing recipe (see Part 2) to a workflow. The notion of modeltime table and calibration table will also be very useful since it allows to evaluate and forecast all models at the same time for all time series (panel data). Finally, you will perform and plot forecasts on test dataset. Hyperparameter tuning will be covered in the next article (Part 4). Let us load our work from Part 2. As per workflows, "A workflow is an object that can bundle together your pre-processing, modeling, and post-processing requests.
A machine learning algorithm is developed to forecast the CO2 emission intensities in electrical power grids in the Danish bidding zone DK2, distinguishing between average and marginal emissions. The analysis was done on data set comprised of a large number (473) of explanatory variables such as power production, demand, import, weather conditions etc. collected from selected neighboring zones. The number was reduced to less than 50 using both LASSO (a penalized linear regression analysis) and a forward feature selection algorithm. Three linear regression models that capture different aspects of the data (non-linearities and coupling of variables etc.) were created and combined into a final model using Softmax weighted average. Cross-validation is performed for debiasing and autoregressive moving average model (ARIMA) implemented to correct the residuals, making the final model the variant with exogenous inputs (ARIMAX). The forecasts with the corresponding uncertainties are given for two time horizons, below and above six hours. Marginal emissions came up independent of any conditions in the DK2 zone, suggesting that the marginal generators are located in the neighbouring zones. The developed methodology can be applied to any bidding zone in the European electricity network without requiring detailed knowledge about the zone.
I'm super excited to introduce the new panel data forecasting functionality in modeltime. Just say NO to for-loops for forecasting. Fitting many time series can be an expensive process. The most widely-accepted technique is to iteratively run an ARIMA model on each time series in a for-loop. Organizations now need 1000's of forecasts.
Multiple seasonal patterns play a key role in time series forecasting, especially for business time series where seasonal effects are often dramatic. Previous approaches including Fourier decomposition, exponential smoothing, and seasonal autoregressive integrated moving average (SARIMA) models do not reflect the distinct characteristics of each period in seasonal patterns, such as the unique behavior of specific days of the week in business data. We propose a multi-dimensional hierarchical model. Intermediate parameters for each seasonal period are first estimated, and a mixture of intermediate parameters is then taken, resulting in a model that successfully reflects the interactions between multiple seasonal patterns. Although this process reduces the data available for each parameter, a robust estimation can be obtained through a hierarchical Bayesian model implemented in Stan. Through this model, it becomes possible to consider both the characteristics of each seasonal period and the interactions among characteristics from multiple seasonal periods. Our new model achieved considerable improvements in prediction accuracy compared to previous models, including Fourier decomposition, which Prophet uses to model seasonality patterns. A comparison was performed on a real-world dataset of pallet transport from a national-scale logistic network.