SQL to SARIMAX: How I navigate the first time-series analysis personal project for my portfolio
The diagnostics plot for this particular model shows a reasonably good fit. When used for prediction, it followed the real trend closely. Since our focus is on the estimated coefficient of the bool_promotion variable, I considered this model good enough for our analysis. As the model summary shows, bool_promotion is statistically significant: promotions affect sales of grocery I at store 1, and in this case the effect is positive. Having a promotion added more than 500 units to sales for this combination.

Having worked out the pipeline through these steps, I automated the process for the other store-city-product combinations with auto_arima(), which identifies the best-fitting set of orders and lets us record those orders along with the coefficients. First, I created a helper function to set the necessary parameters and run auto_arima(). One parameter that seemed tricky to me was m, the period for seasonal differencing.
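To make the role of m more concrete, here is a minimal sketch of what such a helper might look like. It is not the original helper from this project: the names `SEASONAL_PERIODS`, `build_auto_arima_kwargs`, and `fit_one_combination` are hypothetical, and the sketch assumes daily sales with a weekly cycle (m = 7), pmdarima 1.8 or later (where the exogenous argument is named `X`), and a promotion indicator passed as a 2-D array.

```python
# Hypothetical lookup: number of observations per seasonal cycle (m)
# for common sampling frequencies. Daily data with a weekly pattern -> 7.
SEASONAL_PERIODS = {"daily": 7, "weekly": 52, "monthly": 12, "quarterly": 4, "yearly": 1}


def build_auto_arima_kwargs(frequency):
    """Return keyword arguments for auto_arima based on the data frequency."""
    m = SEASONAL_PERIODS[frequency]
    return {
        "m": m,                      # period used for seasonal differencing
        "seasonal": m > 1,           # m = 1 means there is no seasonal component
        "stepwise": True,            # stepwise search instead of a full grid
        "suppress_warnings": True,
        "error_action": "ignore",    # skip candidate orders that fail to fit
    }


def fit_one_combination(sales, promo, frequency="daily"):
    """Fit one store-product series and record its orders and coefficients.

    sales: 1-D array-like of unit sales.
    promo: 2-D array-like of shape (n_samples, 1), e.g. the bool_promotion flag.
    """
    import pmdarima as pm  # imported here so the helper above works without it

    model = pm.auto_arima(sales, X=promo, **build_auto_arima_kwargs(frequency))
    return {
        "order": model.order,                    # (p, d, q)
        "seasonal_order": model.seasonal_order,  # (P, D, Q, m)
        "params": model.params(),                # fitted coefficients
    }
```

Calling `fit_one_combination` in a loop over the store-city-product combinations and collecting the returned dictionaries would give one row of orders and coefficients per combination. If your series is not daily, m should be changed to match its actual seasonal cycle.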
Apr-8-2022, 21:39:23 GMT