Prepared for the Unknown: Adapting AIOps Capacity Forecasting Models to Data Changes
Poenaru-Olaru, Lorena, Hof, Wouter van 't, Stando, Adrian, Trawinski, Arkadiusz P., Kapel, Eileen, Rellermeyer, Jan S., Cruz, Luis, van Deursen, Arie
arXiv.org Artificial Intelligence
Abstract: Capacity management is critical for software organizations to allocate resources effectively and meet operational demands. Predicting future resource needs, an important step in capacity management, often relies on data-driven analytics and machine learning (ML) forecasting models, which require frequent retraining to stay relevant as data evolves. Continuously retraining the forecasting models can be expensive and difficult to scale, posing a challenge for engineering teams tasked with balancing accuracy and efficiency. Retraining only when the data changes appears to be a more computationally efficient alternative, but its impact on accuracy requires further investigation. In this work, we investigate the effects of retraining time series capacity forecasting models when changes are detected in the data, compared to retraining them periodically. Our results show that drift-based retraining achieves forecasting accuracy comparable to periodic retraining in most cases, making it a cost-effective strategy. However, when data is changing rapidly, periodic retraining is still preferred to maximize forecasting accuracy. These findings offer actionable insights for software teams to enhance forecasting systems, reducing retraining overhead while maintaining robust performance.
The term capacity management refers to ensuring that an IT service has sufficient infrastructure and resources to meet current or future demand. Although capacity management is crucial to ensure efficient and effective service delivery, this process used to be carried out manually by continuously collecting and analyzing data [32]. Manual techniques for predicting capacity requirements become difficult to scale as the number of capacity management data sources increases, and they are time-consuming for the engineers in charge.
To automate capacity management for machine resources such as CPU and memory, companies have started employing AIOps forecasting models, which predict the resource demand in a timely fashion. This is particularly relevant for our industry partner, ING (International Netherlands Group) Bank, where operational engineers must monitor numerous time series to ensure that sufficient resources are allocated for its large-scale online operations, which are supported by thousands of machines with varying resource demands.
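The contrast between periodic and drift-based retraining can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the mean-value "model", the standard-deviation drift rule, the function names (`fit`, `drift_detected`, `run`), the window size, and the synthetic utilization series are all assumptions chosen to keep the example self-contained.

```python
from statistics import mean, pstdev


def fit(window):
    """Hypothetical 'model': forecast the mean of the training window."""
    return mean(window)


def drift_detected(reference, recent, threshold=3.0):
    """Illustrative drift rule (an assumption, not the paper's detector):
    flag drift when the recent mean differs from the reference mean by
    more than `threshold` reference standard deviations."""
    spread = pstdev(reference) or 1e-9  # avoid division by zero
    return abs(mean(recent) - mean(reference)) / spread > threshold


def run(series, window=10, strategy="drift"):
    """Walk the series batch by batch; return (retrain count, mean abs error)."""
    reference = series[:window]
    model = fit(reference)
    retrains, errors = 0, []
    for start in range(window, len(series) - window + 1, window):
        batch = series[start:start + window]
        # Score the current model on the new batch before any retraining.
        errors.extend(abs(x - model) for x in batch)
        if strategy == "periodic":
            retrain = True  # retrain after every batch
        else:
            retrain = drift_detected(reference, batch)
        if retrain:
            reference, model = batch, fit(batch)
            retrains += 1
    return retrains, mean(errors)


# Synthetic CPU-utilization-like series with one abrupt level shift.
series = [50 + (i % 3) for i in range(30)] + [80 + (i % 3) for i in range(30)]

drift_retrains, drift_mae = run(series, strategy="drift")
periodic_retrains, periodic_mae = run(series, strategy="periodic")
print(f"drift-based: {drift_retrains} retrains, MAE {drift_mae:.2f}")
print(f"periodic:    {periodic_retrains} retrains, MAE {periodic_mae:.2f}")
```

On this toy series with a single abrupt shift, the drift-based strategy retrains far less often than the periodic one. A series that changes in every batch would narrow that gap, which is consistent with the abstract's caveat about rapidly changing data.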
Oct-14-2025
- Country:
- Europe
- Germany > Lower Saxony
- Hanover (0.04)
- Netherlands
- North Holland > Amsterdam (0.04)
- South Holland > Delft (0.04)
- Poland > Masovia Province
- Warsaw (0.04)
- Romania > Sud - Muntenia Development Region
- Giurgiu County > Giurgiu (0.04)
- North America > United States (0.14)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Information Technology > Services (0.88)
- Technology:
- Information Technology
- Artificial Intelligence > Machine Learning (1.00)
- Data Science > Data Mining (1.00)
- Modeling & Simulation (1.00)