disk usage
How Artificial Intelligence Protects Porsche's IT Landscape
Porsche, iTUBS and comNET are collaborating on an ongoing research project to develop the first AI-supported IT monitoring tool that can automatically detect complex error cases. The tool will enable Porsche to monitor and respond effectively to alerts across all its IT systems and services. Robust, powerful and healthy IT systems play an increasingly important role in today's digital age. When IT systems do not function as they should, the consequences for a company can be devastating and can seriously disrupt business operations. IT monitoring, that is, collecting measurement data and observing the IT environment, is an effective way to improve the health and resilience of IT systems.
Optimizing Alerts on Free space on disks using Machine Learning - OpsClarity
Running out of available disk space (diskfree) can have a significant and often catastrophic impact on the applications and services running on a system. For this reason, every DevOps engineer knows that it is crucial to carefully monitor disk usage on all critical systems, especially those that tend to consume disk space rapidly, such as heavily used Hadoop stores, applications with extensive logging, and Kafka clusters with long retention periods. The most common monitors for diskfree metrics rely on a static threshold set by a DevOps engineer with intricate knowledge of the system and the applications running on it. For example, an engineer may set a static threshold of 5%, so that the monitor triggers an alert whenever diskfree falls below 5%. In our experience, this approach is inefficient for several reasons.
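For reference, the static-threshold rule described above can be sketched in a few lines of Python. This is a minimal illustration, not OpsClarity's implementation: the function names `diskfree_percent`, `should_alert` and `check_path` are hypothetical, the 5% default simply mirrors the example in the text, and the free-space reading is assumed to come from the standard library's `shutil.disk_usage`.

```python
import shutil


def diskfree_percent(total_bytes: int, free_bytes: int) -> float:
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def should_alert(free_pct: float, threshold_pct: float = 5.0) -> bool:
    """Static-threshold rule: alert only when diskfree is below the threshold."""
    return free_pct < threshold_pct


def check_path(path: str = "/", threshold_pct: float = 5.0) -> bool:
    """Read current usage for `path` and apply the static threshold (hypothetical helper)."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (in bytes)
    return should_alert(diskfree_percent(usage.total, usage.free), threshold_pct)
```

Note that the rule only looks at the current value of diskfree. It does not take into account how quickly the disk is filling up, so a disk at 6% free triggers nothing regardless of whether it is stable or losing space rapidly.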