It's hard to visit a tech website these days without reading an article about how artificial intelligence (AI) is poised to uproot entire industries and workflows. As it turns out, IT systems operations is among the areas that AI is likely to change, or already has changed. That shift created a new IT category called algorithmic IT operations, or AIOps for short. At its most basic, AIOps means using artificial intelligence and machine learning to support all primary IT operations functions. The goal is to turn the data generated by IT systems platforms into meaningful insights.
IT organizations that identify and understand patterns in vast, diverse sets of data are best equipped to find, fix, and prevent performance problems. But when digital transformation outpaces IT performance management, hybrid and multi-cloud infrastructure creates complexity--and that comes with a hefty price tag. That's where artificial intelligence in IT operations (AIOps) comes into the picture. Gartner describes AIOps platforms as software systems that combine big data and artificial intelligence (AI) or machine-learning functionality "to enhance and partially replace all primary IT operations functions, including availability and performance monitoring, event correlation and analysis, and IT service management, and automation." AIOps is the application of machine learning and data science to IT operations to make them more efficient.
Organizations around the world are turning to technology to revolutionize their businesses. This requires a rethinking of the entire IT stack and operations strategy. In its 2019 report, Gartner lists several revolutionary technologies and delves into how those tools and solutions will challenge IT organizations from an operations and management standpoint. The underlying application and service delivery stack has gone through massive transformation with virtualization, the cloud, microservices, containers and more, yet the primary IT operations tools organizations use today are relatively old. Critical IT operations activities, such as migration, upgrades, patch management, capacity planning, and service and change management, are becoming extremely challenging due to the volume and variety of data that exist across the IT stack.
In today's hybrid blend of on-premises and virtual servers, public and private cloud platforms, and mobile devices of all types, it is more critical than ever to identify, resolve, and mitigate potential outages or performance issues. These issues are also more difficult to detect and identify, much less remediate. Both the volume and variety of network and application traffic data that enterprises are generating have skyrocketed. To effectively monitor and manage IT environments operating at this level of complexity, organizations are turning to artificial intelligence for IT operations, or what's called AIOps. The term AIOps describes using artificial intelligence (AI), machine learning (ML), and other advanced data analytics technologies to automate the processes of identifying and resolving IT performance issues.
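To make "automating the identification of performance issues" a little more concrete, here is a minimal sketch of one of the simplest techniques a monitoring pipeline might use: flagging a metric sample whose z-score against a short trailing window is unusually high. The function name `find_anomalies`, the window size, the threshold, and the latency values are all illustrative assumptions rather than anything prescribed by a particular AIOps product, and real platforms generally rely on far more sophisticated models.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate sharply from the trailing window.

    A sample is flagged when its distance from the mean of the previous
    `window` samples exceeds `threshold` standard deviations.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline gives no scale to score against; skip it.
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical response-time samples (milliseconds) with one obvious spike.
latency_ms = [102, 99, 101, 100, 98, 103, 100, 450, 101, 99]
spikes = find_anomalies(latency_ms)
print(spikes)  # the spike at index 7 is the only sample flagged
```

A trailing window keeps the baseline local, so slow drift in the metric does not trigger alerts. The trade-off is that a spike temporarily inflates the baseline for the samples that follow it.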
This post is about how AIOps will change the way IT Operations personnel (IT Ops) work and the new skill sets they have to adopt in an AIOps world. For a definition of AIOps, refer to the blog post: "What is AIOps?" Gartner explains that an AIOps platform (figure 1) uses machine learning and big data to aggregate observational data (from monitoring system output, job logs, syslogs, etc.) and engagement data (from ticketing, incident, and event recording systems) to produce a virtuous circle of continuous insights yielding continuous improvements and fixes. Automation is both an input and output of AIOps. The results or statuses of automated workloads and jobs can be used like observational data and engagement data for analytic purposes.
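As a rough illustration of what aggregating observational and engagement data can mean in practice, the sketch below links each ticket to the observations recorded on the same host shortly before the ticket was opened. Everything here is hypothetical: the `Observation` and `Ticket` types, the `correlate` function, the 15-minute lookback, and the sample records are assumptions made for the example, not part of Gartner's description or of any specific platform. Note that an automated job status is fed in as just another observation, reflecting the idea that automation output can itself serve as input data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Observation:
    """Observational data: a monitoring alert, syslog line, or job status."""
    source: str
    host: str
    timestamp: datetime
    message: str


@dataclass
class Ticket:
    """Engagement data: a record from a ticketing or incident system."""
    ticket_id: str
    host: str
    opened_at: datetime


def correlate(observations, tickets, lookback=timedelta(minutes=15)):
    """Map each ticket ID to same-host observations inside the lookback window."""
    related = {}
    for ticket in tickets:
        window_start = ticket.opened_at - lookback
        related[ticket.ticket_id] = [
            obs for obs in observations
            if obs.host == ticket.host
            and window_start <= obs.timestamp <= ticket.opened_at
        ]
    return related


# Hypothetical sample records.
observations = [
    Observation("syslog", "db01", datetime(2019, 6, 1, 9, 50), "disk 95% full"),
    Observation("job", "db01", datetime(2019, 6, 1, 9, 58), "nightly backup FAILED"),
    Observation("monitor", "web01", datetime(2019, 6, 1, 9, 59), "latency high"),
    Observation("syslog", "db01", datetime(2019, 6, 1, 8, 0), "service restarted"),
]
tickets = [Ticket("INC-1", "db01", datetime(2019, 6, 1, 10, 0))]

matches = correlate(observations, tickets)
for obs in matches["INC-1"]:
    print(obs.source, obs.message)
```

Matching on host and time window is deliberately naive. It is meant only to show the shape of the join between the two data types, which a real platform would refine with topology information and learned patterns.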