AIOps Solutions for Incident Management: Technical Guidelines and A Comprehensive Literature Review
Remil, Youcef, Bendimerad, Anes, Mathonat, Romain, Kaytoue, Mehdi
arXiv.org Artificial Intelligence
The management of modern IT systems poses unique challenges, necessitating scalability, reliability, and efficiency in handling extensive data streams. Traditional methods, reliant on manual tasks and rule-based approaches, prove inefficient for the substantial data volumes and alerts generated by IT systems. Artificial Intelligence for IT Operations (AIOps) has emerged as a solution, leveraging advanced analytics such as machine learning and big data to enhance incident management. AIOps detects and predicts incidents, identifies root causes, and automates healing actions, improving quality and reducing operational costs. However, despite its potential, the AIOps domain is still in its early stages, decentralized across multiple sectors, and lacking standardized conventions. Research and industrial contributions are distributed without consistent frameworks for data management, target problems, implementation details, requirements, and capabilities. This study proposes an AIOps terminology and taxonomy, establishing a structured incident management procedure and providing guidelines for constructing an AIOps framework. The research also categorizes contributions based on criteria such as incident management tasks, application areas, data sources, and technical approaches. The goal is to provide a comprehensive review of technical and research aspects in AIOps for incident management, aiming to structure knowledge, identify gaps, and establish a foundation for future developments in the field.
Apr-1-2024
- Country:
- Oceania > Australia
- North America
- United States (0.14)
- Trinidad and Tobago > Trinidad
- Canada
- Quebec > Montreal (0.04)
- British Columbia (0.04)
- Europe
- Greece (0.04)
- Germany > Berlin (0.04)
- Spain > Basque Country
- Biscay Province > Bilbao (0.04)
- Portugal > Porto
- Porto (0.04)
- Norway > Central Norway
- France > Auvergne-Rhône-Alpes
- Asia
- China (0.04)
- Middle East
- Jordan (0.04)
- Iran > Tehran Province
- Tehran (0.04)
- Japan > Honshū
- Kantō > Ibaraki Prefecture > Tsukuba (0.04)
- India > Telangana
- Hyderabad (0.04)
- Genre:
- Overview (1.00)
- Research Report
- Promising Solution (1.00)
- Experimental Study (1.00)
- New Finding (0.92)
- Industry:
- Health & Medicine (1.00)
- Government > Military (0.92)
- Education (0.67)
- Information Technology
- Security & Privacy (1.00)
- Services (0.92)
- Software (0.92)
- Technology:
- Information Technology
- Software > Programming Languages (1.00)
- Information Management (1.00)
- Communications > Networks (1.00)
- Data Science
- Data Quality (1.00)
- Data Mining > Big Data (1.00)
- Artificial Intelligence
- Cognitive Science (1.00)
- Representation & Reasoning
- Expert Systems (1.00)
- Uncertainty > Bayesian Inference (0.93)
- Diagnosis (0.92)
- Rule-Based Reasoning (0.86)
- Natural Language
- Text Processing (1.00)
- Information Retrieval (1.00)
- Machine Learning
- Statistical Learning > Clustering (1.00)
- Performance Analysis > Accuracy (1.00)
- Evolutionary Systems (0.92)
- Neural Networks
- Deep Learning (1.00)
- Perceptrons (0.67)
- Learning Graphical Models
- Undirected Networks > Markov Models (1.00)
- Directed Networks > Bayesian Learning (1.00)