$K^4$: Online Log Anomaly Detection Via Unsupervised Typicality Learning
Chen, Weicong, Singh, Vikash, Rahmani, Zahra, Ganguly, Debargha, Hariri, Mohsen, Chaudhary, Vipin
arXiv.org Artificial Intelligence
Log anomaly detection (LogAD) is crucial for identifying failures and threats in large-scale computing and cyberinfrastructure systems. However, most existing LogAD approaches suffer from key limitations: they depend on slow and error-prone log parsing, employ tightly coupled end-to-end pipelines, often require supervision for improved detection performance, and rely on flawed single-pass evaluation protocols that fail to reflect the temporal dynamics of real-world online detection. These issues significantly hinder scalability, adaptability, and the practical deployment of solutions. These descriptors inform lightweight, modular detectors, including KDE, GMM, OCSVM, and a new adaptation of DeepSVDD, which enables efficient and accurate anomaly scoring without relying on structured formats or log representation retraining. To support realistic deployment scenarios, we also propose a principled chunk-based evaluation protocol that mimics online log ingestion, alleviates the performance overestimation and dataset undercoverage issues of prior single-pass evaluations, and enables reproducible benchmarking across datasets with varying anomaly densities. Using this setup, we conduct over 125,000 experiments across three real-world datasets (HDFS, BGL, Thunderbird), six pre-trained embedding models, four detectors, and multiple training and log sampling configurations. Logs are essential artifacts in computing systems, recording the operational behavior of applications, kernels, and user activities. This work was supported in part by NSF research grants #2137603, #2112606, #2117439, and #2320952. These authors contributed equally to this work. With the recent surge in language models and generative AI, a growing body of work [4]-[9] has begun leveraging AI techniques to capture semantic patterns in log sequences, aiming to enable more effective LogAD.
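The idea of scoring log descriptors with a lightweight density detector inside a chunk-by-chunk loop can be illustrated with a small sketch. This is not the authors' implementation or their exact protocol: the synthetic four-dimensional vectors, the fixed bandwidth, the leave-one-out threshold, and the helper names (`kde_log_density`, `chunked`, `evaluate_stream`, `make_demo_stream`) are all assumptions made for illustration. It also assumes the training prefix of each chunk contains no anomalies.

```python
import math
import random


def kde_log_density(x, train, bandwidth=0.5):
    """Log-density of x under a Gaussian KDE fitted on `train` (list of vectors)."""
    dim = len(x)
    log_norm = -0.5 * dim * math.log(2.0 * math.pi * bandwidth ** 2)
    exponents = [
        -sum((a - b) ** 2 for a, b in zip(x, t)) / (2.0 * bandwidth ** 2)
        for t in train
    ]
    top = max(exponents)  # log-sum-exp trick for numerical stability
    log_sum = top + math.log(sum(math.exp(e - top) for e in exponents))
    return log_norm + log_sum - math.log(len(train))


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def evaluate_stream(vectors, labels, chunk_size, train_size):
    """Per chunk: fit on the first `train_size` vectors, score the remainder.

    Assumption: the training prefix of every chunk is anomaly-free.
    Returns one (true_pos, false_pos, false_neg) tuple per evaluated chunk.
    """
    results = []
    for chunk, chunk_labels in zip(chunked(vectors, chunk_size),
                                   chunked(labels, chunk_size)):
        train, test = chunk[:train_size], chunk[train_size:]
        test_labels = chunk_labels[train_size:]
        if len(train) < 2 or not test:
            continue  # not enough data to fit and score this chunk
        # Threshold: lowest leave-one-out log-density on the training prefix.
        threshold = min(
            kde_log_density(p, train[:i] + train[i + 1:])
            for i, p in enumerate(train)
        )
        flagged = [kde_log_density(p, train) < threshold for p in test]
        tp = sum(f and y == 1 for f, y in zip(flagged, test_labels))
        fp = sum(f and y == 0 for f, y in zip(flagged, test_labels))
        fn = sum((not f) and y == 1 for f, y in zip(flagged, test_labels))
        results.append((tp, fp, fn))
    return results


def make_demo_stream(n_chunks=3, chunk_size=40, seed=0):
    """Synthetic stream: normal points near the origin, one distant anomaly per chunk."""
    rng = random.Random(seed)
    vectors, labels = [], []
    for _ in range(n_chunks):
        for i in range(chunk_size):
            is_anomaly = (i == chunk_size - 1)  # last entry of each chunk
            center = 5.0 if is_anomaly else 0.0
            vectors.append([rng.gauss(center, 0.3) for _ in range(4)])
            labels.append(1 if is_anomaly else 0)
    return vectors, labels


vectors, labels = make_demo_stream()
per_chunk = evaluate_stream(vectors, labels, chunk_size=40, train_size=30)
```

Reporting counts per chunk, rather than one aggregate score from a single pass, is what lets such a loop expose how detection quality varies as the stream advances. In practice the KDE here could be swapped for any of the detectors named above.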
Jul-29-2025
- Country:
- North America > United States (0.68)
- Genre:
- Research Report (0.82)
- Industry:
- Energy (0.46)
- Technology: