Enhancing Cloud Security through Topic Modelling

Saleh, Sabbir M., Madhavji, Nazim, Steinbacher, John

Oct-21-2025–arXiv.org Artificial Intelligence

Protecting cloud applications is critical in an era where security threats are increasingly sophisticated and persistent. Continuous Integration and Continuous Deployment (CI/CD) pipelines are particularly vulnerable, making innovative security approaches essential. This research explores the application of Natural Language Processing (NLP) techniques, specifically Topic Modelling, to analyse security-related text data and anticipate potential threats. We focus on Latent Dirichlet Allocation (LDA) and Probabilistic Latent Semantic Analysis (PLSA) to extract meaningful patterns from data sources, including logs, reports, and deployment traces. Using the Gensim framework in Python, these methods categorise log entries into security-relevant topics (e.g., phishing, encryption failures). The identified topics are leveraged to highlight patterns indicative of security issues across CI/CD's continuous stages (build, test, deploy). This approach introduces a semantic layer that supports early vulnerability recognition and contextual understanding of runtime behaviours.

artificial intelligence, ci cd pipeline, natural language, (17 more...)

arXiv.org Artificial Intelligence

Oct-21-2025

arXiv.org PDF

Add feedback

Country:
- North America > Canada > Ontario (0.14)

Genre:
- Research Report (0.82)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence > Natural Language
    - Text Processing (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found