Kubernetes
Automated Dynamic AI Inference Scaling on HPC-Infrastructure: Integrating Kubernetes, Slurm and vLLM
Trappen, Tim, Keßler, Robert, Pabel, Roland, Achter, Viktor, Wesner, Stefan
Due to rising demands for Artificial Intelligence (AI) inference, especially in higher education, novel solutions utilising existing infrastructure are emerging. The use of High-Performance Computing (HPC) has become a prevalent approach for implementing such solutions. However, the classical HPC operating model does not adapt well to the requirements of synchronous, user-facing, dynamic AI application workloads. In this paper, we propose a solution that serves LLMs by integrating vLLM, Slurm and Kubernetes on the supercomputer RAMSES. Initial benchmarks indicate that the proposed architecture scales efficiently for 100, 500 and 1000 concurrent requests, incurring an end-to-end latency overhead of only approximately 500 ms.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Tennessee > Davidson County > Nashville (0.05)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Cologne (0.05)
- Information Technology (0.95)
- Education > Educational Setting (0.50)
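The abstract above reports the load levels but not the scaling policy itself. As a hedged sketch, one plausible policy sizes the vLLM replica pool (one Slurm job per replica) from the observed concurrency; the per-replica ceiling and replica bounds here are invented for illustration, not taken from the paper.

```python
import math

def replicas_needed(concurrent_requests: int,
                    per_replica_ceiling: int = 250,
                    min_replicas: int = 1,
                    max_replicas: int = 8) -> int:
    """Size the vLLM replica pool (one Slurm job per replica) so that
    no replica sees more than per_replica_ceiling concurrent requests."""
    wanted = math.ceil(concurrent_requests / per_replica_ceiling)
    return max(min_replicas, min(max_replicas, wanted))

# At the benchmark's load levels:
# replicas_needed(100) -> 1, replicas_needed(500) -> 2, replicas_needed(1000) -> 4
```

A real controller would additionally react to queue depth or token throughput reported by the inference servers, but the clamp-and-ceil shape of the decision stays the same.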
A CODECO Case Study and Initial Validation for Edge Orchestration of Autonomous Mobile Robots
Zhu, H., Samizadeh, T., Sofia, R. C.
Hongyu Zhu, Tina Samizadeh, Rute C. Sofia (fortiss, Research Institute of the Free State of Bavaria associated with the Technical University of Munich (TUM)). Abstract: Autonomous Mobile Robots (AMRs) increasingly adopt containerized microservices across the Edge-Cloud continuum. While Kubernetes is the de-facto orchestrator for such systems, its assumptions (stable networks, homogeneous resources, and ample compute capacity) do not fully hold in mobile, resource-constrained robotic environments. The paper describes a case study on a smart-manufacturing AMR and performs an initial comparison between CODECO orchestration and standard Kubernetes using a controlled Kubernetes-in-Docker (KinD) environment. Metrics include pod deployment and deletion times, CPU and memory usage, and inter-pod data rates. The observed results indicate that CODECO offers reduced CPU consumption and more stable communication patterns, at the cost of modest memory overhead (10-15%) and slightly increased pod lifecycle latency due to secure overlay initialization. Kubernetes provides declarative configuration, automated scaling, and robust availability mechanisms that make it highly effective in cloud data centers. However, its design assumptions, namely the existence of relatively stable networks, abundant compute resources, and largely static infrastructure, do not fully hold in Edge-Edge and Edge-Cloud environments, where resources can be constrained and heterogeneous.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.24)
- North America > United States (0.04)
- Information Technology > Cloud Computing (1.00)
- Information Technology > Artificial Intelligence > Robots > Locomotion (0.61)
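The comparison metrics above (pod deployment and deletion times, memory overhead) reduce to simple relative statistics over repeated runs. A sketch of how such an overhead figure can be computed from raw timings follows; the function name and inputs are hypothetical, not CODECO's tooling.

```python
from statistics import mean

def relative_overhead(baseline_samples, candidate_samples):
    """Relative overhead of a candidate orchestrator over a baseline,
    from repeated measurements (e.g. pod deployment times in seconds
    or memory footprints in MiB)."""
    base = mean(baseline_samples)
    cand = mean(candidate_samples)
    return (cand - base) / base
```

For example, a returned value of 0.125 corresponds to 12.5%, inside the 10-15% memory band the paper reports for CODECO.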
Application Management in C-ITS: Orchestrating Demand-Driven Deployments and Reconfigurations
Zanger, Lukas, Lampe, Bastian, Reiher, Lennart, Eckstein, Lutz
Abstract: Vehicles are becoming increasingly automated and interconnected, enabling the formation of cooperative intelligent transport systems (C-ITS) and the use of offboard services. As a result, cloud-native techniques, such as microservices and container orchestration, play an increasingly important role in their operation. However, orchestrating applications in a large-scale C-ITS poses unique challenges due to the dynamic nature of the environment and the need for efficient resource utilization. In this paper, we present a demand-driven application management approach that leverages cloud-native techniques, specifically Kubernetes, to address these challenges. Taking into account the demands originating from different entities within the C-ITS, the approach enables the automation of processes such as deployment, reconfiguration, update, upgrade, and scaling of microservices. Executing these processes on demand can, for example, reduce computing resource consumption and network traffic. A demand may include a request for provisioning an external supporting service, such as a collective environment model. The approach handles changing and new demands by dynamically reconciling them through our proposed application management framework built on Kubernetes and the Robot Operating System (ROS 2). We demonstrate the operation of our framework in the C-ITS use case of collective environment perception and make the source code of the prototypical framework publicly available at https://github.com/
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
- Europe > Hungary > Budapest > Budapest (0.04)
- Information Technology > Cloud Computing (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
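The reconciliation of demands described above can be illustrated with a toy diff between the demanded and currently deployed application sets. This is a hedged sketch of the general pattern, not the authors' framework API.

```python
def reconcile(demanded, deployed):
    """Diff the set of demanded application instances against what is
    currently deployed, yielding the actions a demand-driven manager
    would execute next."""
    want, have = set(demanded), set(deployed)
    return {
        "deploy": sorted(want - have),   # demanded but not running
        "delete": sorted(have - want),   # running but no longer demanded
        "keep":   sorted(want & have),   # already satisfied
    }
```

Running such a diff whenever a demand arrives or lapses is what lets the framework free compute resources and network traffic that static deployments would keep consuming.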
Scaling Homomorphic Applications in Deployment
Marinelli, Ryan, Chowdhury, Angelica
In this endeavor, a proof-of-concept homomorphic application is developed to determine the production readiness of encryption ecosystems. A movie recommendation app is implemented for this purpose and productionized through containerization and orchestration. By tuning deployment configurations, the computational limitations of Fully Homomorphic Encryption (FHE) are mitigated through additional infrastructure optimizations.
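One way to reason about mitigating FHE's computational limitations through additional infrastructure is capacity planning: given the large constant-factor slowdown of FHE, size the replica count so that the offered load per replica stays below a utilisation target. The function and all parameter values below are illustrative assumptions, not figures from the paper.

```python
import math

def fhe_replicas(req_per_s: float, plain_latency_s: float,
                 fhe_slowdown: float, target_util: float = 0.7) -> int:
    """Replicas needed to absorb an FHE workload: each request occupies
    a worker for plain_latency_s * fhe_slowdown seconds; keep the mean
    utilisation per replica below target_util."""
    service_time_s = plain_latency_s * fhe_slowdown
    offered_load = req_per_s * service_time_s   # mean busy workers (Erlangs)
    return math.ceil(offered_load / target_util)
```

For instance, 2 requests per second against a 50 ms plaintext path with a hypothetical 100x FHE slowdown calls for 15 replicas, which is the kind of horizontal scale-out that container orchestration makes routine.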
AI Factories: It's time to rethink the Cloud-HPC divide
Lopez, Pedro Garcia, Pons, Daniel Barcelona, Copik, Marcin, Hoefler, Torsten, Quiñones, Eduardo, Malawski, Maciej, Pietzuch, Peter, Marti, Alberto, Timoudas, Thomas Ohlson, Slominski, Aleksander
The strategic importance of artificial intelligence is driving a global push toward Sovereign AI initiatives. National governments are increasingly developing dedicated infrastructures, called AI Factories (AIF), to achieve technological autonomy and secure the resources necessary to sustain robust local digital ecosystems. In Europe, the EuroHPC Joint Undertaking is investing hundreds of millions of euros into several AI Factories, built atop existing high-performance computing (HPC) supercomputers. However, while HPC systems excel in raw performance, they are not inherently designed for usability, accessibility, or serving as public-facing platforms for AI services such as inference or agentic applications. In contrast, AI practitioners are accustomed to cloud-native technologies like Kubernetes and object storage, tools that are often difficult to integrate within traditional HPC environments. This article advocates for a dual-stack approach within supercomputers: integrating both HPC and cloud-native technologies. Our goal is to bridge the divide between HPC and cloud computing by combining high performance and hardware acceleration with ease of use and service-oriented front-ends. This convergence allows each paradigm to amplify the other. To this end, we will study the cloud challenges of HPC (Serverless HPC) and the HPC challenges of cloud technologies (High-performance Cloud).
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- Europe > Sweden (0.04)
Automated Creation and Enrichment Framework for Improved Invocation of Enterprise APIs as Tools
Agarwal, Prerna, Gupta, Himanshu, Soni, Soujanya, Vallam, Rohith, Sindhgatta, Renuka, Mehta, Sameep
Recent advancements in Large Language Models (LLMs) have led to the development of agents capable of complex reasoning and interaction with external tools. In enterprise contexts, the effective use of such tools, which are often enabled by application programming interfaces (APIs), is hindered by poor documentation, complex input or output schemas, and a large number of operations. These challenges make tool selection difficult and reduce the accuracy of payload formation by up to 25%. We propose ACE, an automated tool creation and enrichment framework that transforms enterprise APIs into LLM-compatible tools. ACE (i) generates enriched tool specifications with parameter descriptions and examples to improve selection and invocation accuracy, and (ii) incorporates a dynamic shortlisting mechanism that filters relevant tools at runtime, reducing prompt complexity while maintaining scalability. We validate our framework on both proprietary and open-source APIs and demonstrate its integration with agentic frameworks. To the best of our knowledge, ACE is the first end-to-end framework that automates the creation, enrichment, and dynamic selection of enterprise API tools for LLM agents.
- North America > United States (0.04)
- Europe > Norway > Norwegian Sea (0.04)
- Asia > Singapore (0.04)
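The abstract does not specify how ACE's dynamic shortlisting works. As a hedged stand-in, even a simple lexical-overlap ranking over enriched tool descriptions shows the shape of runtime filtering; the tool names and descriptions below are invented for illustration.

```python
def shortlist(query: str, tools: dict, k: int = 3) -> list:
    """Rank tool specs by word overlap between the user query and each
    tool's enriched description, keeping only the top k for the prompt."""
    q = set(query.lower().split())
    ranked = sorted(
        tools.items(),
        key=lambda item: len(q & set(item[1].lower().split())),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

tools = {
    "create_ticket": "create a new support ticket",
    "delete_ticket": "delete an existing support ticket",
    "list_users": "list all users in the account",
}
# shortlist("create a ticket for support", tools, k=1) -> ["create_ticket"]
```

A production system would likely use embedding similarity instead of word overlap, but the payoff is the same: only a handful of relevant specs reach the prompt, keeping it small as the API catalogue grows.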
MAIA: A Collaborative Medical AI Platform for Integrated Healthcare Innovation
Bendazzoli, Simone, Persson, Sanna, Astaraki, Mehdi, Pettersson, Sebastian, Grozman, Vitali, Moreno, Rodrigo
Artificial Intelligence (AI) integration in healthcare has emerged as a transformative force, promising to revolutionize patient care, optimize resource allocation, and enhance clinical decision-making [2, 10]. As the healthcare ecosystem increasingly recognizes the importance of AI-powered tools, there is a growing need for collaborative platforms to facilitate the development, deployment, and management of AI solutions in medical settings [7, 13]. Modern healthcare institutions face complex challenges that demand sophisticated technological solutions. A comprehensive medical AI platform can serve as a powerful foundation for addressing these needs, effectively bridging technological capabilities with clinical requirements. One open challenge in healthcare is the management of the vast amounts of data handled in clinical settings. Cloud-based medical AI platforms can provide new opportunities for computational resource sharing, enabling institutions to optimize data storage and build collaborative research environments. By creating a unified and standardised ecosystem, these platforms break down traditional institutional barriers, facilitating knowledge exchange between medical professionals, data scientists, and researchers.
- Workflow (1.00)
- Research Report > Experimental Study (0.68)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Simplifying Root Cause Analysis in Kubernetes with StateGraph and LLM
Xiang, Yong, Chen, Charley Peter, Zeng, Liyi, Yin, Wei, Liu, Xin, Li, Hu, Xu, Wei
Kubernetes, a notably complex and distributed system, utilizes an array of controllers to uphold cluster management logic through state reconciliation. Nevertheless, maintaining state consistency presents significant challenges due to unexpected failures, network disruptions, and asynchronous issues, especially within dynamic cloud environments. These challenges result in operational disruptions and economic losses, underscoring the necessity for robust root cause analysis (RCA) to enhance Kubernetes reliability. The development of large language models (LLMs) presents a promising direction for RCA. However, existing methodologies encounter several obstacles, including the diverse and evolving nature of Kubernetes incidents, the intricate context of incidents, and the polymorphic nature of these incidents. In this paper, we introduce SynergyRCA, an innovative tool that leverages LLMs with retrieval augmentation from graph databases and enhancement with expert prompts. SynergyRCA constructs a StateGraph to capture spatial and temporal relationships and utilizes a MetaGraph to outline entity connections. Upon the occurrence of an incident, an LLM predicts the most pertinent resource, and SynergyRCA queries the MetaGraph and StateGraph to deliver context-specific insights for RCA. We evaluate SynergyRCA using datasets from two production Kubernetes clusters, highlighting its capacity to identify numerous root causes, including novel ones, with high efficiency and precision. SynergyRCA demonstrates the ability to identify root causes in an average time of about two minutes and achieves an impressive precision of approximately 0.90.
- Asia > China > Beijing > Beijing (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
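The StateGraph idea, capturing how each resource's state evolves over time, can be sketched in a few lines. This toy version (data layout invented) only records per-resource transition edges, whereas SynergyRCA additionally models spatial relations between entities via its MetaGraph.

```python
from collections import defaultdict

def build_state_graph(events):
    """events: iterable of (timestamp, resource, state) tuples.
    Returns, per resource, its chronological state-transition edges,
    a toy stand-in for the temporal dimension of a StateGraph."""
    graph = defaultdict(list)
    last_state = {}
    for ts, resource, state in sorted(events):
        prev = last_state.get(resource)
        if prev is not None and prev != state:
            graph[resource].append((prev, state, ts))
        last_state[resource] = state
    return dict(graph)
```

Given such a graph, an RCA query for an incident on a pod amounts to walking its recent transitions (e.g. Running to CrashLoopBackOff) and the transitions of related resources in the same window.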
Taming the Memory Beast: Strategies for Reliable ML Training on Kubernetes
Kubernetes offers a powerful orchestration platform for machine learning training, but memory management can be challenging due to specialized needs and resource constraints. This paper outlines how Kubernetes handles memory requests, limits, Quality of Service classes, and eviction policies for ML workloads, with special focus on GPU memory and ephemeral storage. Common pitfalls such as overcommitment, memory leaks, and ephemeral volume exhaustion are examined. We then provide best practices for stable, scalable memory utilization to help ML practitioners prevent out-of-memory events and ensure high-performance ML training pipelines.
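The QoS classes mentioned above follow fixed rules: a pod is Guaranteed when every container pins requests equal to limits for both CPU and memory, BestEffort when no requests or limits are set at all, and Burstable otherwise. A simplified sketch of that classification (the real kubelet logic also covers init containers and other resource types):

```python
def qos_class(containers):
    """Approximate Kubernetes QoS classification for a pod.
    containers: list of dicts with optional 'requests'/'limits'
    maps keyed by 'cpu' and 'memory'."""
    resources = ("cpu", "memory")
    guaranteed = all(
        c.get("requests", {}).get(r) is not None
        and c.get("requests", {}).get(r) == c.get("limits", {}).get(r)
        for c in containers for r in resources
    )
    if guaranteed:
        return "Guaranteed"
    if any(c.get("requests") or c.get("limits") for c in containers):
        return "Burstable"
    return "BestEffort"
```

The class matters for ML training because BestEffort and Burstable pods are evicted first under node memory pressure, so long-running trainers generally want Guaranteed memory settings.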
A generic approach for reactive stateful mitigation of application failures in distributed robotics systems deployed with Kubernetes
Mirus, Florian, Pasch, Frederik, Singhal, Nikhil, Scholl, Kay-Ulrich
Offloading computationally expensive algorithms to the edge or even cloud offers an attractive option to tackle limitations regarding on-board computational and energy resources of robotic systems. In cloud-native applications deployed with the container management system Kubernetes (K8s), one key problem is ensuring resilience against various types of failures. However, complex robotic systems interacting with the physical world pose a very specific set of challenges and requirements that are not yet covered by failure mitigation approaches from the cloud-native domain. In this paper, we therefore propose a novel approach for robotic system monitoring and stateful, reactive failure mitigation for distributed robotic systems deployed using Kubernetes (K8s) and the Robot Operating System (ROS2). By employing the generic substrate of Behaviour Trees, our approach can be applied to any robotic workload and supports arbitrarily complex monitoring and failure mitigation strategies. We demonstrate the effectiveness and application-agnosticism of our approach on two example applications, namely Autonomous Mobile Robot (AMR) navigation and robotic manipulation in a simulated environment.
- North America > United States (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- Research Report > Promising Solution (0.34)
- Overview > Innovation (0.34)
- Information Technology > Cloud Computing (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
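The Behaviour Tree substrate mentioned above composes monitoring conditions and mitigation actions. A minimal fallback (selector) node, the building block behind "if the health check fails, run the recovery action", can be sketched as follows; the node implementations are invented placeholders, not the authors' code.

```python
SUCCESS, FAILURE = "SUCCESS", "FAILURE"

def fallback(*children):
    """Fallback (selector) node: tick children left to right and return
    the first non-FAILURE status; fail only if every child fails."""
    def tick():
        for child in children:
            status = child()
            if status != FAILURE:
                return status
        return FAILURE
    return tick

# Reactive recovery: if the health check fails, run the mitigation.
node_healthy = lambda: FAILURE      # monitor reports an unhealthy workload
redeploy_pod = lambda: SUCCESS      # mitigation: ask K8s to redeploy it
recover = fallback(node_healthy, redeploy_pod)
# recover() -> "SUCCESS" (the mitigation path was taken)
```

Because fallbacks compose with sequences and arbitrary leaf actions, the same generic tree structure can encode both AMR navigation recovery and manipulation recovery, which is the application-agnosticism the paper claims.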