Goto

Collaborating Authors

 cloud resource


Fair Resource Allocation for Fleet Intelligence

arXiv.org Artificial Intelligence

--Resource allocation is crucial for the performance optimization of cloud-assisted multi-agent intelligence. Traditional methods often overlook agents' diverse computational capabilities and complex operating environments, leading to inefficient and unfair resource distribution. T o address this, we open-sourced Fair-Synergy, an algorithmic framework that utilizes the concave relationship between the agents' accuracy and the system resources to ensure fair resource allocation across fleet intelligence. We extend traditional allocation approaches to encompass a multidimensional machine learning utility landscape defined by model parameters, training data volume, and task complexity. We evaluate Fair-Synergy with advanced vision and language models such as BERT, VGG16, MobileNet, and ResNets on datasets including MNIST, CIF AR-10, CIF AR-100, BDD, and GLUE. We demonstrate that Fair-Synergy outperforms standard benchmarks by up to 25% in multi-agent inference and 11% in multi-agent learning settings. Also, we explore how the level of fairness affects the least advantaged, most advantaged, and average agents, providing insights for equitable fleet intelligence.


Cloud Infrastructure Management in the Age of AI Agents

arXiv.org Artificial Intelligence

Cloud infrastructure is the cornerstone of the modern IT industry. However, managing this infrastructure effectively requires considerable manual effort from the DevOps engineering team. We make a case for developing AI agents powered by large language models (LLMs) to automate cloud infrastructure management tasks. In a preliminary study, we investigate the potential for AI agents to use different cloud/user interfaces such as software development kits (SDK), command line interfaces (CLI), Infrastructure-as-Code (IaC) platforms, and web portals. We report takeaways on their effectiveness on different management tasks, and identify research challenges and potential solutions.


AI Enhanced Ontology Driven NLP for Intelligent Cloud Resource Query Processing Using Knowledge Graphs

arXiv.org Artificial Intelligence

The conventional resource search in cloud infrastructure relies on keyword-based searches or GUIDs, which demand exact matches and significant user effort to locate resources. These conventional search approaches often fail to interpret the intent behind natural language queries, making resource discovery inefficient and inaccessible to users. Though there exists some form of NLP based search engines, they are limited and focused more on analyzing the NLP query itself and extracting identifiers to find the resources. But they fail to search resources based on their behavior or operations or their capabilities or relationships or features or business relevance or the dynamic changing state or the knowledge these resources have. The search criteria has been changing with the inundation of AI based services which involved discovering not just the requested resources and identifiers but seeking insights. The real intent of a search has never been to just to list the resources but with some actual context such as to understand causes of some behavior in the system, compliance checks, capacity estimations, network constraints, or troubleshooting or business insights. This paper proposes an advanced Natural Language Processing (NLP) enhanced by ontology-based semantics to enable intuitive, human-readable queries which allows users to actually discover the intent-of-search itself. By constructing an ontology of cloud resources, their interactions, and behaviors, the proposed framework enables dynamic intent extraction and relevance ranking using Latent Semantic Indexing (LSI) and AI models. It introduces an automated pipeline which integrates ontology extraction by AI powered data crawlers, building a semantic knowledge base for context aware resource discovery.


ABACUS: A FinOps Service for Cloud Cost Optimization

arXiv.org Artificial Intelligence

In recent years, as more enterprises have moved their infrastructure to the cloud, significant challenges have emerged in achieving holistic cloud spend visibility and cost optimization. FinOps practices provide a way for enterprises to achieve these business goals by optimizing cloud costs and bringing accountability to cloud spend. This paper presents ABACUS - Automated Budget Analysis and Cloud Usage Surveillance, a FinOps solution for optimizing cloud costs by setting budgets, enforcing those budgets through blocking new deployments, and alerting appropriate teams if spending breaches a budget threshold. ABACUS also leverages best practices like Infrastructure-as-Code to alert engineering teams of the expected cost of deployment before resources are deployed in the cloud. Finally, future research directions are proposed to advance the state of the art in this important field.


Application of Machine Learning Optimization in Cloud Computing Resource Scheduling and Management

arXiv.org Artificial Intelligence

In recent years, cloud computing has been widely used. Cloud computing refers to the centralized computing resources, users through the access to the centralized resources to complete the calculation, the cloud computing center will return the results of the program processing to the user. Cloud computing is not only for individual users, but also for enterprise users. By purchasing a cloud server, users do not have to buy a large number of computers, saving computing costs. According to a report by China Economic News Network, the scale of cloud computing in China has reached 209.1 billion yuan. At present, the more mature cloud service providers in China are Ali Cloud, Baidu Cloud, Huawei Cloud and so on. Therefore, this paper proposes an innovative approach to solve complex problems in cloud computing resource scheduling and management using machine learning optimization techniques. Through in-depth study of challenges such as low resource utilization and unbalanced load in the cloud environment, this study proposes a comprehensive solution, including optimization methods such as deep learning and genetic algorithm, to improve system performance and efficiency, and thus bring new breakthroughs and progress in the field of cloud computing resource management.Rational allocation of resources plays a crucial role in cloud computing. In the resource allocation of cloud computing, the cloud computing center has limited cloud resources, and users arrive in sequence. Each user requests the cloud computing center to use a certain number of cloud resources at a specific time.


Graph-PHPA: Graph-based Proactive Horizontal Pod Autoscaling for Microservices using LSTM-GNN

arXiv.org Artificial Intelligence

Microservice-based architecture has become prevalent for cloud-native applications. With an increasing number of applications being deployed on cloud platforms every day leveraging this architecture, more research efforts are required to understand how different strategies can be applied to effectively manage various cloud resources at scale. A large body of research has deployed automatic resource allocation algorithms using reactive and proactive autoscaling policies. However, there is still a gap in the efficiency of current algorithms in capturing the important features of microservices from their architecture and deployment environment, for example, lack of consideration of graphical dependency. To address this challenge, we propose Graph-PHPA, a graph-based proactive horizontal pod autoscaling strategy for allocating cloud resources to microservices leveraging long short-term memory (LSTM) and graph neural network (GNN) based prediction methods. We evaluate the performance of Graph-PHPA using the Bookinfo microservices deployed in a dedicated testing environment with real-time workloads generated based on realistic datasets. We demonstrate the efficacy of Graph-PHPA by comparing it with the rule-based resource allocation scheme in Kubernetes as our baseline. Extensive experiments have been implemented and our results illustrate the superiority of our proposed approach in resource savings over the reactive rule-based baseline algorithm in different testing scenarios.


Spot by NetApp Announces Continuous Security Solution for Cloud Infrastructure

#artificialintelligence

NetApp a global, cloud-led, data-centric software company, announced the general availability of Spot Security. Built for the cloud, Spot Security delivers a solution for continuous assessment and analysis of cloud security posture. Spot Security enables DevOps and SecOps teams to easily collaborate to identify misconfigurations, reduce their potential attack surface, and ensure compliance. Spot Security's agentless technology analyzes cloud resource relationships to provide clear visibility and prioritized actions, automatically determining the prospective exposure of each cloud resource and surfacing critical security threats based on their potential impact to the organization. These automated actions mitigate alert fatigue and keep cloud infrastructure secure and operations teams efficient.


Embedded AI: Rise of the intelligent device

#artificialintelligence

Artificial intelligence (AI) is generally viewed in terms of a big computing solution as it makes the leap from the lab to production environments. In the public consciousness, AI is complex algorithms crunching vast amounts of data drawn from hyperscale cloud resources and all of this will create profound, transformative changes to business processes and models. Lately, however, a different form of AI has emerged: narrower in focus individually and less broad in reach. It's called embedded AI and because it exists on the device, SoC or even the processor itself it is by nature broadly distributed, particularly out on the edge. This gives it the potential to be an even more significant advancement than enterprise AI, supporting life-changing applications ranging from autonomous vehicles to the metaverse.


Beyond Desktop Computation: Challenges in Scaling a GPU Infrastructure

arXiv.org Artificial Intelligence

Enterprises and labs performing computationally expensive data science applications sooner or later face the problem of scale but unconnected infrastructure. For this up-scaling process, an IT service provider can be hired or in-house personnel can attempt to implement a software stack. The first option can be quite expensive if it is just about connecting several machines. For the latter option often experience is missing with the data science staff in order to navigate through the software jungle. In this technical report, we illustrate the decision process towards an on-premises infrastructure, our implemented system architecture, and the transformation of the software stack towards a scaleable GPU cluster system.


Cloud Failure Prediction with Hierarchical Temporary Memory: An Empirical Assessment

arXiv.org Artificial Intelligence

Hierarchical Temporary Memory (HTM) is an unsupervised learning algorithm inspired by the features of the neocortex that can be used to continuously process stream data and detect anomalies, without requiring a large amount of data for training nor requiring labeled data. HTM is also able to continuously learn from samples, providing a model that is always up-to-date with respect to observations. These characteristics make HTM particularly suitable for supporting online failure prediction in cloud systems, which are systems with a dynamically changing behavior that must be monitored to anticipate problems. This paper presents the first systematic study that assesses HTM in the context of failure prediction. The results that we obtained considering 72 configurations of HTM applied to 12 different types of faults introduced in the Clearwater cloud system show that HTM can help to predict failures with sufficient effectiveness (F-measure = 0.76), representing an interesting practical alternative to (semi-)supervised algorithms.