serverless computing


Advanced Architectures Integrated with Agentic AI for Next-Generation Wireless Networks

Dev, Kapal, Khowaja, Sunder Ali, Zeydan, Engin, Debbah, Merouane

arXiv.org Artificial Intelligence

This paper investigates a range of cutting-edge technologies and architectural innovations aimed at simplifying network operations, reducing operational expenditure (OpEx), and enabling the deployment of new service models. The focus is on (i) Proposing novel, more efficient 6G architectures, with both Control and User planes enabling the seamless expansion of services, while addressing long-term 6G network evolution. (ii) Exploring advanced techniques for constrained artificial intelligence (AI) operations, particularly the design of AI agents for real-time learning, optimizing energy consumption, and the allocation of computational resources. (iii) Identifying technologies and architectures that support the orchestration of backend services using serverless computing models across multiple domains, particularly for vertical industries. (iv) Introducing optically-based, ultra-high-speed, low-latency network architectures, with fast optical switching and real-time control, replacing conventional electronic switching to reduce power consumption by an order of magnitude.


Scalable and Cost-Efficient ML Inference: Parallel Batch Processing with Serverless Functions

Barrak, Amine, Ksontini, Emna

arXiv.org Artificial Intelligence

As data-intensive applications grow, batch processing in limited-resource environments faces scalability and resource management challenges. Serverless computing offers a flexible alternative, enabling dynamic resource allocation and automatic scaling. This paper explores how serverless architectures can make large-scale ML inference tasks faster and more cost-effective by decomposing monolithic processes into parallel functions. Through a case study on sentiment analysis using the DistilBERT model and the IMDb dataset, we demonstrate that serverless parallel processing can reduce execution time by over 95% compared to monolithic approaches, at the same cost.
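
The decomposition described above can be sketched in a few lines. The snippet below is a minimal, hypothetical stand-in: `invoke_function` simulates one serverless invocation (e.g., a Lambda call running DistilBERT on a slice of the IMDb reviews), and the thread pool plays the role of concurrent function instances; none of the names come from the paper.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a single serverless invocation (e.g., one
# function call running sentiment inference on a slice of the dataset).
def invoke_function(batch):
    time.sleep(0.01)                           # simulated per-invocation latency
    return [len(text) % 2 for text in batch]   # dummy positive/negative labels

def split_batches(items, batch_size):
    # Cut the workload into independent batches, one per function call.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def parallel_inference(items, batch_size=8, max_workers=16):
    batches = split_batches(items, batch_size)
    # Each batch maps to one independent invocation, so total wall-clock
    # time approaches the latency of a single invocation rather than the
    # sum of all of them -- the source of the reported speedup.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(invoke_function, batches)
    return [label for batch in results for label in batch]

reviews = [f"review {i}" for i in range(64)]
labels = parallel_inference(reviews)
```

Because each invocation is billed only for its own runtime, N parallel short invocations can cost roughly the same as one long monolithic run, which is why the speedup need not raise the bill.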


Enabling Efficient Serverless Inference Serving for LLM (Large Language Model) in the Cloud

Ghosh, Himel

arXiv.org Artificial Intelligence

This review report discusses cold-start latency in serverless inference and existing solutions. It particularly reviews ServerlessLLM, a system designed to address the cold-start problem in serverless inference for large language models (LLMs). These models, due to their size, often reaching hundreds of gigabytes, and their computational requirements, encounter delays due to what is known as the cold-start problem [22]. This latency arises when previously idle serverless functions initiate, leading to delays from the loading of extensive LLM checkpoints and GPU resource activation. Traditional serverless approaches struggle with high latency due to the size of LLM checkpoints. Such cold starts can significantly hinder performance in applications requiring real-time interaction, making solutions to this problem imperative for scalable, serverless LLM deployment.


StraightLine: An End-to-End Resource-Aware Scheduler for Machine Learning Application Requests

Ching, Cheng-Wei, Guan, Boyuan, Xu, Hailu, Hu, Liting

arXiv.org Artificial Intelligence

The life cycle of machine learning (ML) applications consists of two stages: model development and model deployment. However, traditional ML systems (e.g., training-specific or inference-specific systems) focus on one particular stage of the life cycle of ML applications. These systems often aim at optimizing model training or accelerating model inference, and they frequently assume homogeneous infrastructure, which may not reflect real-world scenarios that include cloud data centers, local servers, containers, and serverless platforms. This paper presents StraightLine, an end-to-end resource-aware scheduler for ML application requests in such hybrid infrastructure. Its key innovation is an empirical dynamic placing algorithm that intelligently places requests based on their unique characteristics (e.g., request frequency, input data size, and data distribution). In contrast to existing ML systems, StraightLine offers end-to-end resource-aware placement, and can thereby significantly reduce response time and failure rate for model deployment when facing different computing resources in the hybrid infrastructure.
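
A placement decision of this kind can be illustrated with a simple rule-based router. The thresholds and target names below are invented for illustration and are not taken from the paper; they only show the shape of a characteristics-driven placing function.

```python
# Illustrative rule-based placement: route each inference request to a
# target (serverless, container, or local server) based on its observed
# characteristics. Thresholds are hypothetical, not from StraightLine.
def place_request(req_per_min, input_mb):
    if req_per_min < 1:
        # Rare, bursty traffic: pay-per-use serverless avoids idle cost.
        return "serverless"
    if input_mb > 100:
        # Large payloads: keep them near local storage and accelerators.
        return "local-server"
    # Steady traffic with moderate inputs: a warm container balances
    # response time and cost.
    return "container"
```

An empirical scheduler would learn such boundaries from measured response times and failure rates rather than hard-coding them, but the interface, request characteristics in, placement target out, is the same.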


Detection of Compromised Functions in a Serverless Cloud Environment

Lavi, Danielle, Brodt, Oleg, Mimran, Dudu, Elovici, Yuval, Shabtai, Asaf

arXiv.org Artificial Intelligence

Serverless computing is an emerging cloud paradigm with serverless functions at its core. While serverless environments enable software developers to focus on developing applications without the need to actively manage the underlying runtime infrastructure, they open the door to a wide variety of security threats that can be challenging to mitigate with existing methods. Existing security solutions do not apply to all serverless architectures, since they require significant modifications to the serverless infrastructure or rely on third-party services for the collection of more detailed data. In this paper, we present an extendable serverless security threat detection model that leverages cloud providers' native monitoring tools to detect anomalous behavior in serverless applications. Our model aims to detect compromised serverless functions by identifying post-exploitation abnormal behavior related to different types of attacks on serverless functions, and therefore, it is a last line of defense. Our approach is not tied to any specific serverless application, is agnostic to the type of threats, and is adaptable through model adjustments. To evaluate our model's performance, we developed a serverless cybersecurity testbed in an AWS cloud environment, which includes two different serverless applications and simulates a variety of attack scenarios that cover the main security threats faced by serverless functions. Our evaluation demonstrates our model's ability to detect all implemented attacks while maintaining a negligible false alarm rate.
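
The idea of flagging post-exploitation behavior from provider-native monitoring data can be shown with a deliberately simple detector. This sketch uses a z-score over invocation durations; it is a toy stand-in, not the authors' model, and the sample values are invented.

```python
import statistics

# Illustrative anomaly detector over per-invocation metrics that a cloud
# provider's native monitoring already exposes (e.g., duration). A real
# model would combine many signals; this shows the baseline-vs-outlier idea.
def fit_baseline(durations_ms):
    # Learn what "normal" looks like from historical invocations.
    return statistics.mean(durations_ms), statistics.stdev(durations_ms)

def is_anomalous(duration_ms, baseline, threshold=3.0):
    # Flag invocations far outside the learned baseline, e.g., a function
    # suddenly running long because it is exfiltrating data or mining.
    mean, std = baseline
    return abs(duration_ms - mean) / std > threshold

normal = [100, 98, 103, 101, 99, 102, 97, 100]   # typical invocation times (ms)
baseline = fit_baseline(normal)
```

Because it consumes only metrics the provider already emits, this style of detector needs no changes to the serverless infrastructure itself, which is the deployment property the paper emphasizes.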


AIhub monthly digest: March 2024 – human-robot interaction, serverless computing, and deep reinforcement learning for communication networks

AIHub

Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we find out about explainability and human-robot interaction, serverless computing for machine learning, and deep reinforcement learning for communication networks. We also chat to AAAI President Francesca Rossi, and congratulate the ACM/SIGAI Autonomous Agents Research Award winner Catholijn Jonker. "AI used to be a scientific and technical field, now it has become a socio-technical discipline." AIhub ambassador Andrea Rafai caught up with AAAI President Francesca Rossi to ask about her research, regulation of AI, and the UN sustainable development goals: Interview with Francesca Rossi – talking sustainable development goals, AI regulation, and AI ethics.


Interview with Amine Barrak: serverless computing and machine learning

AIHub

The AAAI/SIGAI Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. This year, 30 students were selected for this programme, and we've been hearing from them about their research. In this interview, Amine Barrak tells us about his work speeding up machine learning by using serverless computing. "My focus is on speeding up machine learning by using serverless computing. My research is about finding a way to do machine learning training efficiently in small serverless settings."


A Review of Deep Reinforcement Learning in Serverless Computing: Function Scheduling and Resource Auto-Scaling

Majid, Amjad Yousef, Marin, Eduard

arXiv.org Artificial Intelligence

In the rapidly evolving field of serverless computing, efficient function scheduling and resource scaling are critical for optimizing performance and cost. This paper presents a comprehensive review of the application of Deep Reinforcement Learning (DRL) techniques in these areas. We begin by providing an overview of serverless computing, highlighting its benefits and challenges, with a particular focus on function scheduling and resource scaling. We then delve into the principles of DRL and its potential for addressing these challenges. A systematic review of recent studies applying DRL to serverless computing is presented, covering various algorithms, models, and performance results. Our analysis reveals that DRL, with its ability to learn and adapt from an environment, shows promising results in improving the efficiency of function scheduling and resource scaling in serverless computing. However, several challenges remain, including the need for more realistic simulation environments, handling of cold starts, and the trade-off between learning time and scheduling performance. We conclude by discussing potential future directions for this research area, emphasizing the need for more robust DRL models, better benchmarking methods, and the exploration of multi-agent reinforcement learning for more complex serverless architectures. This review serves as a valuable resource for researchers and practitioners aiming to understand and advance the application of DRL in serverless computing.
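
The RL-for-auto-scaling formulation surveyed here can be made concrete with a tiny tabular Q-learning toy: the state is (load level, instance count), actions scale the instance count, and the reward penalizes both under-provisioning (SLO risk) and over-provisioning (idle cost). Everything below, load levels, targets, reward shape, is an invented toy, not a model from any reviewed paper.

```python
import random

random.seed(0)

LOADS = ["low", "medium", "high"]
ACTIONS = [-1, 0, 1]                         # scale down / hold / scale up
TARGET = {"low": 1, "medium": 2, "high": 3}  # ideal instance count per load

def reward(load, instances):
    # Penalize the gap between provisioned and ideal capacity: too few
    # instances means SLO violations, too many means paying for idle ones.
    return -abs(TARGET[load] - instances)

def train(steps=20000, alpha=0.1, gamma=0.9, eps=0.2):
    q = {}
    load, inst = "medium", 2
    for _ in range(steps):
        s = (load, inst)
        if random.random() < eps:                       # explore
            a = random.choice(ACTIONS)
        else:                                           # exploit
            a = max(ACTIONS, key=lambda x: q.get((s, x), 0.0))
        inst = min(4, max(1, inst + a))                 # apply scaling action
        r = reward(load, inst)
        load = random.choice(LOADS)                     # workload shifts
        best = max(q.get(((load, inst), x), 0.0) for x in ACTIONS)
        q[(s, a)] = q.get((s, a), 0.0) + alpha * (r + gamma * best - q.get((s, a), 0.0))
    return q

def policy(q, load, inst):
    return max(ACTIONS, key=lambda x: q.get(((load, inst), x), 0.0))

q = train()
```

After training, the greedy policy scales up when under-provisioned under high load and down when over-provisioned under low load; the surveyed DRL systems replace this table with neural networks and the toy reward with measured latency and cost.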


Exploring the Impact of Serverless Computing on Peer To Peer Training Machine Learning

Barrak, Amine, Trabelsi, Ranim, Jaafar, Fehmi, Petrillo, Fabio

arXiv.org Artificial Intelligence

The increasing demand for computational power in big data and machine learning has driven the development of distributed training methodologies. Among these, peer-to-peer (P2P) networks provide advantages such as enhanced scalability and fault tolerance. However, they also encounter challenges related to resource consumption, costs, and communication overhead as the number of participating peers grows. In this paper, we introduce a novel architecture that combines serverless computing with P2P networks for distributed training and present a method for efficient parallel gradient computation under resource constraints. Our findings show a significant enhancement in gradient computation time, with up to a 97.34% improvement compared to conventional P2P distributed training methods. As for costs, our examination confirmed that the serverless architecture could incur higher expenses, reaching up to 5.4 times more than instance-based architectures. It is essential to consider that these higher costs are associated with marked improvements in computation time, particularly under resource-constrained scenarios. Despite the cost-time trade-off, the serverless approach still holds promise due to its pay-as-you-go model. Utilizing dynamic resource allocation, it enables faster training times and optimized resource utilization, making it a promising candidate for a wide range of machine learning applications.
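
Parallel gradient computation of the kind described can be sketched with a least-squares toy: each "worker" computes the gradient of its data shard, and the shard gradients are averaged weighted by shard size, reproducing the full-dataset gradient. This is a pure-Python illustration of the general pattern, not the paper's architecture; the thread pool stands in for concurrent serverless invocations.

```python
from concurrent.futures import ThreadPoolExecutor

def shard_gradient(w, shard):
    # Gradient of mean squared error (1/n) * sum((w*x - y)^2) over one shard.
    n = len(shard)
    return sum(2 * (w * x - y) * x for x, y in shard) / n

def parallel_gradient(w, shards):
    # One "serverless worker" per shard; weight each shard gradient by its
    # size so the average equals the gradient over the full dataset.
    sizes = [len(s) for s in shards]
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        grads = list(pool.map(lambda s: shard_gradient(w, s), shards))
    return sum(g * n for g, n in zip(grads, sizes)) / sum(sizes)

data = [(x, 3.0 * x) for x in range(1, 9)]   # y = 3x, so the optimum is w = 3
shards = [data[:4], data[4:]]

w = 0.0
for _ in range(200):                          # plain gradient descent
    w -= 0.01 * parallel_gradient(w, shards)
```

The wall-clock win comes from the `pool.map` step: shard gradients are computed concurrently, so each descent step costs roughly one shard's computation time instead of the whole dataset's.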


SPIRT: A Fault-Tolerant and Reliable Peer-to-Peer Serverless ML Training Architecture

Barrak, Amine, Jaziri, Mayssa, Trabelsi, Ranim, Jaafar, Fehmi, Petrillo, Fabio

arXiv.org Artificial Intelligence

The advent of serverless computing has ushered in notable advancements in distributed machine learning, particularly within parameter server-based architectures. Yet, the integration of serverless features within peer-to-peer (P2P) distributed networks remains largely uncharted. In this paper, we introduce SPIRT, a fault-tolerant, reliable, and secure serverless P2P ML training architecture designed to bridge this gap. Capitalizing on the inherent robustness and reliability innate to P2P systems, SPIRT employs RedisAI for in-database operations, leading to an 82% reduction in the time required for model updates and gradient averaging across a variety of models and batch sizes. This architecture showcases resilience against peer failures and adeptly manages the integration of new peers, thereby highlighting its fault-tolerant characteristics and scalability. Furthermore, SPIRT ensures secure communication between peers, enhancing the reliability of distributed machine learning tasks. Even in the face of Byzantine attacks, the system's robust aggregation algorithms maintain high levels of accuracy. These findings illuminate the promising potential of serverless architectures in P2P distributed machine learning, offering a significant stride towards the development of more efficient, scalable, and resilient applications.
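
The Byzantine-resilience idea behind robust aggregation can be shown with a coordinate-wise median: unlike a plain mean, a minority of peers sending arbitrary gradient vectors cannot drag the aggregate away. This illustrates the general class of techniques, not SPIRT's exact algorithm, and the gradient values are invented.

```python
import statistics

def mean_aggregate(gradients):
    # Plain averaging: a single malicious peer can corrupt every coordinate.
    return [statistics.mean(coords) for coords in zip(*gradients)]

def median_aggregate(gradients):
    # Coordinate-wise median: robust as long as honest peers are a majority.
    return [statistics.median(coords) for coords in zip(*gradients)]

honest = [[0.9, -1.1], [1.0, -1.0], [1.1, -0.9]]  # peers near the true gradient
byzantine = [[1000.0, 1000.0]]                    # one malicious peer

poisoned_mean = mean_aggregate(honest + byzantine)
robust = median_aggregate(honest + byzantine)
```

Here the mean is pulled to the hundreds by the single attacker while the median stays near the honest peers' values, which is exactly the property that lets training accuracy survive Byzantine participants.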