 serverless infrastructure


Exploring the Impact of Serverless Computing on Peer To Peer Training Machine Learning

arXiv.org Artificial Intelligence

The increasing demand for computational power in big data and machine learning has driven the development of distributed training methodologies. Among these, peer-to-peer (P2P) networks provide advantages such as enhanced scalability and fault tolerance. However, they also encounter challenges related to resource consumption, costs, and communication overhead as the number of participating peers grows. In this paper, we introduce a novel architecture that combines serverless computing with P2P networks for distributed training and present a method for efficient parallel gradient computation under resource constraints. Our findings show a significant enhancement in gradient computation time, with up to a 97.34% improvement compared to conventional P2P distributed training methods. As for costs, our examination confirmed that the serverless architecture could incur higher expenses, reaching up to 5.4 times more than instance-based architectures. It is essential to consider that these higher costs are associated with marked improvements in computation time, particularly under resource-constrained scenarios. Despite the cost-time trade-off, the serverless approach still holds promise due to its pay-as-you-go model. Utilizing dynamic resource allocation, it enables faster training times and optimized resource utilization, making it a promising candidate for a wide range of machine learning applications.


ETH Zürich & Microsoft Study: Demystifying Serverless ML Training

#artificialintelligence

Serverless computing is a new type of cloud-based computation infrastructure initially developed for web microservices and IoT applications. Because it frees model developers from capacity planning and from configuring, managing, maintaining, operating and scaling containers, VMs and physical servers, serverless computing has gained popularity with machine learning (ML) researchers in recent years. Its benefits have also piqued interest in adopting it for data-intensive workloads such as ETL (extract, transform, load), query processing and ML, where it can provide significant cost reductions. Riding this trend, a research team from ETH Zürich and Microsoft recently conducted a systematic, comparative study of distributed ML training over serverless infrastructures (FaaS) and "serverful" infrastructures (IaaS), aiming to identify and understand the system tradeoffs involved in distributed ML training with serverless infrastructures. Serverless computing is offered by the major cloud providers through services such as AWS Lambda, Azure Functions and Google Cloud Functions.


Efficient Serverless deployment of PyTorch models on Azure

#artificialintelligence

Recent advances in deep learning and cloud-based infrastructure have led to innovations in models for various domains such as natural language processing, computer vision and recommendations. Of course, developing the model is only half the story. Your models become useful only once they are served to make predictions that end applications consume in AI-driven scenarios. It is important to do this in a cost-effective and reliable manner. However, managing the infrastructure that hosts your models is challenging, as it involves several aspects such as maintaining your fleet, ensuring reliability, scaling, security, and ongoing monitoring and management.


Deploy any machine learning model serverless in AWS - Ritchie Vink

#artificialintelligence

When a machine learning model goes into production, it is very likely to be idle most of the time. There are many use cases where a model only needs to run inference when new data is available. If we have such a use case and deploy the model on a server, it will eagerly check for new data, only to be disappointed for most of its lifetime, while you pay for the server's entire uptime. Now that the cloud era has arrived, we can deploy a model serverless, meaning we pay only for the compute we use and spin up resources only when we need them.