Introducing AdaptDL, an open-source, resource-adaptive deep learning framework
Petuum is excited to announce the launch of our newest open-source offering, AdaptDL, a resource-adaptive deep learning (DL) training and scheduling framework. The goal of AdaptDL is to make distributed DL easy and efficient in dynamic-resource environments such as shared clusters and the cloud. In our benchmark studies on Amazon Web Services (AWS), we recorded cost reductions of up to 80% when AdaptDL was set to automatically provision spot instances whenever they were available. AdaptDL can automatically determine the optimal amount of resources for a job's needs, and it adds or removes resources dynamically while the job runs to keep performance as high as possible.
Sep-2-2020, 23:45:29 GMT