The science behind SageMaker's cost-saving Debugger
A machine learning training job can seem to be running like a charm, while it's really suffering from problems such as overfitting, exploding model parameters, and vanishing gradients, which can compromise model performance. Historically, spotting such problems during training has required the persistent attention of a machine learning expert. The Amazon SageMaker team has developed a new tool, SageMaker Debugger, that automates this problem-spotting process, saving customers time and money. For example, by using Debugger, one SageMaker customer reduced model size by 45% and the number of GPU operations by 33%, while improving accuracy. Next week, at the Conference on Machine Learning and Systems (MLSys), we will present a paper that describes the technology behind SageMaker Debugger.
Apr-8-2021, 21:06:52 GMT
- Technology: